tensorflow / datasets

TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
https://www.tensorflow.org/datasets
Apache License 2.0

[data request] a2d2 #2090

Open nikste opened 4 years ago

nikste commented 4 years ago

Autonomous driving dataset by Audi. Contains LiDAR and images with semantic segmentation labels.

The dataset includes more than 40,000 frames with semantic segmentation image and point cloud labels, of which more than 12,000 frames also have annotations for 3D bounding boxes. In addition, we provide unlabelled sensor data (approx. 390,000 frames) for sequences with several loops, recorded in three cities.

It features 41,280 frames with semantic segmentation in 38 categories. Each pixel in an image is given a label describing the type of object it represents, e.g. pedestrian, car, vegetation, etc.

Point cloud segmentation is produced by fusing semantic pixel information and LiDAR point clouds. Each 3D point is thereby assigned an object type label. This relies on accurate camera-LiDAR registration.
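The fusion step described above can be illustrated with a minimal sketch. This is not the A2D2 authors' pipeline, which the description does not detail. It assumes an undistorted pinhole camera, points already transformed into the camera frame (x right, y down, z forward), and a per-pixel class-id image. The function name `label_points`, the `UNLABELED` marker, and the intrinsics parameters are all hypothetical.

```python
from typing import List, Sequence, Tuple

UNLABELED = -1  # hypothetical marker for points that do not land on a pixel


def label_points(
    points_cam: Sequence[Tuple[float, float, float]],
    label_image: Sequence[Sequence[int]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[int]:
    """Assign each 3D point the class id of the pixel it projects onto.

    Assumes a pinhole model without lens distortion and points that are
    already expressed in the camera frame (an assumption of this sketch).
    """
    height = len(label_image)
    width = len(label_image[0])
    labels = []
    for x, y, z in points_cam:
        if z <= 0:
            # Point is behind the camera plane, so it cannot be seen.
            labels.append(UNLABELED)
            continue
        # Perspective projection onto the image plane, rounded to a pixel.
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            labels.append(label_image[v][u])
        else:
            labels.append(UNLABELED)
    return labels
```

Any error in the camera-LiDAR transform shifts `u` and `v`, so a point near an object boundary can pick up the neighbouring class, which is why the registration accuracy matters.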

3D bounding boxes are provided for 12,499 frames. LiDAR points within the field of view of the front camera are labelled with 3D bounding boxes. We annotate 14 classes relevant to driving, e.g. cars, pedestrians, buses, etc.

Folks who would also like to see this dataset in tensorflow/datasets, please thumbs-up so the developers can know which requests to prioritize.

nikste commented 4 years ago

I'm taking a look at this and it's not clear to me whether datasets are grouped by task or by input type (there seem to be some dataset implementations under object_detection but also under image).

Additionally, this dataset contains:

  as input: lidar + images
  as targets: semantic segmentation + bounding boxes (plus lots of other configuration data)

Is it better to implement this as 2 separate datasets, or one? (I'm planning to use this for learning on lidar + image, so the network would need both inputs, synchronized, at some point.) Is there a channel for this sort of question? I could not find anything.

floydium commented 4 years ago

If I understand this correctly, there are three datasets:

  1. Semantic Segmentation
     Input: camera + lidar + bus
     Target: 38-class semantic segmentation + instance masks for traffic participant classes
     Size: 41,277

  2. Object Detection
     Input: camera + lidar + bus
     Target: 38-class semantic segmentation + instance masks for traffic participant classes + 3D bounding boxes for lidar
     Size: 12,497

  3. Unlabeled Sequences
     Input: camera + lidar + bus
     Target: No labels
     Size: ~400k

Object Detection dataset is a subset of the Semantic Segmentation dataset.

Ref: https://arxiv.org/abs/2004.06320
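One way to picture this three-way split is as a single dataset with three named configurations rather than separate datasets. The sketch below is plain Python and does not use the TFDS API. `A2D2Config`, `CONFIGS`, and `is_label_superset` are hypothetical names, and the frame counts are simply the ones quoted in this comment. In TFDS itself, something like this might map onto builder configs, but that is an assumption rather than an agreed design.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class A2D2Config:
    """Hypothetical description of one A2D2 subset (not a TFDS class)."""

    name: str
    inputs: Tuple[str, ...]
    targets: Tuple[str, ...]
    num_frames: int  # as quoted in this thread; the last one is approximate


INPUTS = ("camera", "lidar", "bus")

CONFIGS = (
    A2D2Config(
        name="semantic_segmentation",
        inputs=INPUTS,
        targets=("semantic_segmentation", "instance_masks"),
        num_frames=41_277,
    ),
    A2D2Config(
        name="object_detection",
        inputs=INPUTS,
        targets=("semantic_segmentation", "instance_masks", "bboxes_3d"),
        num_frames=12_497,
    ),
    A2D2Config(
        name="unlabeled_sequences",
        inputs=INPUTS,
        targets=(),
        num_frames=400_000,  # "~400k" in the comment above
    ),
)


def is_label_superset(richer: A2D2Config, base: A2D2Config) -> bool:
    """True if `richer` carries every target type that `base` carries."""
    return set(base.targets) <= set(richer.targets)
```

With this layout, all three configurations share the same synchronized inputs and differ only in which targets are present, so a lidar + image model could read any of them through the same input pipeline.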

nikste commented 4 years ago

Sure, we can view it like that. I think for my use case it makes sense to focus on the semantic segmentation part first if this split makes sense. Randomly pinging @Conchylicultor, maybe you can comment?

rtayek commented 4 years ago

maybe relevant: ADE20K dataset.