[data request] a2d2

nikste commented 4 years ago

Name of dataset: a2d2
URL of dataset: https://www.a2d2.audi/a2d2/en.html
License of dataset: CC BY-ND 4.0
Short description of dataset and use case(s):

Autonomous driving dataset by Audi. Contains lidar point clouds and images with semantic segmentation labels.

The dataset includes more than 40,000 frames with semantic segmentation image and point cloud labels, of which more than 12,000 frames also have annotations for 3D bounding boxes. In addition, we provide unlabelled sensor data (approx. 390,000 frames) for sequences with several loops, recorded in three cities.

It features 41,280 frames with semantic segmentation in 38 categories. Each pixel in an image is given a label describing the type of object it represents, e.g. pedestrian, car, vegetation, etc.

Point cloud segmentation is produced by fusing semantic pixel information and LiDAR point clouds. Each 3D point is thereby assigned an object type label. This relies on accurate camera-LiDAR registration.
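The fusion step described above can be illustrated with a minimal sketch. It assumes the lidar points have already been transformed into the camera frame (z pointing forward) and that the image is undistorted, so a plain pinhole projection applies. The function name `label_points` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical and chosen for illustration only; the thread does not describe how A2D2 actually implements this.

```python
def label_points(points, label_image, fx, fy, cx, cy):
    """Assign each 3D point the class id of the pixel it projects to.

    points: iterable of (x, y, z) in the camera frame, z pointing forward.
    label_image: 2D list of class ids indexed as [row][col].
    Returns one entry per point: a class id, or None if the point lies
    behind the camera or projects outside the image.
    """
    height = len(label_image)
    width = len(label_image[0])
    labels = []
    for x, y, z in points:
        if z <= 0:
            # Points behind the image plane cannot be seen by the camera.
            labels.append(None)
            continue
        # Pinhole projection to the nearest pixel (column u, row v).
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            labels.append(label_image[v][u])
        else:
            labels.append(None)
    return labels
```

Any error in the camera-lidar registration shifts `u` and `v`, which is why points near object boundaries are the first to receive wrong labels.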

3D bounding boxes are provided for 12,499 frames. LiDAR points within the field of view of the front camera are labelled with 3D bounding boxes. We annotate 14 classes relevant to driving, e.g. cars, pedestrians, buses, etc.

Folks who would also like to see this dataset in tensorflow/datasets, please thumbs-up so the developers can know which requests to prioritize.

nikste commented 4 years ago

I'm taking a look at this, and it's not clear to me whether datasets are grouped by task or by input type (there seem to be some dataset implementations under object_detection but also under image).

Additionally, this dataset contains:

as input: lidar + images
as targets: semantic segmentation + bounding boxes (plus lots of other configuration data)

Is it better to implement this as two separate datasets, or as one? (I'm planning to use this for learning on lidar + image, so the network would need both inputs, synchronized, at some point.)

Is there a channel for these sorts of questions? I could not find anything.

floydium commented 4 years ago

If I understand this correctly, there are three datasets:

Semantic Segmentation: Input: camera + lidar + bus; Target: 38 class semantic segmentation + instance masks for traffic participant classes; Size: 41,277
Object Detection: Input: camera + lidar + bus; Target: 38 class semantic segmentation + instance masks for traffic participant classes + 3D bounding boxes for lidar; Size: 12,497
Unlabeled Sequences: Input: camera + lidar + bus; Target: No labels; Size: ~400k

The Object Detection dataset is a subset of the Semantic Segmentation dataset.
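Because of this subset relation, one possible layout is a single frame record with an optional box field, from which the detection split is derived by filtering. The sketch below is only an illustration in plain Python. The `Frame` class, its field names, and `detection_subset` are hypothetical and are not part of tensorflow/datasets or of the A2D2 tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    """One synchronized sample; all field names are illustrative."""
    frame_id: str
    image_path: str
    lidar_path: str
    segmentation_path: str
    # None for frames that carry no 3D bounding box annotations.
    boxes_3d: Optional[List[dict]] = None


def detection_subset(frames):
    """Return only the frames that also have 3D bounding boxes."""
    return [f for f in frames if f.boxes_3d is not None]
```

With this layout, segmentation and detection labels stay cross-referenced through `frame_id`, at the cost of an empty box field on most frames.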

Ref: https://arxiv.org/abs/2004.06320

nikste commented 4 years ago

Sure, we can view it like that. I think for my use case it makes sense to focus on the semantic segmentation part first if this split makes sense. Randomly pinging @Conchylicultor, maybe you can comment?

Is there another channel for quick stupid questions?
Does splitting the dataset in this way make sense (and can I cross-reference detection and segmentation in that case?), or would it be better to keep it joined?
How would this dataset be classified (no pun intended): image, object detection, or a new category?

rtayek commented 4 years ago

maybe relevant: ADE20K dataset.

tensorflow / datasets

[data request] a2d2 #2090