CARLA object detection dataset has incorrect ground truth labels

Lodour commented 2 years ago

I saved some images and bounding boxes from the CARLA Object Detection dataset and found that many of the ground truth boxes and labels are incorrect.

I extracted the data using the following two functions:

from armory.data.adversarial_datasets import carla_obj_det_dev
from armory.data.datasets import carla_obj_det_train

See below for some examples of incorrect labels.

I have drawn the provided bounding boxes on the images, with each label indicated by a different color:

Red: 1 Pedestrian
Green: 2 Vehicle
Blue: 3 Traffic lights
White: 4 Green screen
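
For anyone who wants to reproduce this kind of overlay, here is a minimal sketch using only NumPy. It is not Armory code: `draw_boxes` and `LABEL_COLORS` are hypothetical names, and the sketch assumes boxes are given as `[x1, y1, x2, y2]` pixel coordinates and images as `(H, W, 3)` uint8 RGB arrays, which may not match the dataset's actual format.

```python
import numpy as np

# Label-to-color mapping from the list above (RGB tuples).
LABEL_COLORS = {
    1: (255, 0, 0),      # pedestrian: red
    2: (0, 255, 0),      # vehicle: green
    3: (0, 0, 255),      # traffic light: blue
    4: (255, 255, 255),  # green screen: white
}


def draw_boxes(image, boxes, labels, thickness=2):
    """Return a copy of `image` with one colored outline per box.

    Assumes `image` is (H, W, 3) uint8 and each box is
    [x1, y1, x2, y2] in pixels (an assumption, not a confirmed format).
    """
    out = image.copy()
    h, w = out.shape[:2]
    t = thickness
    for (x1, y1, x2, y2), label in zip(boxes, labels):
        # Clip the box to the image bounds.
        x1, y1 = max(int(x1), 0), max(int(y1), 0)
        x2, y2 = min(int(x2), w - 1), min(int(y2), h - 1)
        if x2 < x1 or y2 < y1:
            continue  # box lies entirely outside the image
        # Unknown labels fall back to yellow so they stand out.
        color = LABEL_COLORS.get(int(label), (255, 255, 0))
        out[y1:y1 + t, x1:x2 + 1] = color                   # top edge
        out[max(y2 - t + 1, y1):y2 + 1, x1:x2 + 1] = color  # bottom edge
        out[y1:y2 + 1, x1:x1 + t] = color                   # left edge
        out[y1:y2 + 1, max(x2 - t + 1, x1):x2 + 1] = color  # right edge
    return out
```

Calling `draw_boxes(rgb, boxes, labels)` on one sample returns a new array that can be saved with any image library; the input image is left unmodified.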

dev split, index 20: the vehicle is labeled as a traffic light (blue). Images: dev-020-rgb, dev-020-depth

train split, index 100: the vehicle is labeled as a pedestrian (red), and there are many small, seemingly random bounding boxes. Images: train-100-depth, train-100-rgb

yusong-tan commented 2 years ago

@Lodour this is a shortcoming in CARLA that we are aware of. When multiple objects overlap (i.e., their pixels are connected), CARLA treats them all as one class and produces a single semantic segmentation for them. We currently cannot always separate out the individual overlapping objects, and we are working on addressing this problem. This happens relatively rarely in our training data and shouldn't significantly impact training.

In your examples, the cars overlap either with background traffic lights or with pedestrians. In both cases, CARLA picks one class and labels all of the overlapping objects with it.
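
To illustrate the effect described here, the following toy sketch derives one box per connected region of a binary mask. It is only an illustration under simplifying assumptions, not the actual CARLA or Armory labeling code: `boxes_from_mask` is a hypothetical helper, and 4-connectivity is an arbitrary choice.

```python
from collections import deque

import numpy as np


def boxes_from_mask(mask):
    """Return one [x1, y1, x2, y2] box per 4-connected region of True pixels."""
    mask = np.asarray(mask, dtype=bool)
    seen = np.zeros_like(mask)
    h, w = mask.shape
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y, x] or seen[y, x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(y, x)])
            seen[y, x] = True
            x1 = x2 = x
            y1 = y2 = y
            while queue:
                cy, cx = queue.popleft()
                # Grow the bounding box to cover the current pixel.
                x1, x2 = min(x1, cx), max(x2, cx)
                y1, y2 = min(y1, cy), max(y2, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            boxes.append([x1, y1, x2, y2])
    return boxes


# Two objects that do not touch: each gets its own box.
apart = np.zeros((10, 20), dtype=bool)
apart[2:6, 1:6] = True     # stand-in for a car
apart[2:8, 10:12] = True   # stand-in for a pedestrian

# The same two objects overlapping: their pixels form a single region.
touching = np.zeros((10, 20), dtype=bool)
touching[2:6, 1:6] = True
touching[2:8, 5:7] = True

boxes_apart = boxes_from_mask(apart)
boxes_touching = boxes_from_mask(touching)
```

In this toy setup, `boxes_apart` contains two boxes while `boxes_touching` contains only one box spanning both objects, so any single label attached to that region is necessarily wrong for one of them.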

Lodour commented 2 years ago

Thanks for the clarification.

twosixlabs / armory

CARLA object detection dataset has incorrect ground truth labels #1206