DSD-DBS / raillabel

A devkit for working with recorded and annotated train ride data from Deutsche Bahn.
Apache License 2.0

Temporal consistency of UUIDs and pose question #15

Closed fferroni closed 1 year ago

fferroni commented 1 year ago

Hi,

A few questions on this dataset:

  1. Are per-frame labels associated across sequential frames by a key (e.g. a shared UID)? That doesn't appear to be the case.
    
import raillabel

path = "~/raillabel/2_station_berliner_tor_2.1/2_station_berliner_tor_2.1_labels.json"
scene = raillabel.load(path, validate=True)

frame_id = 250
s1 = [annotation for annotation in scene.frames[frame_id].annotations.values() if annotation.sensor.uid == "rgb_center"]
s2 = [annotation for annotation in scene.frames[frame_id].annotations.values() if annotation.sensor.uid == "lidar"]
s3 = [annotation for annotation in scene.frames[frame_id+1].annotations.values() if annotation.sensor.uid == "rgb_center"]
across_modality = set([s.uid for s in s1]).intersection([s.uid for s in s2])
across_time = set([s.uid for s in s1]).intersection([s.uid for s in s3])

Both `across_modality` and `across_time` are empty. Is there a "ground truth" way of associating labels, or does this need to be done via IoU or some other heuristic (e.g. for tracking tasks)?

2. What is the coordinate system for the IMU data?
The scene object does not yield any information regarding the IMU/GNSS. What are the extrinsics of the IMU w.r.t., e.g., the LIDAR coordinate system? I could not find this information in the arXiv paper either (https://arxiv.org/pdf/2305.03001.pdf). It would be necessary if one wants to, e.g., aggregate the LIDAR frames.

scene.sensors.keys()
dict_keys(['ir_center', 'ir_left', 'ir_right', 'lidar', 'radar', 'rgb_center', 'rgb_highres_center', 'rgb_highres_left', 'rgb_highres_right', 'rgb_left', 'rgb_right'])



Thank you
tklockau commented 1 year ago

Regarding 1: I think you are confusing the annotation ID with the object ID. raillabel.format.Frame.annotations is just the combination of the annotations from all object_data entries in the frame. What you are looking for is raillabel.format.Frame.object_data. Each key of this dict corresponds to a real-life object, which can be found in scene.objects. This ID can be used for tracking and for extracting the class information about the object.

import raillabel

path = "~/raillabel/2_station_berliner_tor_2.1/2_station_berliner_tor_2.1_labels.json"
scene = raillabel.load(path, validate=False)

object_id = '175d5ae3-b6aa-4281-aa74-452371dad235'  # Tracking ID of the object 'person0001'

annotations_of_person0001_in_frame_250 = list(scene.frames[250].object_data[object_id].annotations.values())
annotations_of_person0001_in_frame_251 = list(scene.frames[251].object_data[object_id].annotations.values())
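To follow one object over more than two frames, the same lookup can be wrapped in a loop. The following is only a minimal sketch: `track_object` is a hypothetical helper, not part of raillabel, and it assumes that scene.frames and Frame.object_data behave like dicts, as the snippet above suggests. The demo data is a made-up stand-in that mimics the attribute layout, so the sketch runs without the dataset. Unlike direct indexing, it skips frames in which the object is not annotated instead of raising a KeyError.

```python
from types import SimpleNamespace


def track_object(scene, object_id):
    """Map frame ID -> {annotation UID: sensor UID} for one object.

    Hypothetical helper (not part of raillabel). Frames in which the
    object is not annotated are skipped instead of raising a KeyError.
    """
    track = {}
    for frame_id, frame in scene.frames.items():
        object_data = frame.object_data.get(object_id)
        if object_data is None:
            continue
        track[frame_id] = {
            uid: annotation.sensor.uid
            for uid, annotation in object_data.annotations.items()
        }
    return track


# Stand-in data (NOT real dataset content) mimicking the attribute layout.
def _ann(sensor_uid):
    return SimpleNamespace(sensor=SimpleNamespace(uid=sensor_uid))


def _frame(objects):
    return SimpleNamespace(
        object_data={
            obj_id: SimpleNamespace(annotations=anns)
            for obj_id, anns in objects.items()
        }
    )


demo_scene = SimpleNamespace(
    frames={
        250: _frame({"obj-a": {"ann-1": _ann("rgb_center"), "ann-2": _ann("lidar")}}),
        251: _frame({"obj-a": {"ann-3": _ann("rgb_center")}}),
        252: _frame({"obj-b": {"ann-4": _ann("lidar")}}),
    }
)

demo_track = track_object(demo_scene, "obj-a")
print(demo_track)
```

With a real scene you would pass the object returned by raillabel.load() and an object ID such as the one shown above.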

By the way, you can also use raillabel.filter() to exclude all annotations that are not part of an object you want:

import raillabel

path = "~/raillabel/2_station_berliner_tor_2.1/2_station_berliner_tor_2.1_labels.json"
scene = raillabel.load(path, validate=False)

object_id = '175d5ae3-b6aa-4281-aa74-452371dad235'  # Tracking ID of the object 'person0001'

scene_only_containing_annotations_of_person0001 = raillabel.filter(
    scene,
    include_object_ids=[object_id]
)
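If you only know the object's name (like 'person0001') and not its ID, the ID can probably be looked up in scene.objects first. This is a rough sketch under assumptions: `object_ids_by_name` is a hypothetical helper, and it assumes that scene.objects is a dict keyed by object ID whose values expose a `name` attribute. The demo data is made up so the snippet runs on its own.

```python
from types import SimpleNamespace


def object_ids_by_name(scene, names):
    """Return the IDs of all objects in scene.objects whose name is in `names`.

    Hypothetical helper; assumes each object has a `name` attribute.
    """
    wanted = set(names)
    return [uid for uid, obj in scene.objects.items() if obj.name in wanted]


# Stand-in data; the UIDs and names below are made up for illustration.
demo_scene = SimpleNamespace(
    objects={
        "uid-1": SimpleNamespace(name="person0001", type="person"),
        "uid-2": SimpleNamespace(name="person0002", type="person"),
        "uid-3": SimpleNamespace(name="train0001", type="train"),
    }
)

demo_ids = object_ids_by_name(demo_scene, ["person0001", "train0001"])
print(demo_ids)
```

With a real scene, the resulting list could then be passed as include_object_ids to raillabel.filter().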

Also: you don't need to set validate=True when calling raillabel.load(), because all official scenes are valid unless they have been modified.

Hope that answer helped you!

tklockau commented 1 year ago

Regarding 2: I created a new question to answer that here: https://github.com/DSD-DBS/raillabel/issues/18

romantilly commented 1 year ago

As @tklockau already explained, objects that are captured by different sensors or in multiple frames across time are identified by their object ID. Here is an example that produces the lists of annotations across_modality and across_time that you were aiming for:

import raillabel
path = "~/raillabel/2_station_berliner_tor_2.1/2_station_berliner_tor_2.1_labels.json"
scene = raillabel.load(path, validate=False)
frame_id = 250

# list of all objects in this frame
objects = list(scene.frames[frame_id].object_data.keys())

# select one object
my_object = objects[10]

# annotations for this object across modalities (i.e., sensors)
#  for better readability, only print UIDs of annotations
#  annotations for the same object in one frame are associated with the object
annotations = scene.frames[frame_id].object_data[my_object].annotations
list(annotations.keys())

#>    ['485f2ec4-e224-4204-bc10-8a599e3ad8e2',
#>     'b9540576-1c9b-48df-a0ef-04be6d1e4ef5',
#>     'ca8c7e12-78b9-43fc-908d-b6274dcb91d2',
#>     'aca6c3ea-6479-4d80-86a2-524f8500955c',
#>     '54bfb1cd-7af4-42c3-9828-18fa70dd9abb',
#>     '3227079f-a78e-428e-8f32-7cb6cb0d0120']

# show details about annotations
for key in annotations:
    print(f'{key} is a {type(annotations[key]).__name__} in sensor {annotations[key].sensor.uid}')

#>    485f2ec4-e224-4204-bc10-8a599e3ad8e2 is a Bbox in sensor ir_center
#>    b9540576-1c9b-48df-a0ef-04be6d1e4ef5 is a Bbox in sensor rgb_center
#>    ca8c7e12-78b9-43fc-908d-b6274dcb91d2 is a Bbox in sensor rgb_highres_center
#>    aca6c3ea-6479-4d80-86a2-524f8500955c is a Bbox in sensor radar
#>    54bfb1cd-7af4-42c3-9828-18fa70dd9abb is a Cuboid in sensor lidar
#>    3227079f-a78e-428e-8f32-7cb6cb0d0120 is a Seg3d in sensor lidar

# annotations for this object in the consecutive frame
#  for better readability, only print UIDs of annotations
annotations = scene.frames[frame_id+1].object_data[my_object].annotations
list(annotations.keys())

#>    ['97fb4731-631e-4c41-9209-09de36c9b5a2',
#>     'fd305e4d-338e-491c-8e27-b74b1bf04ca4',
#>     '7081ba26-7dab-42aa-8c0f-caf8703b3ccd',
#>     '76e419a5-d81f-4097-bb79-e20d2dc30a34',
#>     '75ae29dd-04d1-47ea-990f-f1ce33745b5d',
#>     '58872065-29ea-4370-9e7d-6815d9ef0db5']

# show details about annotations
for key in annotations:
    print(f'{key} is a {type(annotations[key]).__name__} in sensor {annotations[key].sensor.uid}')

#>    97fb4731-631e-4c41-9209-09de36c9b5a2 is a Bbox in sensor ir_center
#>    fd305e4d-338e-491c-8e27-b74b1bf04ca4 is a Bbox in sensor rgb_center
#>    7081ba26-7dab-42aa-8c0f-caf8703b3ccd is a Bbox in sensor rgb_highres_center
#>    76e419a5-d81f-4097-bb79-e20d2dc30a34 is a Bbox in sensor radar
#>    75ae29dd-04d1-47ea-990f-f1ce33745b5d is a Cuboid in sensor lidar
#>    58872065-29ea-4370-9e7d-6815d9ef0db5 is a Seg3d in sensor lidar
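For completeness, the two sets from the original question can be rebuilt on object IDs instead of annotation UIDs. This is only a sketch: `object_ids_seen_by` is a hypothetical helper, not a raillabel function, and it relies on the same attribute layout as the example above (Frame.object_data, .annotations, .sensor.uid). The frames below are made-up stand-ins, so the snippet runs without the dataset.

```python
from types import SimpleNamespace


def object_ids_seen_by(frame, sensor_uid):
    """Return the IDs of objects with at least one annotation from `sensor_uid`."""
    return {
        object_id
        for object_id, object_data in frame.object_data.items()
        if any(a.sensor.uid == sensor_uid for a in object_data.annotations.values())
    }


# Stand-in frames (NOT real dataset content): object ID -> sensors annotating it.
def _frame(sensors_by_object):
    return SimpleNamespace(
        object_data={
            object_id: SimpleNamespace(
                annotations={
                    f"{object_id}-ann{i}": SimpleNamespace(
                        sensor=SimpleNamespace(uid=sensor_uid)
                    )
                    for i, sensor_uid in enumerate(sensor_uids)
                }
            )
            for object_id, sensor_uids in sensors_by_object.items()
        }
    )


frame_250 = _frame({"obj-a": ["rgb_center", "lidar"], "obj-b": ["rgb_center"]})
frame_251 = _frame({"obj-b": ["rgb_center"], "obj-c": ["lidar"]})

# Objects annotated in both sensors within the same frame.
across_modality = object_ids_seen_by(frame_250, "rgb_center") & object_ids_seen_by(
    frame_250, "lidar"
)
# Objects annotated by the same sensor in two consecutive frames.
across_time = object_ids_seen_by(frame_250, "rgb_center") & object_ids_seen_by(
    frame_251, "rgb_center"
)
print(across_modality, across_time)
```

With a real scene, frame_250 and frame_251 would be scene.frames[frame_id] and scene.frames[frame_id+1].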