The ground truth bounding boxes do not seem to align well with objects in the 1 Mpx Dataset

JDYG commented 3 months ago

Hi, I found that in the 1 Mpx dataset the provided annotations are offset from the actual positions of the objects, especially for objects near the edges of the image, as shown in Fig.1 below. This misalignment occurs in almost all samples.

What could be the possible reasons for this? It looks as if the DVS and the RGB camera were not calibrated against each other.

Another question about the 1 Mpx dataset: some objects have no annotations at all, as shown in Fig.2 below. I think this will be a serious problem for training object detectors, because features that belong to objects will be learned as background.

Here is my visualization code:

import cv2
from src.io.psee_loader import PSEELoader
from src.visualize import vis_utils as vis

events_fname = '/Datasets/GEN4/trainfilelist00/train/moorea_2019-02-19_002_td_183500000_243500000_td.dat'
bbox_fname = events_fname[:-7] + '_bbox.npy'

video = PSEELoader(events_fname)
box_video = PSEELoader(bbox_fname)

height, width = video.get_size()
print(height, width)
labelmap = vis.LABELMAP if height == 240 else vis.LABELMAP_LARGE

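delta_t = 20000  # time window per frame; the toolbox uses microseconds, so 20 ms

cv2.namedWindow('out', cv2.WINDOW_NORMAL)
while not video.done:
    # Load events and boxes over the same time window so they stay in sync
    events = video.load_delta_t(delta_t)
    box_events = box_video.load_delta_t(delta_t)
    im = vis.make_binary_histo(events, img=None, width=width, height=height)
    vis.draw_bboxes(im, box_events, labelmap=labelmap)
    cv2.imshow('out', im)
    cv2.waitKey(10)
    # Pause on frames that contain at least one box until a key is pressed
    if box_events.size != 0:
        cv2.waitKey(0)
cv2.destroyAllWindows()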
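
To put a rough number on the offset instead of judging it by eye, one option is to count how many events fall inside each box and to flag boxes that sit close to the image border. The sketch below is only an illustration: it assumes the events are a NumPy structured array with `x` and `y` fields and the boxes a structured array with `x`, `y`, `w`, `h` fields (top-left corner plus size). The helper names `events_per_box` and `near_edge` and the `margin` parameter are hypothetical and not part of the toolbox.

```python
import numpy as np


def events_per_box(events, boxes):
    """Count the events that fall inside each box.

    Assumed layout (not confirmed by the toolbox docs): `events` has fields
    'x' and 'y'; `boxes` has fields 'x', 'y' (top-left corner), 'w' and 'h'.
    """
    ex = events['x'].astype(np.float64)
    ey = events['y'].astype(np.float64)
    counts = np.zeros(len(boxes), dtype=np.int64)
    for i, box in enumerate(boxes):
        x0, y0 = float(box['x']), float(box['y'])
        x1, y1 = x0 + float(box['w']), y0 + float(box['h'])
        inside = (ex >= x0) & (ex < x1) & (ey >= y0) & (ey < y1)
        counts[i] = int(inside.sum())
    return counts


def near_edge(boxes, width, height, margin=10):
    """Boolean mask of boxes lying within `margin` pixels of the image border."""
    x0 = boxes['x'].astype(np.float64)
    y0 = boxes['y'].astype(np.float64)
    x1 = x0 + boxes['w'].astype(np.float64)
    y1 = y0 + boxes['h'].astype(np.float64)
    return (x0 <= margin) | (y0 <= margin) | (x1 >= width - margin) | (y1 >= height - margin)


# Tiny synthetic example; real data would come from the two loaders above.
event_dtype = [('t', '<i8'), ('x', '<u2'), ('y', '<u2'), ('p', 'u1')]
box_dtype = [('t', '<i8'), ('x', '<f4'), ('y', '<f4'), ('w', '<f4'), ('h', '<f4')]

demo_events = np.array(
    [(0, 100, 100, 1), (5, 105, 110, 0), (9, 1270, 360, 1)], dtype=event_dtype)
demo_boxes = np.array(
    [(0, 90, 90, 30, 30), (0, 1200, 300, 80, 100), (0, 500, 500, 20, 20)],
    dtype=box_dtype)

demo_counts = events_per_box(demo_events, demo_boxes)
demo_edge = near_edge(demo_boxes, width=1280, height=720)
print("events per box:", demo_counts)
print("near edge:", demo_edge)
```

Boxes with very few events inside them, particularly those flagged as near the border, are candidates for misaligned labels. A low count can also simply mean the object was not moving during that time window, so this is a heuristic and not a definitive test.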

Fig.1 The misaligned results are shown below:

Fig.2 An example of missing annotations is shown below. The pedestrians indicated by the green arrows should have bounding boxes, but none are provided. You can reproduce this with GEN4/trainfilelist05/train/moorea_2019-06-21_000_2013500000_2073500000_td.dat

lbristiel-psee commented 3 months ago

Hello,

Thanks for your feedback. What you are describing is a known issue with the automatic labeling tool that was used. This limitation is described in Section 5.3 of the paper "Learning to Detect Objects with a 1 Megapixel Event Camera".

Note that these outliers should be a small portion of the labels in the full dataset, so the dataset can still be used and still gives decent results.

Hope this clarifies things.
Laurent, for Prophesee Support

JDYG commented 3 months ago

Thanks for your reply. That solves my concern.

prophesee-ai / prophesee-automotive-dataset-toolbox

The ground truth bounding boxes do not seem to align well with objects in the 1 Mpx Dataset #40