Closed magehrig closed 2 years ago
Hello Mathias,
These lines of code suggest that if 'GEN1' is used, the min_box_side parameter is larger than for 'GEN4'. Since the GEN4 resolution is much higher, I am a bit confused by this choice. Is there a specific reason for it?
Indeed, there might be a problem in the code, especially if we look at the docstring of the filter_boxes function, where the min_box_side default value is 20 and we write that it is for GEN4.
def filter_boxes(boxes, skip_ts=int(5e5), min_box_diag=60, min_box_side=20):
    """Filters boxes according to the paper rule.
    To note: the defaults represent our thresholds when evaluating at GEN4 resolution (1280x720).
    """
Unfortunately, I was not able to get to the bottom of this topic. I will try to dig into it and fix the code if needed. In any case, it is not critical: the important filtering is done with the diagonal size. The additional filtering on the sides is only done to avoid degenerate cases (e.g. bounding boxes with a very large aspect ratio).
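The diagonal-plus-side filtering described here can be sketched as follows. This is a minimal sketch, not the toolbox code; the structured-array field names 'w' and 'h' follow the label tuple layout quoted later in this thread:

```python
import numpy as np

def filter_small_boxes(boxes, min_box_diag=60, min_box_side=20):
    """Keep boxes whose diagonal exceeds min_box_diag and whose sides
    both exceed min_box_side (the side check rejects degenerate,
    very elongated boxes)."""
    w, h = boxes['w'], boxes['h']
    diag_ok = w ** 2 + h ** 2 >= min_box_diag ** 2        # main filter: diagonal length
    side_ok = (w >= min_box_side) & (h >= min_box_side)   # reject extreme aspect ratios
    return boxes[diag_ok & side_ok]
```

A 10x200 box, for example, passes the diagonal check but is rejected by the side check, which is exactly the degenerate case the side filter is meant to catch.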
In your NeurIPS paper, there is no mention of the min_box_side filter. Was this filter used for evaluation? If so, which value was used?
Yes, we do filter out boxes with a diagonal smaller than 60 pixels and sides smaller than 10 pixels. This is not explicitly mentioned in the main text, but it is explained in the official evaluation code, which is linked in the paper (bottom of page 6). Please see: https://github.com/prophesee-ai/prophesee-automotive-dataset-toolbox#disclaimer-new-dataset
In the GEN1 dataset I noticed that some bounding boxes are partially outside the camera frame. Is this intentional? (I don't remember in which files this was the case).
Yes, the instructions for the manual annotators were to label objects as they would appear at full size. This means that if an object is occluded by another object in front of it, or if it is partially outside the field of view, the labeled size should be the size of the object as if it were fully visible.
Hope this helps, Laurent
Thanks for your answers. So, for now, I will assume the following:
Just to make sure I understand: by "Yes, we do filter them with diagonal smaller than 60pix and side smaller than 10pixels", you mean "side smaller than 20 pixels" for GEN4, correct?
Let me know if you can confirm or correct my assumptions.
Hello Mathias, the code we used is the one online, so if you want to measure KPIs, you should stick to it. That said, as mentioned, the values we used for those filters may not be the right ones, so feel free to adapt/change them (with the values you mention, indeed) if you want to do your own training.
Sorry to come back to this, but I have now looked at the labels in the GEN4 dataset and found labels that are completely outside the field of view.
An example file is: test/moorea_2019-02-22_000_td_1342500000_1402500000_bbox.npy
where the corresponding label to index 38249 is:
(t, x, y, w, h, class_id, class_confidence, track_id) = (52286973, -407.73276, 387.09305, 190.46196, 84.591034, 0, 0.918454, 9570)
This bounding box is completely outside the field of view. At first I thought this must be an error, but I found many such labels in the dataset.
Needless to say, it does not make sense to regress labels that are completely outside the field of view. How did you handle this in the NeurIPS work, both for training and testing?
I'll check the bounding box issue asap
If I remember correctly, those boxes are filtered out and thus ignored. As explained above, we filter them if the part of the box inside the frame (after cropping the portion outside) is below a threshold. I think it was 10 or 20 pixels.
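The crop-then-threshold rule described here could be sketched like this (assumed field names; min_side=20 corresponds to the GEN4 default quoted earlier, with 10 being the other candidate value mentioned):

```python
def keep_after_crop(box, min_side=20, frame_w=1280, frame_h=720):
    """Crop the box to the frame and keep it only if the visible part
    still has both sides of at least min_side pixels."""
    # Intersect the box with the frame rectangle.
    x0 = max(box['x'], 0.0)
    y0 = max(box['y'], 0.0)
    x1 = min(box['x'] + box['w'], float(frame_w))
    y1 = min(box['y'] + box['h'], float(frame_h))
    # Boxes fully outside the frame yield a non-positive side and are dropped.
    return (x1 - x0) >= min_side and (y1 - y0) >= min_side
```

Under this rule, the fully-out-of-frame example above is dropped (its clipped width is negative), while a box straddling the image border is kept as long as enough of it remains visible.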
ok, that seems reasonable. I am going to close it for now.
I have questions about the default filtering of bbox labels, and about bbox labels in general:
Thank you for your work!