trinhxuankhai closed this issue 9 months ago
Sorry, this was my mistake. The problem went away after I switched to ffmpeg for frame extraction. There may be a bug in the provided frame-extraction code that uses OpenCV.
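For anyone who runs into the same misalignment, here is a minimal sketch of how a single frame could be pulled out with the ffmpeg command-line tool from Python. It is not the repository's extraction script: the function names, file paths, and the choice to address frames by a zero-based decoded-frame index are all assumptions made for illustration, and the index convention should be checked against the dataset's annotation files.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_frame_cmd(video_path: str, frame_index: int, out_path: str) -> List[str]:
    """Build an ffmpeg command that writes one decoded frame to an image file.

    frame_index is assumed to be zero-based in decode order (hypothetical
    convention; verify it against the annotation format you are using).
    """
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    return [
        "ffmpeg",
        "-y",                  # overwrite the output file if it already exists
        "-i", video_path,
        # The select filter keeps only the frame whose index n matches.
        # The comma is escaped so it is not parsed as a filter separator.
        "-vf", f"select=eq(n\\,{frame_index})",
        "-frames:v", "1",      # stop after writing a single frame
        out_path,
    ]


def extract_frame(video_path: str, frame_index: int, out_path: str) -> None:
    """Run ffmpeg to extract one frame; requires ffmpeg on the PATH."""
    Path(out_path).parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_frame_cmd(video_path, frame_index, out_path), check=True)
```

Calling `extract_frame("video.mp4", 0, "frames/000000.png")` (hypothetical paths) would write the first decoded frame. Drawing the annotated box on that image and comparing it with the OpenCV-extracted frame at the same index is one way to see whether the two extractors disagree about frame numbering.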
In the training set of the BDD_PC_5K dataset, I found some bounding box annotations that appear to be incorrect. For example, in video 534 of the training set, the pedestrian caption reads "The pedestrian, a male in his 30s with a height of 190 cm or more ...", but when I visualize the bounding box on the first frame of the prerecognition phase, it does not align with the pedestrian the caption refers to.