Incorrect bounding box annotation?

Hello, I have a few questions concerning the dataset's annotation.

Is the code provided for frame extraction under this repository correct? In the training set of the BDD_PC_5K dataset, I found that extracting frames with OpenCV and with FFmpeg gives different results for some videos. In particular, for video 534, I used the bounding box provided for frame id 1171; the visualization for each library is as follows:
- OpenCV:
- FFmpeg:
I also noticed several bounding boxes that appear to be inaccurate. Is this considered noise, or is there an issue in the labeling process? In the training set of the BDD_PC_5K dataset, the bounding box for video 170, frame id 384, appears to refer to no object:
- OpenCV:
- FFmpeg:

The same issue occurs for video 2054, frame id 373. The problem appears to be limited to the last phase segment; the other segments appear to be correct.

Hi, thank you for sharing this.

1. Is the code provided for frame extraction under this repository correct? → We found that some BDD videos declare 60 fps in their metadata but are actually 30 fps videos in which each frame appears twice. For these videos, OpenCV defaults to 30 fps during frame extraction. We have already updated our frame extraction code to handle this case, so please give it a try. FFmpeg, on the other hand, reads frames at the FPS given in the metadata, so its frames will be correct. You can also use FFmpeg for frame extraction.

2. I also noticed several bounding boxes that appear to be inaccurate → We confirmed that this is caused by human error. It now affects only a limited number of BDD videos (around 0.9%) and is restricted to the last phase (phase 4). We will provide an update in the coming days.
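To make the frame-rate mismatch in item 1 concrete, here is a minimal sketch of how one might check whether a decoded stream has every frame doubled, and how a frame id would translate between the doubled (60 fps) and de-duplicated (30 fps) numbering. The helper names `is_frame_doubled`, `to_doubled_index`, and `to_deduplicated_index` are hypothetical and not part of this repository's extraction code. Frames are represented as `bytes` objects and compared for exact equality, which is an assumption; real decoded frames may need a tolerance. Which numbering the annotation frame ids follow is not stated here, so treat the mapping as illustrative only.

```python
from typing import Sequence


def is_frame_doubled(frames: Sequence[bytes]) -> bool:
    """Return True if frames (0, 1), (2, 3), ... are pairwise identical.

    Assumption: duplicated frames decode to byte-identical buffers.
    """
    if len(frames) < 2:
        return False
    pairs = len(frames) // 2
    return all(frames[2 * i] == frames[2 * i + 1] for i in range(pairs))


def to_doubled_index(frame_id: int) -> int:
    """Index of the first copy of a de-duplicated frame in the doubled stream."""
    return 2 * frame_id


def to_deduplicated_index(frame_id: int) -> int:
    """Index in the de-duplicated stream of a frame taken from the doubled stream."""
    return frame_id // 2


if __name__ == "__main__":
    # Toy stream: two distinct frames, each stored twice.
    doubled = [b"frame-a", b"frame-a", b"frame-b", b"frame-b"]
    print(is_frame_doubled(doubled))
    print(to_doubled_index(1171), to_deduplicated_index(2343))
```

If the two extraction paths disagree for a given video, a check like this on a short run of decoded frames can indicate whether doubling is the cause before comparing bounding boxes.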

Thank you.

woven-visionai / wts-dataset
