VisDrone Video data annotation format

aarti-b commented 9 months ago

I was trying to plot the ground-truth labels on the VisDrone video dataset. An annotation line looks like this:

VisDrone Video Detection test-dev set: ann = 98, 0, 808, 1, 47, 22, 1, 4, 0, 0

I am familiar with the DET format. In this VID format, ann[2] to ann[5] are the bbox and ann[6] is the category.

Is ann[7] the score?
ann[8] = truncation?
ann[9] = occlusion?

I assumed ann[0] was an index and ann[1] the frame. But when I plotted the ground truth on the images, ann[0] turned out to be the frame, and now I am not sure what ann[1] is.

Could you please clarify what the annotation fields mean? Thanks.
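
For anyone reproducing this, here is a minimal sketch of splitting one annotation line into its ten integer fields. The field names only reflect the guesses in this post: ann[0] as the frame, and ann[2] to ann[5] as the box in (left, top, width, height) order, which is itself an assumption. `parse_annotation_line` is a hypothetical helper, not part of any VisDrone toolkit.

```python
def parse_annotation_line(line):
    """Split one comma-separated annotation line into a list of ints."""
    # int() tolerates the stray spaces around each value.
    return [int(tok) for tok in line.split(",") if tok.strip()]


ann = parse_annotation_line("98 ,0 ,808 ,1 ,47 ,22 ,1 ,4 , 0, 0")

frame = ann[0]                       # matched the frame when plotting
left, top, width, height = ann[2:6]  # assumed (left, top, width, height) order
x2, y2 = left + width, top + height  # opposite corner, handy for drawing a box

print(frame, (left, top, x2, y2))
```

The remaining fields (ann[1] and ann[6] to ann[9]) are left unnamed here because their meaning is the open question.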

aarti-b commented 9 months ago

We did some brainstorming and found that ann[7] is the class category, with: 0 - group, 1 - walking people, 2 - stationary people, 3 - bicycles, 4 - cars, 5 - van, 10 - tryvehicle.

The rest are yet to be determined. Please help us understand ann[1].
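
The partial mapping can be kept as a small lookup table for labelling plotted boxes. This is only a sketch of the findings so far: the names are guesses from visual inspection rather than an official list, ids not yet identified fall back to a placeholder, and `CATEGORY_GUESS` and `category_name` are hypothetical names.

```python
# Partial mapping for ann[7], as guessed in this thread (not an official list).
CATEGORY_GUESS = {
    0: "group",
    1: "walking people",
    2: "stationary people",
    3: "bicycles",
    4: "cars",
    5: "van",
    10: "tryvehicle",
}


def category_name(ann):
    """Return the guessed label for ann[7], or a placeholder if unknown."""
    return CATEGORY_GUESS.get(ann[7], f"unknown({ann[7]})")


ann = [98, 0, 808, 1, 47, 22, 1, 4, 0, 0]
print(category_name(ann))
```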

littlewuuu commented 7 months ago

I tried visualizing the annotations and found that ann[1] might be the object number within the frame.
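
One way to test this guess is to check whether any ann[1] value repeats within a single frame. The sketch below assumes ann[0] is the frame, as reported earlier in the thread. `ids_unique_per_frame` is a hypothetical helper, and the rows are made up for illustration rather than taken from the dataset.

```python
from collections import defaultdict


def ids_unique_per_frame(rows):
    """Return True if no ann[1] value repeats within the same ann[0] frame."""
    seen = defaultdict(set)
    for ann in rows:
        frame, obj = ann[0], ann[1]
        if obj in seen[frame]:
            return False
        seen[frame].add(obj)
    return True


# Made-up rows for illustration only; real rows come from an annotation file.
rows = [
    [1, 0, 10, 10, 5, 5, 1, 4, 0, 0],
    [1, 1, 30, 12, 6, 6, 1, 4, 0, 0],
    [2, 0, 11, 10, 5, 5, 1, 4, 0, 0],
]
print(ids_unique_per_frame(rows))
```

If the check holds on real files, ann[1] at least behaves like a per-frame object number; it does not by itself show whether the same number follows one object across frames.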

AneraSong commented 7 months ago

Maybe you can refer to this issue? https://github.com/VisDrone/VisDrone-Dataset/issues/34

VisDrone Video data annotation format #40