fredzzhang / upt

[CVPR'22] Official PyTorch implementation for paper "Efficient Two-Stage Detection of Human–Object Interactions with a Novel Unary–Pairwise Transformer"
https://fredzzhang.com/unary-pairwise-transformers
BSD 3-Clause "New" or "Revised" License

A question about v-coco dataset #65

Closed ltttpku closed 1 year ago

ltttpku commented 1 year ago

Hi, thank you for your code. I have a question about your implementation of the V-COCO dataset.

V-COCO is a subset of the COCO dataset and has 10,396 images (5,400 for training and 4,964 for testing), as stated in the paper. However, when running your code on V-COCO, I found `len(trainset) == 4969` and `len(testset) == 4532`. The reported image counts do not match the actual ones.

fredzzhang commented 1 year ago

Hi @ltttpku,

If I remember correctly, some images lack sufficient bounding box annotations, so I removed them. The same is true of HICO-DET: the authors report 38,118 training images, but only 37,633 of them have bounding box annotations.

Fred.
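For readers who want to see how this kind of filtering changes `len(dataset)`, here is a minimal sketch. It is not the repository's actual code: the annotation format (a list of dicts with a `boxes` key) and the helper name `filter_annotated` are assumptions made up for illustration.

```python
from typing import Dict, List


def filter_annotated(annotations: List[Dict]) -> List[int]:
    """Return the indices of images that have at least one bounding box.

    Hypothetical helper: assumes each entry stores its boxes under "boxes".
    Entries with an empty or missing "boxes" field are dropped.
    """
    return [
        idx for idx, anno in enumerate(annotations)
        if len(anno.get("boxes", [])) > 0
    ]


# Toy annotation list in the assumed format (not real V-COCO data).
annotations = [
    {"file_name": "a.jpg", "boxes": [[0, 0, 10, 10], [5, 5, 20, 20]]},
    {"file_name": "b.jpg", "boxes": []},           # no boxes: dropped
    {"file_name": "c.jpg", "boxes": [[1, 2, 3, 4]]},
    {"file_name": "d.jpg"},                        # field missing: dropped
]

keep = filter_annotated(annotations)

# The dataset length after filtering is the number of kept indices,
# which is smaller than the number of images listed in the split.
print(f"listed: {len(annotations)}, usable: {len(keep)}")
```

Going by the numbers quoted in this thread, filtering of this kind would account for 431 missing V-COCO training images (5,400 vs. 4,969), 432 missing test images (4,964 vs. 4,532), and 485 missing HICO-DET training images (38,118 vs. 37,633).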

ltttpku commented 1 year ago

ohh, I see. Thanks!