Question about number of <person, predicate, object> in VidHOI dataset

BingliangLi commented 6 months ago

Thanks a lot for open-sourcing your work! I'm having some trouble with the annotation files. As stated in the ST-HOI paper, VidHOI has 557 different combinations of <person, predicate, object>. However, in your repo's vidhoi_related/dataset_info_train.json and vidhoi_related/dataset_info_train.json (and in my own count), there are more than 1000 HOI combinations. How do we explain this discrepancy, and how should I use this information? More specifically, what do 'triplet_train_hist', 'triplet_val_hist', and 'triplet_val_unique' mean? Any help is greatly appreciated!

nizhf commented 6 months ago

Thank you for your interest in our work.

First, to answer your specific question: "hist" stands for histogram. Early in this work, to validate our model faster (testing on the entire validation set took a long time), we split the training dataset into a training subset and a validation subset and counted how often each triplet appears in each. "triplet_val_unique" refers to the unique triplets in the validation subset. This split of the training dataset is not used in the final experiments: we train on the full training dataset and report results on the full validation dataset.
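As a rough illustration of what such histograms contain, here is a minimal Python sketch. The toy annotations and the helper `triplet_histogram` are hypothetical and are not taken from the repository. Reading "unique" as "the distinct triplets seen in the validation subset" is also an assumption about the file.

```python
from collections import Counter


def triplet_histogram(triplets):
    """Count how often each (person, predicate, object) triplet occurs."""
    return Counter(tuple(t) for t in triplets)


# Toy annotations, invented for illustration only (not real VidHOI data).
train_subset = [
    ("person", "ride", "bicycle"),
    ("person", "ride", "bicycle"),
    ("person", "hold", "cup"),
]
val_subset = [
    ("person", "ride", "bicycle"),
    ("person", "watch", "dog"),
]

triplet_train_hist = triplet_histogram(train_subset)
triplet_val_hist = triplet_histogram(val_subset)

# Assumed reading of "unique": the distinct triplets in the validation subset.
triplet_val_unique = sorted(triplet_val_hist)
```

The number of triplet categories in a subset is then the number of keys in its histogram, for example `len(triplet_val_hist)`.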

As for the number of triplet categories, I was also confused when I saw the counts in the training set. But if you check dataset_info_val.json, you will see there are exactly 557 triplet categories. So I think the 557 reported in the paper refers to the number of combinations in the validation dataset...
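To check this kind of mismatch yourself, one option is to compare the sets of triplet categories in two histograms. The sketch below assumes each histogram is a mapping from a triplet key to a count. The function `compare_triplet_categories`, the key format, and the toy numbers are all hypothetical, and the actual layout of the JSON files may differ.

```python
def compare_triplet_categories(train_hist, val_hist):
    """Summarize how many triplet categories two splits have and share.

    Both arguments are assumed to map a triplet key to its count;
    entries with a zero count are ignored.
    """
    train_keys = {key for key, count in train_hist.items() if count > 0}
    val_keys = {key for key, count in val_hist.items() if count > 0}
    return {
        "train": len(train_keys),
        "val": len(val_keys),
        "shared": len(train_keys & val_keys),
        "train_only": len(train_keys - val_keys),
        "val_only": len(val_keys - train_keys),
    }


# Toy histograms, invented for illustration only (not real VidHOI counts).
train_hist = {
    "person-ride-bicycle": 5,
    "person-hold-cup": 2,
    "person-watch-dog": 1,
}
val_hist = {
    "person-ride-bicycle": 3,
    "person-pet-dog": 1,
}

summary = compare_triplet_categories(train_hist, val_hist)
print(summary)
```

With real data, the dictionaries would come from parsing the JSON files (for example with `json.load`) and picking the relevant histogram entries. A larger "train" than "val" count would be consistent with the explanation above.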

BingliangLi commented 6 months ago

Thank you very much for the explanation! Very much appreciated!

nizhf / hoi-prediction-gaze-transformer

Question about number of <person, predicate, object> in VidHOI dataset #8