YueLiao / PPDM

Code for "PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection".
MIT License

Custom Dataset - Meaning of "subject_id" and "hoi_category_id" #57

Open Gaussianer opened 2 years ago

Gaussianer commented 2 years ago

Hello, I would like to train your model on custom data. While creating the annotations, I looked at your annotation files and found the following: in the "annotations" field, each entry holds a bounding box and a category_id. The "bbox" presumably belongs to one annotated subject or object. As I understand it, the category_id of each entry in "annotations" refers to the word list you posted here. Can you confirm my assumptions so far?

Furthermore I have a question about the field "hoi_annotation". What is the meaning of subject_id, object_id and hoi_category_id in each HOI annotation?

I have attached a sample annotation from the trainval_hico.json file and added comments on the values whose origin is unclear to me:

    {
        "file_name": "HICO_train2015_00000009.jpg",
        "img_id": 9,
        "annotations": [
            {  // Index 0
                "bbox": [
                    190,
                    101,
                    290,
                    305
                ],
                "category_id": 1 // 1 = 'person' - The ID comes from the word list
            },
            { // Index 1
                "bbox": [
                    210,
                    99,
                    431,
                    335
                ],
                "category_id": 8 // 8 = 'truck'  - The ID comes from the word list
            },
            { // Index 2
                "bbox": [
                    339,
                    93,
                    597,
                    406
                ],
                "category_id": 1 // 1 = 'person'  - The ID comes from the word list
            }
        ],
        "hoi_annotation": [
            {
                "subject_id": 0, // Reference to the element of "annotations" with index 0
                "object_id": 1,  // Reference to the element of "annotations" with index 1
                "category_id": 53, // Verb category - According to the word list, the verb with ID = 53 = "load"
                "hoi_category_id": 571 // HOI triplet category
            },
            {
                "subject_id": 2, // Reference to the element of "annotations" with index 2
                "object_id": 1, // Reference to the element of "annotations" with index 1
                "category_id": 53, // Verb category - According to the word list, the verb with ID = 53 = "load"
                "hoi_category_id": 571 // HOI triplet category
            }
        ]
    }

The corresponding image with the visualized annotation looks like this: [image]

YueLiao commented 2 years ago

Hi, subject_id and object_id are indices into the "annotations" list of boxes. The category_id in "annotations" is the category of that box. In "hoi_annotation", category_id and hoi_category_id are the verb category and the HOI triplet category, respectively, but we only use the verb category.
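As an illustration of the indexing described above, here is a minimal Python sketch that resolves each HOI annotation to its subject box, object box, and verb category. It uses the sample entry from the question with the `//` comments stripped so that it parses as JSON. The helper name `resolve_hois` and the keys of the returned dictionaries are hypothetical and not part of the PPDM code base; the sketch also makes no assumption about the bbox coordinate convention or about how the category IDs map to the word list.

```python
import json

# Sample entry from the question, with the // comments removed (valid JSON).
entry = json.loads("""
{
  "file_name": "HICO_train2015_00000009.jpg",
  "img_id": 9,
  "annotations": [
    {"bbox": [190, 101, 290, 305], "category_id": 1},
    {"bbox": [210, 99, 431, 335], "category_id": 8},
    {"bbox": [339, 93, 597, 406], "category_id": 1}
  ],
  "hoi_annotation": [
    {"subject_id": 0, "object_id": 1, "category_id": 53, "hoi_category_id": 571},
    {"subject_id": 2, "object_id": 1, "category_id": 53, "hoi_category_id": 571}
  ]
}
""")


def resolve_hois(entry):
    """Hypothetical helper: expand each HOI into its boxes and verb category.

    subject_id and object_id are treated as 0-based positions in the
    entry's "annotations" list, as described in the reply above.
    """
    boxes = entry["annotations"]
    resolved = []
    for hoi in entry["hoi_annotation"]:
        # Reject indices that do not point at an existing box.
        for key in ("subject_id", "object_id"):
            if not 0 <= hoi[key] < len(boxes):
                raise IndexError(f"{key}={hoi[key]} is outside the annotations list")
        subject = boxes[hoi["subject_id"]]
        obj = boxes[hoi["object_id"]]
        resolved.append({
            "subject_bbox": subject["bbox"],
            "subject_category_id": subject["category_id"],
            "object_bbox": obj["bbox"],
            "object_category_id": obj["category_id"],
            # Within "hoi_annotation", category_id is the verb category.
            "verb_category_id": hoi["category_id"],
        })
    return resolved


for triplet in resolve_hois(entry):
    print(triplet)
```

The range check may be useful when writing custom annotation files by hand, since a subject_id or object_id that does not match a position in "annotations" would otherwise only surface later as an error or a wrong pairing.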

Gaussianer commented 2 years ago

Hi @YueLiao, thank you for your quick reply. I think I understand it now. I have added comments to the code block above to make it easier to follow, and I hope that others will benefit from this as well.

Do you have an example for the HOI triplet category? Possibly also a word list?

Thank you

SWT-1014 commented 11 months ago

Hello, I still have some questions about category_id, because after I checked the word list, the ID corresponding to person does not seem to be 1, and the ID corresponding to truck is not 8. Which word list are you referring to? Thanks!