zhihou7 / HOI-CL

Series of work (ECCV2020, CVPR2021, CVPR2021, ECCV2022) about Compositional Learning for Human-Object Interaction Exploration
https://sites.google.com/view/hoi-cl
MIT License

Ablation Study in FCL #11

Closed. RawFisher closed this issue 2 years ago.

RawFisher commented 2 years ago

Hi! Thanks for your excellent work! I am confused about the ablation study. Could you explain what each noise_type value means, and which one stands for the verb fabricator? (noise_type = 0, 2, 3, 8, 4, 5, 7, 6) https://github.com/zhihou7/HOI-CL/blob/b9ed42cbdde9fffe1d436068793db392f9011a79/lib/networks/Fabricator.py#L160-L182

zhihou7 commented 2 years ago

Sorry for the confusing code; it includes some redundancy. To keep the input dimension of the Fabricator unchanged, we append an empty (all-zero) variable to the input. However, I think this is equivalent to removing the empty variable entirely, because the optimizer never trains the corresponding weights (those weights are dead due to the ReLU).

I have updated the comments in the code. Frankly, some of these ablation variants are not very meaningful.

    noise_type = 0
    if self.network.model_name.__contains__('_woa_'):
        # with verb
        noise_type = 2
    elif self.network.model_name.__contains__('_won_'):
        # no noise, with an empty variable to keep the FC input dimension unchanged
        noise_type = 3
    elif self.network.model_name.__contains__('_won1_'):
        # no noise
        noise_type = 8
    elif self.network.model_name.__contains__('_n1_'):
        # we use positive noise, because the verb representation comes after a ReLU.
        # This turned out to be useless; I cannot recall why I tried it.
        noise_type = 4
    elif self.network.model_name.__contains__('_woa1_'):
        # without verb, but we add a placeholder variable to keep the FC input dimension unchanged.
        noise_type = 5
    elif self.network.model_name.__contains__('_woa2_'):
        # without verb, but we add a duplicated word embedding to keep the FC input dimension unchanged.
        # However, this has a small bug: the dimensions of the word embedding and the verb representation differ.
        noise_type = 7
    elif self.network.model_name.__contains__('_woo_'):
        # without object; this is the verb fabricator.
        noise_type = 6
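To see why the all-zero placeholder is effectively a no-op, note that for a linear layer the weight gradient is an outer product with the input, so the columns that multiply the zero entries never receive any gradient. A minimal NumPy sketch (the dimensions and names here are illustrative, not taken from the repo):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy FC input: [verb feature (3 dims) | zero placeholder (2 dims)]
verb = rng.normal(size=3)
x = np.concatenate([verb, np.zeros(2)])  # placeholder keeps the FC input dim fixed

W = rng.normal(size=(4, 5))
y = W @ x

# For y = W x, dL/dW = (dL/dy) outer x. Columns of W that multiply the
# zero placeholder therefore receive exactly zero gradient and never train.
g_y = rng.normal(size=4)   # stand-in for dL/dy from any loss
dW = np.outer(g_y, x)

print(np.allclose(dW[:, 3:], 0))  # True: placeholder columns get no gradient
```

So padding with zeros and dropping the padded dimensions lead to the same trained function, which is why the author treats the two as equivalent.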
RawFisher commented 2 years ago

Thanks for your reply! I have one more question about the verb fabricator. What does the verb fabricator mean? Is it just removing the object identity embedding in the fabricated compositional branch, or is it feeding the verb identity embedding, noise, and the verb into the fabricator to generate fake verb features? In the paper, are row 4 in Table 4 and row 4 in Table 19 the same setting?

zhihou7 commented 2 years ago

It is the latter: we feed the verb identity embedding, noise, and the verb into the fabricator to generate fake verb features. Experimentally, this still helps a little. We think balancing the data distribution by fabricating features is what matters most.

No. FCL + verb fabricator uses the object fabricator and the verb fabricator jointly, while verb fabricator means we use only the verb fabricator. In fact, the verb fabricator alone still benefits the zero-shot setting, but brings only limited improvement in the long-tailed setting (the full setting in Table 19). However, when we use the verb fabricator and the object fabricator jointly, we did not observe further improvement over the object fabricator alone (row 4 in Table 4).

Hope this response is helpful.

Regards,

RawFisher commented 2 years ago

Thank you very much for your answer! It solves my question! While reading the code, I ran into another question about the object identity embedding. https://github.com/zhihou7/HOI-CL/blob/735c6940b75c504399ece9e801eeb25924f2da21/lib/networks/Fabricator.py#L22-L31 In the fabricator, the object identity embedding is randomly initialized and trainable (line 24). Does that mean the object identity embedding is updated during training? (I am not familiar with TensorFlow.) I thought the object identity embedding was produced offline, like a one-hot or word embedding, and would not be updated during training. Maybe I misunderstood?

zhihou7 commented 2 years ago

Yes, it is updated end-to-end during training (I also emphasize this in the paper). That is what makes it different from a one-hot or word embedding.

RawFisher commented 2 years ago

Thank you for answering patiently. Sincere thanks.