ltttpku / CMD-SE-release


Question about the zero-shot method evaluation on the SWIG dataset #2

Closed ChelsieLei closed 3 months ago

ChelsieLei commented 3 months ago

Hi @ltttpku,

Thanks for your nice work! I saw that in your supplementary materials you provide results for the HOICLIP model trained on HICO-DET and tested on SWIG-HOI. However, HOICLIP requires the pre-defined HOI classes to obtain the verb representation (shown in Fig. 4 of the HOICLIP paper), which is used in the HOI prediction. How do you deal with this problem? Thanks a lot!

ltttpku commented 3 months ago

Thanks for your interest in our work! We provide the model with predefined HOI classes during inference.

ChelsieLei commented 3 months ago

Hi, thank you for your reply. HOICLIP has a verb classifier that requires the pre-defined unseen classes during training. However, in the open-vocabulary setting, the unseen classes only become available after training and before inference. So I wonder how you solved this problem when testing the HICO-DET-trained HOICLIP on the SWIG dataset, whose classes were unseen during HOICLIP training.

ltttpku commented 3 months ago

I remember that I trained HOICLIP on the HICO-DET dataset, using all HOI classes from HICO-DET by default. During testing, I replaced the embeddings with those from SWIG-HOI.
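To illustrate the idea of replacing the class embeddings at test time, here is a minimal sketch of a CLIP-style classifier that scores a visual feature against a table of class text embeddings. It is not the HOICLIP or CMD-SE code: the function names, class names, and toy vectors are all hypothetical, and a real model would obtain the embeddings from a text encoder.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_scores(visual_feature, class_embeddings):
    """Cosine similarity between one visual feature and each class embedding."""
    v = l2_normalize(visual_feature)
    return {
        name: sum(a * b for a, b in zip(v, l2_normalize(emb)))
        for name, emb in class_embeddings.items()
    }


def predict(visual_feature, class_embeddings):
    """Return the class name whose embedding is most similar to the feature."""
    scores = cosine_scores(visual_feature, class_embeddings)
    return max(scores, key=scores.get)


# Hypothetical toy embeddings; real ones would come from a text encoder.
train_classes = {"ride bicycle": [1.0, 0.0, 0.0], "hold cup": [0.0, 1.0, 0.0]}
test_classes = {"push cart": [0.9, 0.1, 0.4], "stir soup": [0.1, 0.8, 0.6]}

feature = [0.8, 0.2, 0.5]

# Same feature, two label spaces: only the embedding table changes.
print(predict(feature, train_classes))
print(predict(feature, test_classes))
```

Because the classifier here is just a similarity lookup, the label space can be changed after training by passing a different embedding table; no weights tied to a fixed class list are involved in this sketch.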

ChelsieLei commented 3 months ago

Hi, I mean that the module circled in red requires the unseen class information. How do you use the verb classifier? The verb implementation code is here: [https://github.com/Artanic30/HOICLIP/blob/main/models/models_hoiclip/hoiclip.py#L202]

[image: screenshot with the module in question circled in red]

ltttpku commented 3 months ago

The verb classifier was discarded due to the lack of data for visual semantic arithmetic.

ChelsieLei commented 3 months ago

Thanks for the information!