raoyongming / DenseCLIP

[CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

Question about implementation of CLIPResNet #29

Closed by littlepenguin89106 1 year ago

littlepenguin89106 commented 1 year ago

Hi, the modified ResNet in CLIP uses attention pooling, and the comments in the CLIPResNet class also note that. However, I didn't see any related operations in CLIPResNet. I know there is another class, CLIPResNetWithAttention, but according to the configs, I think it is for DenseCLIP, not CLIP?

raoyongming commented 1 year ago

Hi @littlepenguin89106, sorry for the confusion. The pretrained CLIP ResNet models are based on the CLIPResNetWithAttention class. CLIPResNet is a modified version without attention pooling that was only used in our early experiments to verify whether attention pooling is necessary.
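
For readers unfamiliar with the term, here is a rough, dependency-free sketch of what attention pooling does compared with plain average pooling. It is not the code from this repo or from CLIP: the function names `attention_pool` and `average_pool` are made up for illustration, and it assumes a single attention head with no learned query/key/value projections and no positional embeddings, all of which the real module (`AttentionPool2d` in the original CLIP code) does include.

```python
import math


def average_pool(tokens):
    """Plain global average pooling over a list of token vectors."""
    n, dim = len(tokens), len(tokens[0])
    return [sum(t[d] for t in tokens) / n for d in range(dim)]


def attention_pool(tokens):
    """Pool token vectors into one vector with single-head attention.

    Simplified sketch (assumption): the query is the mean token, and the
    learned projections and positional embeddings are omitted.
    """
    dim = len(tokens[0])
    # The mean token acts as the query and is also prepended to the keys.
    query = average_pool(tokens)
    keys = [query] + tokens
    # Scaled dot-product score between the query and every key.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in keys
    ]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the values (here the values are the keys themselves).
    return [sum(w * key[d] for w, key in zip(weights, keys)) for d in range(dim)]


# A 2x2 feature map with 2 channels, flattened to 4 spatial tokens.
feature_map = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [0.0, 0.0]]
pooled_avg = average_pool(feature_map)
pooled_attn = attention_pool(feature_map)
```

In this toy input, the attention-pooled vector leans toward the strongest token `[2.0, 2.0]`, whereas average pooling weights all four positions equally. That difference is what an experiment with and without attention pooling would be probing.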

littlepenguin89106 commented 1 year ago

OK, thanks for your reply!