ByChelsea / VAND-APRIL-GAN

[CVPR 2023 Workshop] VAND Challenge: 1st Place on Zero-shot AD and 4th Place on Few-shot AD

initialized by other weight #28

Closed: wangyf8848 closed this issue 6 months ago

wangyf8848 commented 6 months ago

Hello, may I ask whether you think the linear layers applied to the network's intermediate layers learn the alignment with the text features from scratch? Also, have you tried training the framework with a different well-pretrained feature extractor, one that was not trained with CLIP?

ByChelsea commented 6 months ago

In fact, I cannot say for certain how the intermediate layers behave, but what I can confirm is that fine-tuning is very useful. So you can also try other feature extractors, such as DINO.
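For readers following the thread, here is a minimal NumPy sketch of the idea under discussion: a trainable linear layer projects intermediate patch features from some backbone into the text embedding space, and each patch is then scored by its similarity to "normal" and "abnormal" text embeddings. This is not the repository's code. The function name `patch_anomaly_scores`, the feature sizes, the random stand-in features and weights, and the temperature of 0.07 are all assumptions chosen for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def patch_anomaly_scores(patch_feats, weight, bias, text_embeds, temperature=0.07):
    """Score each patch as abnormal, given a linear projection into text space.

    patch_feats: (num_patches, d_in) intermediate features from any backbone.
    weight, bias: parameters of the linear layer, shapes (d_in, d_text) and (d_text,).
    text_embeds: (2, d_text), rows ordered as [normal, abnormal].
    Returns a (num_patches,) array of abnormal probabilities.
    """
    # Linear layer: map backbone features into the text embedding space.
    projected = l2_normalize(patch_feats @ weight + bias)
    text = l2_normalize(text_embeds)

    # Cosine similarity to each text embedding, sharpened by the temperature.
    logits = projected @ text.T / temperature

    # Numerically stable softmax over the two classes.
    logits = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    return probs[:, 1]


# Hypothetical sizes: 16 patches, 384-d backbone features, 512-d text space.
rng = np.random.default_rng(0)
num_patches, d_in, d_text = 16, 384, 512

patch_feats = rng.normal(size=(num_patches, d_in))    # stand-in for backbone output
weight = rng.normal(scale=0.02, size=(d_in, d_text))  # untrained linear layer
bias = np.zeros(d_text)
text_embeds = rng.normal(size=(2, d_text))            # stand-in for text encoder output

scores = patch_anomaly_scores(patch_feats, weight, bias, text_embeds)
print(scores.shape)
```

With a backbone that was never trained jointly with a text encoder, `d_in` and `d_text` generally differ and the two spaces are unrelated at initialization, so in this sketch the linear layer is the only place the alignment could be learned.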

wangyf8848 commented 6 months ago

Ok, thank you very much for your reply!