danfenghong / IEEE_TPAMI_SpectralGPT

Hong, D., Zhang, B., Li, X., Li, Y., Li, C., Yao, J., Yokoya, N., Li, H., Ghamisi, P., Jia, X., Plaza, A., Gamba, P., Benediktsson, J., Chanussot, J. (2024). SpectralGPT: Spectral remote sensing foundation model. IEEE Transactions on Pattern Analysis and Machine Intelligence. DOI: 10.1109/TPAMI.2024.3362475.

Attention Heads in VisionTransformer #18

Open · TheRaez opened this issue 3 months ago

TheRaez commented 3 months ago

Hello, impressive work.

I am trying to run inference with the models, taking inspiration from the code in main_finetune.py.

I am loading the model as follows:

```python
model = models_vit.__dict__["vit_base_patch16"](
    patch_size=patch_size,
    img_size=input_size,
    in_chans=in_c,
    num_classes=nb_classes,
    drop_path_rate=drop_path,
    global_pool=global_pool,
)
```

Still, when I try to instantiate the model, I keep getting this error:

```
AttributeError: 'VisionTransformer' object has no attribute 'attention_heads'
```

I guess the culprit is around line 29 of models_vit.py:

```python
self.attn_bias = get_alibi(attention_heads=self.attention_heads,
                           num_patches=self.num_patches)
```

I can't find any reference to this self.attention_heads anywhere in the repository. Did you mean self.num_heads, which is the attribute the attention layers actually use?
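
For reference, this is the one-line change I would expect to fix it (untested; it assumes self.num_heads is set on the VisionTransformer, as in the timm implementation, and that get_alibi keeps its current signature):

```python
# models_vit.py, around line 29: pass the heads count that actually
# exists on the module (hypothetical fix, not confirmed by the authors)
self.attn_bias = get_alibi(attention_heads=self.num_heads,
                           num_patches=self.num_patches)
```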

moonboy12138 commented 3 months ago

Thanks for your interest in our work. You should run inference with the appropriate model, which is models_vit_tensor rather than models_vit.
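
Something along these lines should be close to what main_finetune.py does (the factory entry name "vit_base_patch16" and the keyword arguments below are placeholders; check models_vit_tensor for the entries and signatures it actually defines):

```python
import models_vit_tensor  # from this repository

# Placeholder entry name and kwargs; use whichever factory
# models_vit_tensor actually defines for your configuration.
model = models_vit_tensor.__dict__["vit_base_patch16"](
    img_size=input_size,
    in_chans=in_c,
    num_classes=nb_classes,
    drop_path_rate=drop_path,
    global_pool=global_pool,
)
```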

TheRaez commented 3 months ago

Thank you for your quick response.

I will try models_vit_tensor instead. Could you clarify the main difference between models_vit_tensor and plain models_vit?

Also, are the weights in the SpectralGPT+.pth file only applicable to models_vit_tensor, or do they also apply to other models, such as the ones found in models_mae_spectral?

moonboy12138 commented 3 months ago

It is also applicable to the model in models_mae_spectral that is used for pretraining. To understand the differences between models_vit_tensor and models_vit, I recommend reviewing our paper and the associated code.
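
For completeness, a minimal checkpoint-loading sketch in the spirit of main_finetune.py, assuming model was built as shown above (the checkpoint key "model" and strict=False follow the usual MAE-style convention; the file path is a placeholder):

```python
import torch

# Placeholder path; point this at your local copy of the weights.
checkpoint = torch.load("SpectralGPT+.pth", map_location="cpu")

# MAE-style checkpoints usually nest the weights under "model";
# fall back to the raw dict if this checkpoint is flat.
state_dict = checkpoint.get("model", checkpoint)

# strict=False tolerates keys (e.g. heads or decoder layers) that
# differ between the pretraining and fine-tuning architectures.
msg = model.load_state_dict(state_dict, strict=False)
print(msg.missing_keys, msg.unexpected_keys)
```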