TACJu / TransFG

This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

[minor error] The linear layer self.out of Attention in file modeling #21

Open Serayah1376 opened 2 years ago

Serayah1376 commented 2 years ago

There may be an issue with the first argument of the linear layer self.out in the Attention class of the modeling file (line 78: self.out = Linear(config.hidden_size, config.hidden_size)). It should probably be changed to self.out = Linear(self.all_head_size, config.hidden_size), because in some cases config.hidden_size and self.all_head_size might not be equal.
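
For context, here is a minimal sketch of the relevant part of the Attention constructor, assuming the usual ViT-style setup used in modeling.py (attribute names follow that style but are abridged here). It shows why the two sizes can diverge and where the suggested change would apply:

```python
import torch.nn as nn

class Attention(nn.Module):
    """Abridged sketch of Attention.__init__ illustrating the suggested fix."""
    def __init__(self, config):
        super().__init__()
        self.num_attention_heads = config.transformer["num_heads"]
        # Integer division: if hidden_size is not divisible by num_heads,
        # all_head_size = num_heads * head_size is smaller than hidden_size.
        self.attention_head_size = config.hidden_size // self.num_attention_heads
        self.all_head_size = self.num_attention_heads * self.attention_head_size

        self.query = nn.Linear(config.hidden_size, self.all_head_size)
        self.key = nn.Linear(config.hidden_size, self.all_head_size)
        self.value = nn.Linear(config.hidden_size, self.all_head_size)

        # Original (line 78): assumes hidden_size == all_head_size
        # self.out = nn.Linear(config.hidden_size, config.hidden_size)
        # Suggested change: the input to self.out is the concatenation of the
        # per-head context vectors, whose width is all_head_size.
        self.out = nn.Linear(self.all_head_size, config.hidden_size)
```

With a config like ViT-B/16 (hidden_size 768, 12 heads) the two values coincide, so the current code happens to work; the change only matters when hidden_size is not a multiple of num_heads.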