Hi, thanks for your interest in our work. We use this trick following the implementation of WS-DAN. It improves the final performance while introducing no significant extra computation. There is some discussion of this detail in the WS-DAN repo (https://github.com/GuYuc/WS-DAN.PyTorch/issues/1).
Thanks for your reply; I have another question about the paper. If we want to quantify the effect of the attention map and use it to optimize the learning process, why do we need a counterfactual attention instead of simply no attention? I'm a little confused about the necessity of the counterfactual attention.
Interesting question! Since we are considering attention-based models here, "no attention" can actually also be regarded as a type of counterfactual attention: it is the "uniform attention" compared in Table 4. We find that the uniform counterfactual attention achieves performance comparable to the "random attention" used in most of our experiments.
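Roughly, the comparison looks like the sketch below. This is a minimal illustration only, assuming a bilinear-attention-pooling style model; `bilinear_attention_pool` and `classifier` are hypothetical names, not this repo's actual API. The effect of the learned attention is measured as the prediction under the actual attention maps minus the prediction under a counterfactual attention (random here; uniform, e.g. all-ones, is the other choice discussed above):

```python
import torch

def bilinear_attention_pool(features, attention):
    # features: (B, C, H, W), attention: (B, M, H, W)
    # Pool the feature map with each attention map -> (B, M, C), then flatten.
    B, C, H, W = features.shape
    M = attention.shape[1]
    pooled = torch.einsum('bmhw,bchw->bmc', attention, features) / (H * W)
    return pooled.reshape(B, M * C)

def counterfactual_effect(features, attention, classifier):
    """Effect of the learned attention = prediction with the actual
    attention minus prediction with a counterfactual attention."""
    # Factual branch: the attention produced by the model.
    y_fact = classifier(bilinear_attention_pool(features, attention))

    # Counterfactual branch: random attention. A uniform counterfactual
    # would use torch.ones_like(attention) instead.
    random_att = torch.rand_like(attention)
    y_cf = classifier(bilinear_attention_pool(features, random_att))

    return y_fact - y_cf
```

Supervising this effect (rather than only the factual prediction) is what pushes the attention to be genuinely useful for the decision instead of merely correlated with it.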
Thank you very much! I have just started working on fine-grained classification and I find your work very interesting. Maybe I will try this method on my dataset later. Thank you again!
Sorry to bother you again, but I have a question about the FGVC part: why do the normalized feature_matrix and feature_matrix_hat need to be multiplied by 100 before the fc layer?
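For context, a minimal sketch of the operation being asked about, assuming a WS-DAN-style pipeline; the shapes and the `fc` layer below are hypothetical. A common explanation (not confirmed in this thread) is that after sign-sqrt and L2 normalization each feature row has unit norm and very small per-element magnitudes, so rescaling by a constant keeps the fc logits in a reasonable range:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical shapes: feature_matrix is the (B, M*C) output of
# bilinear attention pooling; fc maps it to class logits.
B, M, C, num_classes = 8, 32, 768, 200
feature_matrix = torch.randn(B, M * C)
fc = nn.Linear(M * C, num_classes)

# Sign-sqrt followed by L2 normalization squashes the features to
# small magnitudes (each row has unit norm).
feature_matrix = torch.sign(feature_matrix) * torch.sqrt(feature_matrix.abs() + 1e-12)
feature_matrix = F.normalize(feature_matrix, dim=-1)

# Rescaling by a constant (the 100 in the question) before the fc layer
# restores a larger dynamic range for the logits.
logits = fc(feature_matrix * 100.0)
```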