gamma1 in fan.py

xjtuwh commented 2 years ago

In fan.py, line 536 is x = x + self.drop_path(self.gamma1 * x_new) and line 539 is x = x + self.drop_path(self.gamma2 * x_new). Could you explain self.gamma1 and self.gamma2, which are not introduced in the paper (arXiv:2204.12451v2)? In addition, could you share the published ICML version of the paper?

zhoudaquan commented 2 years ago

Hi,

The gamma parameters are used to stabilize training for large models, following the same practice as CaiT (https://arxiv.org/abs/2103.17239).
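For readers unfamiliar with the CaiT practice (called LayerScale in that paper), here is a minimal PyTorch sketch of the idea: each residual branch is multiplied by a learnable per-channel vector before it is added back to the input. This is an illustration under assumptions, not the code in fan.py: the class name LayerScaleBlock, the init_eta argument and its value of 1e-4, and the use of nn.MultiheadAttention are all hypothetical choices, and drop_path is left out for brevity.

```python
import torch
import torch.nn as nn


class LayerScaleBlock(nn.Module):
    """Illustrative transformer block with per-channel residual scaling.

    Hypothetical sketch; not the FAN block from fan.py.
    """

    def __init__(self, dim, num_heads=4, init_eta=1e-4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        # Stand-in attention layer (assumption); FAN uses its own attention.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        # One learnable scale per channel for each residual branch.
        # init_eta is an assumed initial value chosen for this example.
        self.gamma1 = nn.Parameter(init_eta * torch.ones(dim))
        self.gamma2 = nn.Parameter(init_eta * torch.ones(dim))

    def forward(self, x):
        y = self.norm1(x)
        x_new, _ = self.attn(y, y, y, need_weights=False)
        x = x + self.gamma1 * x_new  # scaled attention branch
        x_new = self.mlp(self.norm2(x))
        x = x + self.gamma2 * x_new  # scaled MLP branch
        return x


torch.manual_seed(0)
block = LayerScaleBlock(dim=32)
x = torch.randn(2, 10, 32)  # (batch, tokens, channels)
out = block(x)
```

With a small initial scale, each block starts out close to the identity mapping, and the network learns during training how much each branch should contribute.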

We are preparing the ICML camera-ready version, which should be ready in 1-2 days, and we will also add these details to the next arXiv version.

xjtuwh commented 2 years ago

Thank you very much.

NVlabs / FAN

gamma1 in fan.py #9