jeya-maria-jose / Medical-Transformer

Official PyTorch Code for "Medical Transformer: Gated Axial-Attention for Medical Image Segmentation" - MICCAI 2021
MIT License
791 stars, 176 forks

Why is the requires_grad of f_qr, f_kr, f_sve, f_sv False? #57

Closed. zhouweii234 closed this issue 2 years ago

zhouweii234 commented 2 years ago

Why is requires_grad set to False for f_qr, f_kr, f_sve, and f_sv? With this setting these parameters cannot be trained, but in your paper they are described as learnable.

jeya-maria-jose commented 2 years ago

The gates are made learnable after some initial epochs. Please check #16.
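For readers who want to see what "made learnable after some initial epochs" can look like in practice, here is a minimal sketch. It is not taken from the repository: the helper names `set_gates_trainable` and `unfreeze_gates_at`, and the default of 10 warm-up epochs, are assumptions for illustration only. The helpers rely solely on `named_parameters()` and the `requires_grad` attribute, so they should work with any `torch.nn.Module`.

```python
# Gate parameter names as listed in the question above.
GATE_NAMES = ("f_qr", "f_kr", "f_sve", "f_sv")


def set_gates_trainable(model, trainable=True, gate_names=GATE_NAMES):
    """Set requires_grad on every parameter whose attribute name is a gate name.

    `model` is expected to expose named_parameters(), as torch.nn.Module does.
    Returns the number of parameters that were touched.
    """
    changed = 0
    for name, param in model.named_parameters():
        # named_parameters() yields dotted paths such as "block.f_qr";
        # compare only the last component against the gate names.
        if name.rsplit(".", 1)[-1] in gate_names:
            param.requires_grad = trainable
            changed += 1
    return changed


def unfreeze_gates_at(model, epoch, warmup_epochs=10):
    """Enable gate training once `epoch` reaches `warmup_epochs`.

    warmup_epochs=10 is a hypothetical default, not a value from the thread.
    Returns the number of gate parameters enabled on this call.
    """
    if epoch == warmup_epochs:
        return set_gates_trainable(model, True)
    return 0
```

Calling `unfreeze_gates_at(model, epoch)` at the start of each epoch in the training loop would leave the gates frozen during warm-up and enable them afterwards. If the optimizer was built from `model.parameters()`, the gates are already registered with it and start receiving updates once they have gradients; an optimizer built only from parameters that had `requires_grad=True` at construction time would need the gates added separately.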