jeya-maria-jose / Medical-Transformer

Official Pytorch Code for "Medical Transformer: Gated Axial-Attention for Medical Image Segmentation" - MICCAI 2021
MIT License
799 stars, 176 forks

Gated Mechanism #70

Closed QiaoSiBo closed 2 years ago

QiaoSiBo commented 2 years ago

Why are the gate parameters all created with requires_grad=False?

    # Priority on encoding

    ## Initial values

    self.f_qr = nn.Parameter(torch.tensor(0.1), requires_grad=False)
    self.f_kr = nn.Parameter(torch.tensor(0.1), requires_grad=False)
    self.f_sve = nn.Parameter(torch.tensor(0.1), requires_grad=False)
    self.f_sv = nn.Parameter(torch.tensor(1.0), requires_grad=False)

jeya-maria-jose commented 2 years ago

They are not trained for the first few epochs; they are switched to requires_grad = True after a few initial epochs.
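
The schedule described above can be sketched roughly as follows. This is a minimal illustration, not the repository's training code: `GatedToy`, `set_gates_trainable`, the `f_` name prefix filter, and the `UNFREEZE_EPOCH` value are all hypothetical and chosen only for the example. It relies on the fact that a PyTorch optimizer skips parameters that have no gradient, so the gates can be registered with the optimizer from the start and begin updating once `requires_grad` is flipped.

```python
import torch
import torch.nn as nn


class GatedToy(nn.Module):
    """Hypothetical stand-in for a gated layer; not the repository's module."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 4)
        # The gate starts frozen, as in the snippet from the question.
        self.f_sv = nn.Parameter(torch.tensor(1.0), requires_grad=False)

    def forward(self, x):
        return self.f_sv * self.proj(x)


def set_gates_trainable(model, trainable=True, prefix="f_"):
    """Toggle requires_grad on parameters whose own name starts with `prefix`."""
    for name, param in model.named_parameters():
        if name.split(".")[-1].startswith(prefix):
            param.requires_grad = trainable


torch.manual_seed(0)
model = GatedToy()
# The optimizer receives every parameter; frozen ones have no gradient and are skipped.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
UNFREEZE_EPOCH = 2  # assumed value, for illustration only
gate_history = []

for epoch in range(4):
    if epoch == UNFREEZE_EPOCH:
        set_gates_trainable(model, True)
    x = torch.randn(8, 4)
    loss = model(x).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    gate_history.append(model.f_sv.item())
```

In this toy run the gate value stays at its initial 1.0 during the frozen epochs and only starts moving after the unfreeze step. If the optimizer were instead built from a filtered list of trainable parameters, the gates would need to be added to it when they are unfrozen.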

QiaoSiBo commented 2 years ago

Thanks a lot.