stoneyang159 closed this issue 3 years ago
A bit late, but I had the same problem and I think I found out why: the last weight matrix of the model is never initialized, so it holds whatever values happen to be in the allocated memory.
In src/models/resnet.py, line 31, it is created as:
self.w = nn.Parameter(torch.Tensor(2048, n_class))
Initializing it like the torch.nn.Linear weights:
self.w = torch.empty(2048, n_class)
nn.init.uniform_(self.w, -math.sqrt(1/2048), math.sqrt(1/2048))
self.w = nn.Parameter(self.w)
solved the problem for me.
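For anyone who wants to try this outside the repo, here is a minimal self-contained sketch of the same fix. The `Head` class and its `in_features` argument are hypothetical stand-ins I made up for illustration, not the actual code in src/models/resnet.py; only the three initialization lines mirror the snippet above.

```python
import math

import torch
import torch.nn as nn


class Head(nn.Module):
    """Hypothetical stand-in for the final layer discussed above."""

    def __init__(self, n_class: int, in_features: int = 2048):
        super().__init__()
        # torch.empty, like torch.Tensor(rows, cols), only allocates memory;
        # the contents are arbitrary until we fill them explicitly.
        w = torch.empty(in_features, n_class)
        # Same uniform range as in the fix above: +/- sqrt(1 / in_features).
        bound = math.sqrt(1.0 / in_features)
        nn.init.uniform_(w, -bound, bound)
        self.w = nn.Parameter(w)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, in_features); output has shape (batch, n_class).
        return x @ self.w


if __name__ == "__main__":
    head = Head(n_class=10)
    # Every weight should be finite after explicit initialization.
    print(torch.isfinite(head.w).all().item())
    print(head(torch.randn(4, 2048)).shape)
```

Checking `torch.isfinite` on the parameters before training is a cheap way to confirm that no uninitialized tensor is left in the model.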
@ntalabot thanks a lot!
When I tried to train the model in this repo, the loss value was NaN. Is there something wrong in my config file?
Here is my training config:
Thanks a lot in advance.