Hi,
If I understand correctly, ConvAttnPool in models.py is the model that corresponds to the CAML architecture proposed in the paper. Looking at its forward function, I see that the last operation before the loss is calculated is linear (multiplying by final.weight and adding final.bias):
y = self.final.weight.mul(m).sum(dim=2).add(self.final.bias)
but there's no sigmoid after that, unlike what the paper describes.
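To check that I'm reading that line correctly, here is a minimal sketch with made-up shapes. The sizes and the names `final`, `m`, and `target` are placeholders of mine, not taken from the repo. It checks that the line is a per-label dot product plus bias, and that a loss which takes logits (such as `F.binary_cross_entropy_with_logits`) gives the same value as applying the sigmoid first. I haven't confirmed which loss the repo actually uses, so that part is only a guess:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
batch, num_labels, num_filters = 2, 5, 8

final = torch.nn.Linear(num_filters, num_labels)  # stands in for self.final
m = torch.randn(batch, num_labels, num_filters)   # one attended vector per label

# The line from forward(): one dot product per label, plus that label's bias.
y = final.weight.mul(m).sum(dim=2).add(final.bias)  # shape: (batch, num_labels)

# The same computation written label by label.
y_loop = torch.stack(
    [m[:, l, :] @ final.weight[l] + final.bias[l] for l in range(num_labels)],
    dim=1,
)

target = torch.randint(0, 2, (batch, num_labels)).float()

# ASSUMPTION: if the loss is a "with logits" variant, the sigmoid happens inside it.
loss_from_logits = F.binary_cross_entropy_with_logits(y, target)
loss_from_probs = F.binary_cross_entropy(torch.sigmoid(y), target)
```

If that guess holds, `y` would be the pre-sigmoid scores, and `torch.sigmoid(y)` would give the per-label probabilities.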
What did I miss?
Thanks :-) Mor