harvardnlp / annotated-transformer

An annotated implementation of the Transformer paper.
http://nlp.seas.harvard.edu/annotated-transformer
MIT License
5.7k stars 1.23k forks

from SimpleLossCompute.__call__ to LabelSmoothing.forward, why does the dim of x change? #43

Closed BUCTwangkun closed 2 years ago

BUCTwangkun commented 5 years ago

Thank you for a great piece.

I have a question about the `forward` method in `LabelSmoothing`. Stepping through it in PyCharm, execution goes from `SimpleLossCompute`'s `__call__` method:

```python
loss = self.criterion(x.contiguous().view(-1, x.size(-1)),  # [30, 9, 512] --> [270, 512]
                      y.contiguous().view(-1)) / norm       # [30, 9]      --> [270]
```

to the `forward` method in `LabelSmoothing`:

```python
def forward(self, x, target):
    # x is the output of the model, target is the label
    assert x.size(1) == self.size
    true_dist = x.data.clone()
    ...
```

Why does the dim of `x` change from [270, 512] to [270, 11]?
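For context, the shape change happens because `SimpleLossCompute.__call__` runs `x = self.generator(x)` before calling the criterion: the `Generator` is a linear projection from `d_model` (512) to the vocabulary size (11 in the toy copy task), followed by a log-softmax. A minimal sketch of that shape transformation (the dimensions here are taken from the debug output above; the generator shown is a stand-in for the model's, not the exact class from the repo):

```python
import torch
import torch.nn as nn

# Shapes from the issue: batch=30, seq_len=9, d_model=512, vocab=11 (toy copy task)
batch, seq_len, d_model, vocab = 30, 9, 512, 11

x = torch.randn(batch, seq_len, d_model)  # decoder output, [30, 9, 512]

# Stand-in for the model's Generator: project hidden states to vocab logits,
# then take log-probabilities over the vocabulary dimension.
generator = nn.Sequential(
    nn.Linear(d_model, vocab),
    nn.LogSoftmax(dim=-1),
)

x = generator(x)                            # [30, 9, 512] --> [30, 9, 11]
flat = x.contiguous().view(-1, x.size(-1))  # [30, 9, 11]  --> [270, 11]
print(flat.shape)                           # torch.Size([270, 11])
```

So by the time `LabelSmoothing.forward` receives `x`, its last dimension is already the vocabulary size, which is why the assertion `x.size(1) == self.size` compares against the vocab size rather than `d_model`.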

hellokevin96 commented 4 years ago

Hello, I have the same problem. Did you figure this out?