codertimo / BERT-pytorch

Google AI 2018 BERT pytorch implementation
Apache License 2.0

Tensor transform question in pretrain.py #35

Closed wlpdlut closed 5 years ago

wlpdlut commented 5 years ago

There is a line like below in pretrain.py,

mask_loss = self.criterion(mask_lm_output.transpose(1, 2), data["bert_label"])

I ran it and found that `mask_lm_output` has shape `(batch_size, input_length, vocab_size)`, while `data["bert_label"]` has shape `(batch_size, input_length)`. Why is the transpose above needed? I am confused.

codertimo commented 5 years ago

@wlpdlut It's because I used NLLLoss as the criterion. NLLLoss expects its input shaped as `(batch_size, n_classes, seq_len)`, so the transpose moves the vocabulary dimension into position 1. Please check the docs at the link below. https://pytorch.org/docs/stable/nn.html?highlight=nll#torch.nn.NLLLoss
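A minimal sketch of the shape convention (the tensors here are random stand-ins for `mask_lm_output` and `data["bert_label"]`, and the dimensions are made up for illustration):

```python
import torch
import torch.nn as nn

batch_size, seq_len, vocab_size = 2, 5, 11

# Stand-in for mask_lm_output: log-probabilities over the vocabulary,
# shape (batch_size, seq_len, vocab_size)
mask_lm_output = torch.randn(batch_size, seq_len, vocab_size).log_softmax(dim=-1)

# Stand-in for data["bert_label"]: target token ids, shape (batch_size, seq_len);
# ids drawn from 1..vocab_size-1 so none hit the ignore_index below
bert_label = torch.randint(1, vocab_size, (batch_size, seq_len))

# NLLLoss wants input (batch_size, n_classes, seq_len) and target (batch_size, seq_len),
# so the vocab dimension must be moved to position 1 via transpose(1, 2)
criterion = nn.NLLLoss(ignore_index=0)
mask_loss = criterion(mask_lm_output.transpose(1, 2), bert_label)
print(mask_loss)  # a scalar tensor
```

Without the transpose, NLLLoss would try to read `seq_len` as the class dimension and raise a shape error.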

Please let me know if you have any other questions, thanks 👍

wlpdlut commented 5 years ago

Thank you so much, I understand it now.