Closed utkd closed 4 years ago
I'm facing the same problem. Since #4874 it seems like it should be just `labels` instead of `lm_labels`. According to the documentation it should compute a masked-language-modeling loss, but from my debugging it seems like it actually computes a next-word-prediction loss.
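As a minimal sketch of the rename described above (the keyword names follow this thread, not any specific transformers version), migrating a deprecated `lm_labels` keyword to `labels` before calling the model could look like this:

```python
# Hypothetical helper: rename the deprecated `lm_labels` keyword to
# `labels` before calling a model's forward(). This is an illustration
# of the rename discussed in #4874, not actual transformers code.
def migrate_kwargs(kwargs):
    """Return a copy of kwargs with `lm_labels` renamed to `labels`."""
    kwargs = dict(kwargs)  # avoid mutating the caller's dict
    if "lm_labels" in kwargs and "labels" not in kwargs:
        kwargs["labels"] = kwargs.pop("lm_labels")
    return kwargs
```

With this helper, `migrate_kwargs({"input_ids": ids, "lm_labels": tgt})` yields `{"input_ids": ids, "labels": tgt}`, which matches the newer keyword name.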
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
The `BertModel.forward()` method does not accept `lm_labels` and `masked_lm_labels` arguments. Yet, it looks like the `EncoderDecoderModel.forward()` method calls its decoder's `forward()` method with those arguments, which throws a TypeError when a BertModel is used as the decoder. Am I using the BertModel incorrectly? I can get rid of the error by modifying the EncoderDecoderModel to not pass those arguments to the decoder.
Exact Error:
Relevant part of the code:

```python
...
dec_out, dec_cls, enc_out, enc_cls = model(
    input_ids=inputs,
    attention_mask=input_masks,
    decoder_input_ids=targets,
    decoder_attention_mask=target_masks,
)
```
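A generic sketch of the workaround idea mentioned above (filtering out keywords the decoder's `forward()` does not accept, instead of editing `EncoderDecoderModel` itself). The `decoder_forward` function here is a toy stand-in for something like `BertModel.forward`, not the real API:

```python
import inspect

def filter_kwargs(func, kwargs):
    """Keep only the keyword arguments that appear in func's signature."""
    accepted = set(inspect.signature(func).parameters)
    return {k: v for k, v in kwargs.items() if k in accepted}

# Toy stand-in for a decoder forward() that does not take `lm_labels`.
def decoder_forward(input_ids=None, attention_mask=None):
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# `lm_labels` is silently dropped instead of raising a TypeError.
kwargs = {"input_ids": [1], "attention_mask": [1], "lm_labels": [2]}
out = decoder_forward(**filter_kwargs(decoder_forward, kwargs))
```

This mirrors what removing the extra arguments by hand achieves, but does it based on the callee's actual signature.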