Closed: MorenoLaQuatra closed this issue 2 years ago
Environment info

- `transformers` version: 4.18.0

Who can help

@patil-suraj

Information

Model I am using (Bert, XLNet ...): Marian

The problem arises when using:

The tasks I am working on is:

To reproduce

Steps to reproduce the behavior:
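The concrete steps did not survive in this copy of the report, so here is a hedged reconstruction. The traceback arithmetic (8,320,768 logit elements against a stored vocab size of 65,001 implies a last dimension of 65,006, i.e. five extra tokens) suggests the vocabulary was extended and the embeddings resized before training. A minimal sketch along those lines, with a hypothetical checkpoint name and hypothetical added tokens:

```python
from transformers import MarianMTModel, MarianTokenizer

# Hypothetical checkpoint; the report does not name the model used.
model_name = "Helsinki-NLP/opus-mt-en-it"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# Hypothetical step implied by the shape mismatch: add a few tokens and resize.
tokenizer.add_tokens(["<extra_0>", "<extra_1>", "<extra_2>", "<extra_3>", "<extra_4>"])
model.resize_token_embeddings(len(tokenizer))

inputs = tokenizer(["Hello world"], return_tensors="pt")
with tokenizer.as_target_tokenizer():
    labels = tokenizer(["Ciao mondo"], return_tensors="pt").input_ids

# Under transformers==4.18.0 a forward pass with labels hits the broken
# lm_logits.view(-1, self.target_vocab_size) reshape; 4.17.0 returns a loss.
outputs = model(**inputs, labels=labels)
print(outputs.loss)
```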
Traceback below:

```
/usr/local/lib/python3.7/dist-packages/transformers/models/marian/modeling_marian.py in forward(self, input_ids, attention_mask, decoder_input_ids, decoder_attention_mask, head_mask, decoder_head_mask, cross_attn_head_mask, encoder_outputs, past_key_values, inputs_embeds, decoder_inputs_embeds, labels, use_cache, output_attentions, output_hidden_states, return_dict)
   1452         if labels is not None:
   1453             loss_fct = CrossEntropyLoss()
-> 1454             masked_lm_loss = loss_fct(lm_logits.view(-1, self.target_vocab_size), labels.view(-1))
   1455
   1456         if not return_dict:

RuntimeError: shape '[-1, 65001]' is invalid for input of size 8320768
```
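The reshape cannot succeed because 8,320,768 is not divisible by 65,001: it factors as 128 × 65,006, so the logits carry 65,006 columns while the model reshapes against a stale vocab size of 65,001. A standalone illustration of just that tensor error:

```python
import torch

# 128 rows of 65006 columns: 8,320,768 elements in total.
lm_logits = torch.zeros(128, 65006)
print(lm_logits.numel())  # 8320768

# Reshaping against the stale size fails, since 8320768 % 65001 != 0.
lm_logits.view(-1, 65001)  # RuntimeError: shape '[-1, 65001]' is invalid for input of size 8320768
```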
Expected behavior

Standard Marian training output. No issue with `transformers==4.17.0`.
Good catch! Fix is here #16700
Thank you!
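For anyone who hits this before the fix lands in a release: the report notes the failure is absent in the previous version, so pinning `transformers==4.17.0` is a reasonable stopgap.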