huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Bug in Marian model (or tokenizer) in transformers==4.18.0 #16670

Closed: MorenoLaQuatra closed this issue 2 years ago

MorenoLaQuatra commented 2 years ago

Environment info

- transformers version: 4.18.0

Who can help

@patil-suraj

Information

Model I am using (Bert, XLNet ...): Marian

The problem arises when using: my own scripts (see the Colab notebook linked below).

The task I am working on is: my own task/dataset (training Marian with an extended tokenizer).

To reproduce

Steps to reproduce the behavior:

  1. Extend the tokenizer using a target-language one.
  2. Add the new tokens (the model embeddings are resized accordingly).
  3. Run a forward pass with the model in training mode. A minimal sketch of these steps follows this list.
  4. The full script and error are reported in this Colab notebook: https://colab.research.google.com/drive/1utS-L1iO1paiwKKPNqVHW5ARvprfRgG2?usp=sharing
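
A minimal sketch of steps 1–3, assuming a Helsinki-NLP/opus-mt-en-it checkpoint and a couple of illustrative added tokens (both the checkpoint and the token names are assumptions, not taken from the notebook):

```python
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-it"  # assumed checkpoint
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# Steps 1-2: extend the tokenizer with new tokens and resize the embeddings.
tokenizer.add_tokens(["<special_1>", "<special_2>"])  # hypothetical tokens
model.resize_token_embeddings(len(tokenizer))

# Step 3: a forward pass in training mode with labels triggers the loss
# computation that reshapes lm_logits with the (stale) target vocab size.
model.train()
inputs = tokenizer(["hello world"], return_tensors="pt")
with tokenizer.as_target_tokenizer():
    labels = tokenizer(["ciao mondo"], return_tensors="pt").input_ids
outputs = model(**inputs, labels=labels)  # RuntimeError on transformers==4.18.0
```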

Traceback below:

/usr/local/lib/python3.7/dist-packages/transformers/models/marian/modeling_marian.py in forward(self, input_ids, attention_mask, decoder_input_ids, decoder_attention_mask, head_mask, decoder_head_mask, cross_attn_head_mask, encoder_outputs, past_key_values, inputs_embeds, decoder_inputs_embeds, labels, use_cache, output_attentions, output_hidden_states, return_dict)
   1452         if labels is not None:
   1453             loss_fct = CrossEntropyLoss()
-> 1454             masked_lm_loss = loss_fct(lm_logits.view(-1, self.target_vocab_size), labels.view(-1))
   1455 
   1456         if not return_dict:

RuntimeError: shape '[-1, 65001]' is invalid for input of size 8320768
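
For reference, 8320768 = 128 × 65006: the logits' last dimension appears to be 65006 (the original 65001 entries plus five added tokens, inferred from the error rather than stated above), while the reshape still uses 65001. A minimal sketch of the failing reshape, with these inferred numbers:

```python
import torch

# Inferred from the error: 8320768 = 128 * 65006, so the embeddings were
# resized to 65006 entries while the loss still reshapes with 65001.
lm_logits = torch.zeros(8, 16, 65006)  # batch=8, seq_len=16 are illustrative
lm_logits.view(-1, 65001)  # RuntimeError: shape '[-1, 65001]' is invalid
                           # for input of size 8320768
```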

Expected behavior

Standard Marian training output: the loss is computed without error. The same script runs fine with transformers==4.17.0.

patil-suraj commented 2 years ago

Good catch! The fix is here: #16700

MorenoLaQuatra commented 2 years ago

Thank you!