aclifton314 closed this issue 3 years ago
When the model receives inputs that include labels, it is supposed to produce a tuple of (loss, predictions), where the loss is a scalar. The Trainer then uses the loss to compute the gradients. In this case (or at least in my case, when I get a similar error) the Trainer appears to be trying to use the predictions rather than the loss to compute the gradient. This seems to happen because the model is not receiving "labels" as input, so it produces a one-element tuple of (predictions,) instead. You should be able to fix it by passing a value for "labels" in your collator; see, for example, transformers.DataCollatorForLanguageModeling.
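To make the failure mode concrete, here is a minimal pure-Python stand-in (ToyModel is hypothetical, not a real Hugging Face class) illustrating the output convention described above: with labels, forward() returns (loss, predictions); without labels, it returns a one-element tuple, so trainer code that blindly treats outputs[0] as a scalar loss breaks.

```python
class ToyModel:
    """Hypothetical stand-in mimicking the HF output convention."""

    def forward(self, input_ids, labels=None):
        predictions = [float(t) for t in input_ids]  # dummy "logits"
        if labels is None:
            # No labels received -> no loss, a one-element tuple only.
            return (predictions,)
        # Dummy scalar "loss" (mean squared error against the labels).
        loss = sum((p - l) ** 2 for p, l in zip(predictions, labels)) / len(labels)
        return (loss, predictions)

model = ToyModel()
with_labels = model.forward([1, 2, 3], labels=[1, 2, 4])
without_labels = model.forward([1, 2, 3])

print(len(with_labels), len(without_labels))  # 2 1
print(isinstance(with_labels[0], float))      # True: outputs[0] is the scalar loss
```

If the collator never supplies "labels", the second call is what the Trainer effectively sees, and taking outputs[0] hands it the predictions instead of a loss.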
For me, the same error occurred because the model I chose does not return a loss even though I pass labels. It's worth checking the documentation of the model you are using to see whether its forward() returns a loss. This is a snapshot of what BertModel.forward() (the model I chose first) returns, which does not include any loss value. And this is a snapshot of what BertLMHeadModel.forward() (the model I chose later) returns, which does include a loss value.
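One quick way to run that documentation check in code, without loading any weights, is to inspect the model class's forward signature for a labels parameter. A sketch using inspect; the two classes here are hypothetical stand-ins, but with transformers installed you would inspect e.g. BertModel.forward and BertLMHeadModel.forward the same way:

```python
import inspect

# Hypothetical stand-ins for an encoder-only model class and a class
# with an LM head; substitute the real transformers classes in practice.
class EncoderOnly:
    def forward(self, input_ids, attention_mask=None):
        ...

class WithLMHead:
    def forward(self, input_ids, attention_mask=None, labels=None):
        ...

def accepts_labels(model_cls):
    """True if the class's forward() takes a `labels` parameter,
    which is a strong hint that it can compute and return a loss."""
    return "labels" in inspect.signature(model_cls.forward).parameters

print(accepts_labels(EncoderOnly))  # False
print(accepts_labels(WithLMHead))   # True
```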
@ameasure @MojammelHossain Thank you both for your feedback! Checking the GPT2 documentation showed me an example of what I could set the labels value to in my collator.
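For a causal LM like GPT-2, the key idea behind what a collator (or transformers.DataCollatorForLanguageModeling with mlm=False) supplies is that labels are simply a copy of input_ids, with padding positions replaced by -100 so the loss ignores them. A sketch of that idea using plain lists; the real collator returns tensors, and PAD_ID = 0 is an assumption here since GPT-2 has no dedicated pad token:

```python
PAD_ID = 0
IGNORE_INDEX = -100  # the default ignore_index of PyTorch's CrossEntropyLoss

def collate(batch):
    """Pad a batch of token-id lists and build labels from input_ids,
    masking out the padding positions with IGNORE_INDEX."""
    max_len = max(len(ids) for ids in batch)
    input_ids, labels = [], []
    for ids in batch:
        padded = ids + [PAD_ID] * (max_len - len(ids))
        input_ids.append(padded)
        labels.append([t if t != PAD_ID else IGNORE_INDEX for t in padded])
    return {"input_ids": input_ids, "labels": labels}

features = collate([[5, 6, 7], [8, 9]])
print(features["input_ids"])  # [[5, 6, 7], [8, 9, 0]]
print(features["labels"])     # [[5, 6, 7], [8, 9, -100]]
```

Because "labels" is now present in the features the model receives, forward() computes and returns the loss, and the Trainer gets a scalar to backpropagate.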
System Info
Pop!_OS 20.04
PyTorch: 1.5.1
Transformers: 3.0.2
Tokenizers: 0.8.1rc1
Python: 3.7.6
Pretrained Model: GPT2
Pretrained Tokenizer: GPT2
Question
I'm getting the following error and I'm not sure how to resolve it:
Here's some sample code: