Open · ErikVogelLUH opened this issue 2 weeks ago
Hey @ErikVogelLUH! It's a bit hard to help you without knowing the code that failed. We know this model works well with our scripts (such as this one: https://github.com/huggingface/transformers/tree/main/examples/pytorch/text-classification)
Did you start from one of our scripts or from something else?
System Info
`transformers` version: 4.41.2
Platform: Windows 10
Who can help?
@ArthurZucker
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
I wanted to train a RoBERTa model for classification. However, when computing the loss for multi-label classification, the logit and label dimensions are mismatched. I traced the problem to transformers/models/roberta/modeling_roberta.py:1229:
```python
loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
```
The labels are flattened to a 1-D tensor, while the logits are shaped (batch_size, num_labels). Slightly changing the line fixes the problem:
```python
loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1, self.num_labels))
```
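For what it's worth, here is a minimal sketch of the mismatch outside the model, assuming multi-hot float labels of shape (batch_size, num_labels) and the same CrossEntropyLoss call as in modeling_roberta.py (the tensor values and sizes are made up for illustration):

```python
import torch
from torch.nn import CrossEntropyLoss

batch_size, num_labels = 4, 3
logits = torch.randn(batch_size, num_labels)                     # model output
labels = torch.randint(0, 2, (batch_size, num_labels)).float()   # multi-hot targets

loss_fct = CrossEntropyLoss()

# Current line: labels.view(-1) flattens the targets to shape
# (batch_size * num_labels,) while the logits stay (batch_size, num_labels),
# so the loss call fails with a shape/dtype mismatch.
try:
    loss_fct(logits.view(-1, num_labels), labels.view(-1))
except (ValueError, RuntimeError) as err:
    print(f"mismatch: {err}")

# Proposed change: keep both tensors at (batch_size, num_labels). Since
# PyTorch 1.10, CrossEntropyLoss also accepts float targets of that shape
# (interpreted as class probabilities), so this call succeeds.
loss = loss_fct(logits.view(-1, num_labels), labels.view(-1, num_labels))
print(loss)
```

Note that this sketch only demonstrates the shape behavior; whether CrossEntropyLoss is the right criterion for multi-hot targets here, as opposed to BCEWithLogitsLoss, is a separate question.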
Expected behavior
Compute the loss without a dimension mismatch.