huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

HubertForSequenceClassification does not handle regression tasks correctly; always uses CrossEntropyLoss #33500

Closed raopx closed 1 month ago

raopx commented 1 month ago

System Info

Who can help?

@ylacombe @eustlb

Reproduction

import torch
from transformers import AutoConfig, HubertForSequenceClassification

model_id = "ntu-spml/distilhubert"

# Configure the model for regression
config = AutoConfig.from_pretrained(model_id)
config.problem_type = "regression"
config.num_labels = 1

# Load the model with the configuration
model = HubertForSequenceClassification.from_pretrained(model_id, config=config)

# Prepare a sample input
batch = {
    "input_values": torch.randn(1, 16000),  # Example input tensor
    "labels": torch.tensor([120.0]),        # Example label (float for regression)
}

# Forward pass with labels; this is where the loss is computed
outputs = model(input_values=batch["input_values"], labels=batch["labels"])

Analysis:

After investigating the issue, I found that the HubertForSequenceClassification class ignores config.problem_type in its forward method. It always uses CrossEntropyLoss, regardless of whether the task is classification or regression.

Here is the relevant code from transformers/models/hubert/modeling_hubert.py:

https://github.com/huggingface/transformers/blob/8bd2b1e8c23234cd607ca8d63f53c1edfea27462/src/transformers/models/hubert/modeling_hubert.py#L1633C9-L1637C1

# Inside HubertForSequenceClassification.forward()
if labels is not None:
    loss_fct = CrossEntropyLoss()
    loss = loss_fct(logits.view(-1, self.config.num_labels), labels.view(-1))
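
To make the mismatch concrete, here is a minimal sketch that uses plain PyTorch only, with no Hubert model involved. The tensors are stand-ins chosen for illustration: `logits` mimics a single-output head and `labels` mimics the float regression target from the reproduction. On the PyTorch versions I am aware of, CrossEntropyLoss rejects this combination, while MSELoss accepts it.

```python
import torch
from torch.nn import CrossEntropyLoss, MSELoss

num_labels = 1                      # regression head: one output per example
logits = torch.zeros(1, num_labels)  # stand-in for the model's output
labels = torch.tensor([120.0])       # float regression target

# Same call shape as the Hubert snippet above: class-index cross entropy
# is given a float target, which it is not designed to accept.
try:
    CrossEntropyLoss()(logits.view(-1, num_labels), labels.view(-1))
    ce_error = None
except (RuntimeError, IndexError) as exc:
    ce_error = exc

# The regression loss that problem_type="regression" implies.
mse_loss = MSELoss()(logits.view(-1), labels.view(-1))

print("CrossEntropyLoss error:", ce_error)
print("MSELoss value:", mse_loss.item())
```

Casting the label to an integer would not help either, since a class index of 120 is out of range for a head with a single output.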

Expected behavior

In other model implementations like BertForSequenceClassification, the forward method correctly handles different problem_type settings:

# Inside BertForSequenceClassification.forward()
if labels is not None:
    if self.config.problem_type == "regression":
        loss_fct = MSELoss()
        loss = loss_fct(logits.view(-1), labels.view(-1))
    elif self.config.problem_type == "single_label_classification":
        loss_fct = CrossEntropyLoss()
        loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))
    # ...
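
For reference, the dispatch described above can be sketched as a standalone function. This is an illustration only: `sequence_classification_loss` is a hypothetical helper, not a transformers API and not the actual fix. The multi-label branch is an assumption based on the usual problem_type values, not something shown in the excerpt above.

```python
import torch
from torch.nn import BCEWithLogitsLoss, CrossEntropyLoss, MSELoss


def sequence_classification_loss(logits, labels, num_labels, problem_type):
    """Hypothetical helper: pick a loss based on problem_type."""
    if problem_type == "regression":
        # Continuous targets: compare flattened predictions and labels.
        return MSELoss()(logits.view(-1), labels.view(-1))
    if problem_type == "single_label_classification":
        # Integer class indices, one per example.
        return CrossEntropyLoss()(logits.view(-1, num_labels), labels.view(-1))
    if problem_type == "multi_label_classification":
        # Assumed branch: float multi-hot targets with the same shape as logits.
        return BCEWithLogitsLoss()(logits, labels)
    raise ValueError(f"Unsupported problem_type: {problem_type!r}")


# Regression: one output, float label.
reg_loss = sequence_classification_loss(
    torch.tensor([[2.0]]), torch.tensor([5.0]), 1, "regression"
)

# Single-label classification: three classes, integer labels.
cls_loss = sequence_classification_loss(
    torch.zeros(2, 3), torch.tensor([0, 2]), 3, "single_label_classification"
)
```

With these inputs the regression loss is the squared error of a single prediction, and the classification loss is the cross entropy of a uniform prediction over three classes.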

aroun-coumar commented 1 month ago

Hey @raopx, I'll take up this issue. I'll first test and verify it locally, then create a PR if needed. Thanks!

ylacombe commented 1 month ago

Thanks for opening this issue @raopx! @aroun-coumar, let us know how it goes and if you need help!

aroun-coumar commented 1 month ago

Sure @ylacombe, I just started and I'll let you know. Thanks!

aroun-coumar commented 1 month ago

Please check out this PR: #33551