huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Inconsistency with transformer pipeline results #29960

Closed abhijeetk597 closed 6 months ago

abhijeetk597 commented 7 months ago


Reproduction

from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification

MODEL = f"cardiffnlp/twitter-roberta-base-sentiment"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

sent_pipeline = pipeline("sentiment-analysis",model=model, tokenizer=tokenizer)

import random
random_text = random.choice(df["Text"])

print(random_text)
sent_pipeline("random_text")

Screenshot 2024-03-30 052904

I tried the same thing with a different model but got the same kind of result.

Screenshot 2024-03-30 053110

Kaggle Notebook Link

Expected behavior

Sentiment scores should vary with the text given.

vasqu commented 7 months ago

You are passing the string literal "random_text", not the contents of the variable random_text, so the pipeline scores the exact same input on every call. That's why you get the same score.

Try passing the variable's content instead, e.g. sent_pipeline(f"{random_text}"). In full, something like:

from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification

MODEL = f"cardiffnlp/twitter-roberta-base-sentiment"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

sent_pipeline = pipeline("sentiment-analysis",model=model, tokenizer=tokenizer)

import random
random_text = random.choice(df["Text"])

print(random_text)
# here is the change
sent_pipeline(f"{random_text}")
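
The literal-versus-variable mix-up can be reproduced without loading a model. The snippet below is only a minimal sketch: toy_pipeline is a hypothetical stand-in (not part of transformers) that reports the text it received and its length, and the sample texts are made up for illustration.

```python
def toy_pipeline(text: str) -> dict:
    """Hypothetical stand-in for sent_pipeline.

    It does no sentiment analysis; it only reports what it was given,
    which is enough to show which input actually reaches the callable.
    """
    return {"input": text, "length": len(text)}


# Made-up sample texts standing in for df["Text"].
texts = ["I love this product", "Terrible, would not buy again", "It was okay"]

# Bug: the quotes make this a string literal, so every call sees "random_text".
literal_results = [toy_pipeline("random_text") for random_text in texts]

# Fix: pass the variable itself, so every call sees a different text.
variable_results = [toy_pipeline(random_text) for random_text in texts]

for bad, good in zip(literal_results, variable_results):
    print(f"literal -> {bad['input']!r:15} variable -> {good['input']!r}")
```

With the real pipeline, passing the variable directly as sent_pipeline(random_text) should behave the same as the f-string form as long as random_text is already a str; the f-string only adds a conversion to str.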

github-actions[bot] commented 6 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.