eric-mitchell / direct-preference-optimization

Reference implementation for DPO (Direct Preference Optimization)
Apache License 2.0

Questions about the IMDB Sentiment dataset #45

Open stevie1023 opened 11 months ago

stevie1023 commented 11 months ago

Hi there, I was reading the section of your paper on the sentiment task: "... In order to perform a controlled evaluation, for this experiment we generate preference pairs over generations using a pre-trained sentiment classifier." I was wondering exactly how you generate the preference pairs. Do you generate additional reviews from prompts (for example, the first several words of a review) with another LM, or do you do something else?

Thanks for your reply in advance~

insublee commented 10 months ago

how about using this? https://huggingface.co/datasets/insub/imdb_prefix20_forDPO_gpt2-large-imdb-FT_siebert_sentiment-roberta-large-english
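For readers who want to build such pairs themselves, here is a minimal sketch of one plausible recipe, not a confirmed description of how the paper or the linked dataset was produced. It assumes you have already sampled several completions for a prompt and scored each one with a sentiment classifier. The function name `build_preference_pairs`, the `min_margin` parameter, and the toy scores are all hypothetical.

```python
from itertools import combinations


def build_preference_pairs(prompt, completions, positive_scores, min_margin=0.0):
    """Turn scored completions of one prompt into (chosen, rejected) pairs.

    positive_scores[i] is assumed to be the classifier's probability that
    completions[i] is positive. Pairs whose score gap is not larger than
    min_margin are skipped.
    """
    pairs = []
    for i, j in combinations(range(len(completions)), 2):
        margin = positive_scores[i] - positive_scores[j]
        if abs(margin) <= min_margin:
            continue  # tie or near-tie: no clear preference
        better, worse = (i, j) if margin > 0 else (j, i)
        pairs.append({
            "prompt": prompt,
            "chosen": completions[better],
            "rejected": completions[worse],
        })
    return pairs


# Toy example with made-up scores (hypothetical, for illustration only).
example = build_preference_pairs(
    "This movie was",
    [" wonderful.", " a waste of time.", " fine, I guess."],
    [0.98, 0.03, 0.60],
)
for pair in example:
    print(pair["chosen"], ">", pair["rejected"])
```

Raising `min_margin` drops pairs whose two completions the classifier scores almost identically, which is one way to control how noisy the resulting preferences are.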

stevie1023 commented 10 months ago

> how about using this? https://huggingface.co/datasets/insub/imdb_prefix20_forDPO_gpt2-large-imdb-FT_siebert_sentiment-roberta-large-english

This really helps! Great thanks~ :)

Vance0124 commented 8 months ago

> how about using this? https://huggingface.co/datasets/insub/imdb_prefix20_forDPO_gpt2-large-imdb-FT_siebert_sentiment-roberta-large-english

I used "siebert/sentiment-roberta-large-english" to score the chosen and rejected responses in this dataset, and most pairs get nearly identical scores. Is this normal? Here are some examples. Code:

from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer

# cache_dir was not defined in the original snippet; None falls back to the default Hugging Face cache.
cache_dir = None

# Load the classifier and its tokenizer, then wrap them in a sentiment-analysis pipeline.
model = AutoModelForSequenceClassification.from_pretrained("siebert/sentiment-roberta-large-english", cache_dir=cache_dir)
tokenizer = AutoTokenizer.from_pretrained("siebert/sentiment-roberta-large-english", cache_dir=cache_dir)
sentiment_analysis = pipeline(task="sentiment-analysis", model=model, tokenizer=tokenizer, truncation=True, max_length=512)
print(sentiment_analysis([
                          "Terrible movie. Nuff Said.<br /><br />These Lines were a joke, they didn't make any sense, they just threw all the things, i mean, i loved this movie because it was so funny. <br /><br />I loved this movie. I mean, they were supposed to be the best actors in the movie. It was great and so funny and also amazing at that. All of the guys in this movie are really nice. <br /><br",
                          "Terrible movie. Nuff Said.<br /><br />These Lines were written when the scriptwriter was a total wimp. He forgot the meaning of the first line, so he added lines.<br /><br />I mean, when you can't spell the word 'pauline' correctly, and you can't even spell this movie's tagline, the movie's not even worth it. This was a terrible waste of good actors and a terribly written screenplay.<br /><"
                          ]))
print(sentiment_analysis([
                          "Sometimes a movie is so comprehensively awful it has a destructive effect on your sense of style. This is a movie that is just about terrible and has no redeeming qualities whatsoever. The only way I can even begin to explain this bad movie is to warn you not to rent this crappy and poorly edited film, because you will literally regret watching it. The acting is nothing less than terrible, the script is written by the most idiotic writer this side of Stephen Schwartz (a great movie writer, I",
                          "Sometimes a movie is so comprehensively awful it has a destructive effect on your social life. It can be hard to think of a better example to compare it to. It has some of the most ridiculous moments in movies such as this. It has no redeeming features and as a comedy it was a disaster. I could sit here for days bashing this movie in vain but I'm tired.<br /><br />I recommend you watch this at the first instance and try to see if it's your",
                          ]))
print(sentiment_analysis([
                          "I have to congratulate the genius who approved this one. Edward Furlong is great as the lead with a wonderful performance from Robert Downey Jr. The supporting cast is superb too. One of the greatest movie debuts I have ever seen, in my opinion.<br /><br />I have yet to see it, as I have a copy of this one with no DVD player, but after this, I'll be able to find it. I would recommend this for all ages. I think",
                          "I have to congratulate the genius who approved this one. Edward Furlong is an amazing film-maker and he is one of the most versatile. He knows that you can only express your ideas through your characters. He makes you believe in a good man, a tough man, and an over-protective father. The most disturbing thing to me was the part that ended with the character's death. The film has a wonderful ending, and it is highly suspenseful. Furlong is a",
                          ]))

The scores for the three pairs above, each printed as (chosen, rejected):

[{'label': 'NEGATIVE', 'score': 0.999480664730072}, {'label': 'NEGATIVE', 'score': 0.9995088577270508}]
[{'label': 'NEGATIVE', 'score': 0.9995110034942627}, {'label': 'NEGATIVE', 'score': 0.9995111227035522}]
[{'label': 'POSITIVE', 'score': 0.9989092350006104}, {'label': 'POSITIVE', 'score': 0.9989309906959534}]

Is this normal?
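One thing that may help when reading these numbers: the pipeline reports the confidence of the predicted label, so a NEGATIVE score of 0.9995 and a POSITIVE score of 0.9989 are not on the same scale. A minimal sketch of putting both responses on a single "probability of POSITIVE" scale is below. It assumes the classifier is binary with the labels POSITIVE and NEGATIVE seen in the output above; the helper name `positive_probability` is hypothetical, and the scores are copied from the three pairs in this comment.

```python
def positive_probability(result):
    """Map one pipeline result dict to P(POSITIVE), assuming two labels."""
    score = result["score"]
    return score if result["label"] == "POSITIVE" else 1.0 - score


# (chosen, rejected) results copied from the output above.
pairs = [
    ({"label": "NEGATIVE", "score": 0.999480664730072},
     {"label": "NEGATIVE", "score": 0.9995088577270508}),
    ({"label": "NEGATIVE", "score": 0.9995110034942627},
     {"label": "NEGATIVE", "score": 0.9995111227035522}),
    ({"label": "POSITIVE", "score": 0.9989092350006104},
     {"label": "POSITIVE", "score": 0.9989309906959534}),
]

# Margin > 0 means the chosen response looks more positive than the rejected one.
margins = [
    positive_probability(chosen) - positive_probability(rejected)
    for chosen, rejected in pairs
]
for margin in margins:
    print(f"{margin:+.2e}")
```

On these three examples every margin is smaller than 1e-4 in magnitude, which matches the observation that the classifier barely separates chosen from rejected here. Three pairs are too few to say anything about the dataset as a whole; running the same comparison over a larger sample would give a clearer picture.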