fxnnxc / probe_lm


Find dataset for probing #5

stitsyuk opened this issue 1 year ago

stitsyuk commented 1 year ago

Possible datasets for the probing task (minimal loading and probing sketches follow the list):

  1. GLUE (MNLI section) - Natural Language Inference task.
     Labels: contradiction, neutral, entailment
     Example: Premise: How do you know? All this is their information again. Hypothesis: This information belongs to them. Answer: entailment
     Dataset: https://huggingface.co/datasets/glue/viewer/mnli/train

  2. SNLI (Stanford Natural Language Inference) - same design as GLUE MNLI.
     Labels: contradiction, neutral, entailment
     Example: Premise: Two blond women are hugging one another. Hypothesis: The women are sleeping. Answer: contradiction
     Dataset: https://huggingface.co/datasets/snli

  3. HANS (Heuristic Analysis for NLI Systems) - NLI evaluation dataset that tests specific hypotheses about invalid heuristics that NLI models are likely to learn. This is an interesting dataset for probing since we can compare against the paper that introduced it (https://arxiv.org/pdf/1902.01007.pdf): the authors fine-tuned BERT for this task, and we can check whether probing internal representations handles these invalid heuristics better.
     Labels: entailment, non-entailment
     Example: The doctors supported the scientist. The scientist supported the doctors. Answer: non-entailment
     Drawbacks: Since the task was designed to demonstrate that models tend to follow shallow heuristics and shortcuts, probing may also struggle and give poor results.
     Dataset: https://github.com/tommccoy1/hans

  4. IMDb Movie Reviews - binary sentiment analysis dataset consisting of 50,000 reviews from the Internet Movie Database (IMDb) labeled as positive or negative. In my opinion, probing can capture the positive and negative concepts well from reviews because of words that are important for the classes (e.g. "bad, awful, boring, sad" for negative; "good, wonderful, interesting, incredible" for positive).
     Labels: positive, negative
     Example: Ned Kelly is such an important story to Australians but this movie is awful. It's an Australian story yet it seems like it was set in America. Also Ned was an Australian yet he has an Irish accent... it is the worst film I have seen in a long time. Answer: negative
     Drawbacks: Some reviews are quite long, about 500 words. If our model can handle long input representations, then this dataset can be good for probing.
     Dataset: https://huggingface.co/datasets/imdb

  5. Hate Speech Offensive - dataset for hate speech and offensive language detection on tweets. I think probing can show good results on this data since offensive speech consists of "bad words", which are rare in common speech and in most datasets; therefore these "bad" concepts should be captured well in internal representations.
     Labels: offensive language, neither, hate speech
     Example: I spend my money how i want bitch its my business Answer: offensive language
     Drawbacks: Since the dataset is about hate speech, every example contains swear words, and it is always complicated to put bad words in a paper's experiments or examples.
     Dataset: https://huggingface.co/datasets/hate_speech_offensive

  6. GoEmotions - corpus of carefully curated Reddit comments, human-annotated with 27 emotion categories or neutral. It is interesting to check whether it is possible to probe emotion concepts.
     Labels: admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise, neutral
     Example: Man I love reddit. Answer: love
     Drawbacks: Many classes (27 emotions + neutral) may be hard to separate using internal representations.
     Dataset: https://huggingface.co/datasets/go_emotions

  7. Tweet Sentiment Extraction - dataset of tweets with sentiment classes for every sentence. Probing can capture the positive and negative concepts well because of important words.
     Labels: positive, neutral, negative
     Example: I really really like the song Love Story by Taylor Swift Answer: positive
     Dataset: https://huggingface.co/datasets/mteb/tweet_sentiment_extraction

  8. BoolQ - question answering dataset for yes/no questions. This dataset is close to the true/false SAPLMA dataset but has a slightly different design: each example has two sentences, a question and a passage, followed by the true/false label.
     Labels: true, false
     Example: Question: did abraham lincoln write the letter in saving private ryan? Passage: In the 1998 war film Saving Private Ryan, General George Marshall (played by Harve Presnell) reads the Bixby letter to his officers before giving the order to find and send home Private James Francis Ryan after Ryan's three brothers died in battle. Answer: true
     Drawbacks: Some passages are long. Also, since we already have many true/false datasets from SAPLMA, it may be better to introduce experiments on different tasks for more variety.
     Dataset: https://huggingface.co/datasets/boolq
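
For convenience, a minimal sketch (assuming the HuggingFace `datasets` library) that loads each candidate and prints its label schema. The Hub IDs follow the links above, except `hans`, which I assume is mirrored on the Hub alongside the GitHub release:

```python
# Sketch: load each candidate from the HuggingFace Hub and inspect its schema.
# Dataset IDs follow the links in the list above; the `hans` Hub ID is an
# assumption (the list links the GitHub release).
from datasets import load_dataset

candidates = {
    "glue/mnli":             ("glue", "mnli"),
    "snli":                  ("snli",),
    "hans":                  ("hans",),
    "imdb":                  ("imdb",),
    "hate_speech_offensive": ("hate_speech_offensive",),
    "go_emotions":           ("go_emotions",),
    "tweet_sentiment":       ("mteb/tweet_sentiment_extraction",),
    "boolq":                 ("boolq",),
}

for name, args in candidates.items():
    ds = load_dataset(*args, split="train")
    print(f"{name}: {len(ds)} examples")
    print(f"  features: {ds.features}")
```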
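
And a minimal probing sketch for any of the single-text candidates (IMDb shown here): a logistic-regression probe on frozen hidden states. The model name (`bert-base-uncased`), probed layer index, and 2,000-example subsample are illustrative choices, not fixed decisions:

```python
# Sketch: train a linear probe on frozen hidden states of one candidate (IMDb).
# Model name, probed layer, and subsample size are illustrative assumptions.
import numpy as np
import torch
from datasets import load_dataset
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

# IMDb's train split is sorted by label, so shuffle before subsampling.
ds = load_dataset("imdb", split="train").shuffle(seed=0).select(range(2000))

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

LAYER = 8  # which internal representation to probe; a hyperparameter

@torch.no_grad()
def embed(texts):
    """Mean-pool the chosen layer's hidden states over non-padding tokens."""
    batch = tokenizer(texts, truncation=True, max_length=256,
                      padding=True, return_tensors="pt")
    hidden = model(**batch).hidden_states[LAYER]          # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (B, T, 1)
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

X = np.concatenate([embed(ds["text"][i:i + 32]) for i in range(0, len(ds), 32)])
y = np.array(ds["label"])

probe = LogisticRegression(max_iter=1000).fit(X, y)  # the linear probe itself
print("train accuracy:", probe.score(X, y))
```

The same pipeline should carry over to the sentence-pair datasets (MNLI, SNLI, HANS, BoolQ) by passing the two sentences through the tokenizer's text-pair argument, e.g. `tokenizer(premises, hypotheses, ...)`.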

fxnnxc commented 1 year ago

I will find knowledge neuron papers related to these datasets.