digitalepidemiologylab / covid-twitter-bert

Pretrained BERT model for analysing COVID-19 Twitter data
MIT License

A few observations. #23

Open shayneoneill opened 2 years ago

shayneoneill commented 2 years ago

I had a run through with this model, classifying comments on a petition that had a lot of traction with anti-vaxxers. The results were pretty mixed.

Some observations. The model really needs an initial classifier trained to ask a more basic question: "Is this comment about COVID at all?" I found it gave negative or positive classifications to comments that really didn't apply (i.e. one stating something to the effect of "I haven't seen my family in a year because of border restrictions", which I, or the algorithm, really have no way of evaluating for truthfulness).
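For concreteness, the kind of two-stage setup I have in mind would look roughly like the sketch below. This is only a sketch: "facebook/bart-large-mnli" is one common off-the-shelf zero-shot model for the relevance filter, and "my-ct-bert-stance-model" is a placeholder for whatever CT-BERT fine-tune is actually being used.

```python
# Sketch: first ask "is this comment about COVID at all?", and only pass
# relevant comments to the COVID-specific classifier.
# Model names are assumptions: "facebook/bart-large-mnli" is a common
# zero-shot NLI model; "my-ct-bert-stance-model" is a placeholder.
from transformers import pipeline

relevance = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
stance = pipeline("text-classification", model="my-ct-bert-stance-model")

def classify(comment: str, threshold: float = 0.7):
    rel = relevance(comment, candidate_labels=["about COVID-19", "not about COVID-19"])
    # Skip comments the relevance filter considers off-topic or uncertain.
    if rel["labels"][0] != "about COVID-19" or rel["scores"][0] < threshold:
        return {"label": "not_covid_related", "score": rel["scores"][0]}
    return stance(comment)[0]

print(classify("I haven't seen my family in a year because of border restrictions"))
```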

This is to be expected, as the model is for classifying COVID statements and thus has no real frame of reference for dealing with statements that aren't actually about COVID per se.

What I DID notice is that, among those misclassified statements, it seems to be homing in on the hostility of the statements. As the petition is largely about dropping Australian border restrictions and vaccine mandates, it's to be expected that a large number of signers are anti-vax activists, who have a tendency to be somewhat aggressive. That has me wondering: is the model actually responding to the tone of 'voice' in the comments, producing strong "false" or "misleading" signals if the input text is aggressive in nature?

Edit: Oh, and for further context, the petition was one claiming to be from "West Australian doctors". A quick plugging of names into the register of medical practitioners revealed that the majority of signers are not actually medical practitioners (and worse, there's some evidence that at least some of those who are might have had their names entered onto the petition without consent, with the data coming from the registry of practitioners), or are practitioners from non-medical fields like chiropractic and other pseudoscience professions, so I'm not sure how that impacts the result. Perhaps the algorithm is picking up untruthfulness signals that I'm missing myself.

mar-muel commented 2 years ago

> responding to the tone of 'voice' in the comments, producing strong "false" or "misleading" signals if the input text is aggressive in nature?

A thorough error analysis is often very insightful! The problematic errors are the systematic ones, and it's important to reveal them and understand how they impact the summary statistics.

In general, your definition of "fake" might also overlap partially with "non-rational"/agitated/ALL CAPS comments, so it's worth conducting a similar analysis on your annotation set. Larger models usually require fewer samples to reach a decent accuracy level, so you might also be able to clean your annotation data a bit (as long as you're not introducing another bias). This usually has a positive impact on scores because your objective is clearer.
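As a quick check, you could cross-tabulate your labels against agitated surface features. A minimal sketch (the column names "text" and "label" are just assumptions about your annotation file):

```python
# Check whether "fake"/"misleading" labels co-occur with agitated surface
# features in the annotation set. Columns "text" and "label" are assumed.
import pandas as pd

df = pd.read_csv("annotations.csv")

def caps_ratio(text: str) -> float:
    letters = [c for c in text if c.isalpha()]
    return sum(c.isupper() for c in letters) / max(len(letters), 1)

df["caps_ratio"] = df["text"].apply(caps_ratio)
df["exclamations"] = df["text"].str.count("!")

# If one class shows a much higher average caps ratio or exclamation count,
# the labels may partly encode tone rather than veracity.
print(df.groupby("label")[["caps_ratio", "exclamations"]].mean())
```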

Just some thoughts - good luck with the analysis.

peregilk commented 2 years ago

Following up on Martin's comments here.

Firstly, COVID-Twitter-BERT is starting to get a bit old. It was trained at the beginning of the pandemic, so it still "thinks" that Malone is a basketball player and that alpha, delta and omicron are just letters in the Greek alphabet. In some cases the stance/sentiment of a sentence requires knowing the pandemic-era meaning of these words. To fix this, one would have to do some additional pretraining on more recent (unannotated) data. I'm not sure it would have a real impact in your case, just something you should think about.
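If you wanted to try that, continued masked-language-model pretraining with the Hugging Face Trainer would look roughly like this. A sketch only: the tweet file and training settings are placeholders, and the checkpoint name is the published CT-BERT v2 model on the Hugging Face hub.

```python
# Continued MLM pretraining on newer, unannotated tweets so the model sees
# terms like "omicron" in their pandemic sense. File name and training
# arguments are placeholders.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

name = "digitalepidemiologylab/covid-twitter-bert-v2"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

# One tweet per line in a plain-text file (assumed).
ds = load_dataset("text", data_files={"train": "recent_tweets.txt"})["train"]
ds = ds.map(lambda b: tokenizer(b["text"], truncation=True, max_length=96),
            batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments("ct-bert-continued", per_device_train_batch_size=16,
                           num_train_epochs=1, learning_rate=2e-5),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()
```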

Another comment is the possibility that the model is picking up the "tone of voice", as you describe it, during finetuning. Take a minute to think about the process of finetuning a classification task. Let's say the task is pro- vs. anti-vaccine. You do some annotation and put the "pro" comments in pile A and the "anti" comments in pile B. In real life, a lot of these categorisations are really hard; inter-rater reliability on tasks like this is typically below 0.8. Then you finetune your model on this. However, you are no longer finetuning on pro vs. anti vaccine: you are finetuning on recreating pile A and pile B. There are a lot of other ways of recreating these piles, for instance the use of specific words, or anger, or CAPS LOCK.
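One way to test how much of pile A vs. pile B can be recreated from surface cues alone is to fit a trivial baseline on features like caps ratio and punctuation and compare it against the fine-tuned model. A minimal sketch (again assuming "text" and "label" columns in your annotation file):

```python
# If a surface-feature baseline already separates the piles reasonably well,
# the labels are partly separable by tone rather than content.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("annotations.csv")  # assumed columns: text, label

features = pd.DataFrame({
    "caps_ratio": df["text"].apply(
        lambda t: sum(c.isupper() for c in t if c.isalpha()) / max(sum(c.isalpha() for c in t), 1)),
    "exclamations": df["text"].str.count("!"),
    "length": df["text"].str.len(),
})

scores = cross_val_score(LogisticRegression(max_iter=1000), features, df["label"], cv=5)
print("Surface-feature baseline accuracy:", scores.mean())
```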

There are ways of getting around this problem. One approach is to make the classification target-specific, where the wording of the labels gives the classifier a hint about what you are looking for. Another approach is not to train on the classification task at all, but instead treat it as a natural language inference (entailment) task. We have made an MNLI version of the model that can be used for that.
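Used through the zero-shot pipeline, that looks roughly like the sketch below; the model id is my assumption for the MNLI checkpoint mentioned above, and the labels and hypothesis template are only examples.

```python
# NLI/zero-shot approach: phrase the label as a hypothesis and let an
# entailment model score it, instead of training a pile-A/pile-B classifier.
# The model id below is an assumption for the MNLI version of CT-BERT.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="digitalepidemiologylab/covid-twitter-bert-v2-mnli")

result = classifier(
    "Vaccine mandates are destroying our freedom and must end now",
    candidate_labels=["pro vaccine", "anti vaccine", "not about vaccines"],
    hypothesis_template="This comment is {}.",
)
print(result["labels"][0], result["scores"][0])
```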

Best of luck with the competition!