wietsedv / bertje

BERTje is a Dutch pre-trained BERT model developed at the University of Groningen. (EMNLP Findings 2020) "What’s so special about BERT’s layers? A closer look at the NLP pipeline in monolingual and multilingual models"
https://aclanthology.org/2020.findings-emnlp.389/
Apache License 2.0

Using BERTje for sentiment classification #26

Closed jrnkng closed 2 years ago

jrnkng commented 2 years ago

Hi Wietse!

I am trying to classify given texts (usually about 100 words) as either positive or negative. How would I go about doing that with BERTje?

I tried the following, based off of the fill-mask example shared in the README and on Hugging Face.

from transformers import pipeline

model = pipeline("sentiment-analysis", model='GroNLP/bert-base-dutch-cased')
negative_dutch_text = 'Dat is heel vervelend om te horen! Ik ben ook heel boos hierover. Wat een rotzooi.'
model(negative_dutch_text)

For every sentence this outputs LABEL_0 with a score of around 0.55, whereas I would expect this example to score strongly negative. In what way are my expectations off? How would I go about using the model to classify text as positive or negative?

Thanks a lot!

wietsedv commented 2 years ago

Sorry for my late comment. To use pre-trained models like BERTje for specific tasks, you need to fine-tune the models or use already fine-tuned models. You could try using wietsedv/bert-base-dutch-cased-finetuned-sentiment instead of GroNLP/bert-base-dutch-cased. This model is fine-tuned on Dutch book reviews, so out-of-domain performance may still be a bit unreliable. I hope this helps!
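For reference, a minimal sketch of what that could look like with the fine-tuned checkpoint named above (this assumes the Hugging Face transformers pipeline API; the exact label names depend on the model's config):

```python
from transformers import pipeline

# Load the sentiment model fine-tuned on Dutch book reviews
# (checkpoint name taken from the comment above).
classifier = pipeline(
    "sentiment-analysis",
    model="wietsedv/bert-base-dutch-cased-finetuned-sentiment",
)

negative_dutch_text = (
    "Dat is heel vervelend om te horen! "
    "Ik ben ook heel boos hierover. Wat een rotzooi."
)

# Returns a list with one dict per input, each containing a
# predicted label and a confidence score.
result = classifier(negative_dutch_text)
print(result)
```

The original LABEL_0 output with a score near 0.5 came from an untrained classification head randomly initialized on top of the pre-trained encoder, which is why fine-tuning (or a fine-tuned checkpoint) is needed before the scores mean anything.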