bureaucratic-labs / dostoevsky

Sentiment analysis library for the Russian language
MIT License
311 stars, 33 forks

How to interpret the sentiment? #146

Open DanilZherebtsov opened 3 years ago

DanilZherebtsov commented 3 years ago

Questions:

  1. model.predict() takes an argument k. What does it configure?
  2. model.predict() returns the values 'speech' and 'skip'. What do they mean?
  3. model.predict() returns 'positive', 'negative' and 'neutral'. How do I get a unified sentiment value for the whole sentence? Sometimes positive is the greatest, sometimes negative, sometimes neutral. I would like to get a single sentiment score for the sentence in the range -1 to 1. Any advice?

dveselov commented 3 years ago

Hi,

  1. It is the number of classes to return. The maximum is k=5, which returns the confidence for all classes. Why 5? Because RuSentiment contains 5 classes :)
  2. You can read about them in the RuSentiment README.
  3. If you need binary sentiment classification, I think you can do something like result['positive'] - result['negative'], but I'm not sure.
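
A minimal sketch of the idea in point 3, assuming each prediction is a dict mapping a label to its confidence. The helper name sentiment_score is hypothetical and not part of dostoevsky, and whether this difference is a good polarity measure is untested here:

```python
def sentiment_score(result):
    """Collapse one prediction dict into a single score in [-1, 1].

    `result` is assumed to map labels ('positive', 'negative', ...)
    to confidences. Labels that are absent (for example when k < 5)
    are treated as 0.
    """
    positive = result.get('positive', 0.0)
    negative = result.get('negative', 0.0)
    score = positive - negative
    # Confidences can overshoot 1.0 by a tiny rounding margin,
    # so clamp the difference to the requested range.
    return max(-1.0, min(1.0, score))


# Made-up confidences, for illustration only.
example = {'positive': 0.75, 'negative': 0.25}
print(sentiment_score(example))
```

With k=5 every label is present in the result, so both terms of the difference are real confidences rather than the 0 default. Note that 'neutral', 'skip' and 'speech' are ignored here: a message that is mostly neutral ends up with a score near 0.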

Also take a look at Sergey Smetanin's BERT-based models for sentiment classification: https://github.com/sismetanin/sentiment-analysis-in-russian

DanilZherebtsov commented 3 years ago

Hi, thank you for the answer.

Some of the definitions in the RuSentiment README are not clear. For example, 'neutral class (unmarked for sentiment)': what does that mean? If this is something that can't be classified as either positive or negative, then how is it different from the '"skip" class for unclear cases'?