Svetuf / CSC_Sentiment_Russian_NER


Training of the first sentiment model #1

Closed: roddar92 closed this issue 3 years ago

roddar92 commented 3 years ago

Split the sample into train and test sets:

Algorithm:

Extension of word2vec:

  1. Train a TF-IDF model (bag-of-words) on the normalized corpus.
  2. TF = frequency of a word in a document; IDF = log(N / number of documents containing the word), where N is the total number of documents.
  3. Weight each word vector by the word's IDF: word2vec.get_vector(word) * idf(word)

Document = Tweet
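
A minimal sketch of the IDF weighting described above, in plain Python. Everything named here is an assumption for illustration: `compute_idf`, `weighted_vector`, and `tweet_vector` are hypothetical helpers, the `vectors` dict stands in for a trained word2vec model (in place of `word2vec.get_vector(word)`), and the toy tweets are made up. Averaging the weighted vectors into one vector per tweet is also an assumption; the steps above only specify the per-word weighting.

```python
import math
from collections import Counter


def compute_idf(documents):
    """IDF(word) = log(N / number of documents containing the word)."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        # Count each word once per document, however often it repeats.
        doc_freq.update(set(tokens))
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}


def weighted_vector(word, vectors, idf):
    """Scale a word's embedding by its IDF (step 3 above)."""
    return [x * idf[word] for x in vectors[word]]


def tweet_vector(tokens, vectors, idf):
    """Assumption: represent a tweet as the mean of its weighted word vectors."""
    known = [t for t in tokens if t in vectors and t in idf]
    if not known:
        return None
    weighted = [weighted_vector(t, vectors, idf) for t in known]
    dim = len(weighted[0])
    return [sum(v[i] for v in weighted) / len(weighted) for i in range(dim)]


# Toy data: each document is one tokenized, normalized tweet.
tweets = [["good", "movie"], ["bad", "movie"], ["good", "day"]]

# Hypothetical 2-dimensional embeddings standing in for word2vec output.
vectors = {
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "movie": [0.0, 1.0],
    "day": [0.0, 0.5],
}

idf = compute_idf(tweets)
features = [tweet_vector(t, vectors, idf) for t in tweets]
```

With this weighting, a word that appears in fewer tweets ("bad", in one of three) gets a larger IDF than one that appears in more ("good", in two of three), so rarer words contribute more to the tweet vector.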