GlobalMaksimum / sadedegel

A General Purpose NLP library for Turkish
http://sadedegel.ai
MIT License
92 stars, 15 forks

Control product_sentiment model after data patch #258

husnusensoy closed 3 years ago

husnusensoy commented 3 years ago

The product sentiment dataset has been patched using a human-annotated subset of the dataset. Please try to improve the existing baseline classifier.

Note that I have detected lots of inconsistencies in the "human"-annotated data!!!

dafajon commented 3 years ago

There is no improvement with the patched version after optimization. The current model is still the highest performing. Predictions are under error analysis. This branch is up to date with the latest develop.

husnusensoy commented 3 years ago

Don't close issues without documentation or a detailed explanation?!?!

irmakyucel commented 3 years ago

Closing this issue as:

There is no improvement with the patched version after optimization. The current model is still the highest performing. Predictions are under error analysis. This branch is up to date with the latest develop.

Error analysis is complete. The errors found come from two sources: wrong labelling, and unknown tokens caused by words written with non-Turkish characters. The latter can possibly be solved by word normalization, which is in development (issue #143).
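To illustrate the unknown-token problem, here is a minimal sketch of one possible normalization approach. It assumes that "non-Turkish characters" means ASCII stand-ins for Turkish letters (for example "cok guzel" instead of "çok güzel") and that tokens are already lowercased. The `ASCII_TO_TURKISH` table, the `candidates` and `normalize` helpers, and the toy vocabulary are all hypothetical; this is not the sadedegel API and not necessarily what issue #143 implements.

```python
from itertools import product

# Hypothetical table: each ASCII letter maps to the letters it may stand in for.
ASCII_TO_TURKISH = {
    "c": "cç", "g": "gğ", "i": "iı", "o": "oö", "s": "sş", "u": "uü",
}


def candidates(token):
    """Yield every spelling obtained by optionally restoring Turkish letters."""
    options = [ASCII_TO_TURKISH.get(ch, ch) for ch in token]
    for combo in product(*options):
        yield "".join(combo)


def normalize(token, vocabulary):
    """Map an out-of-vocabulary token to its unique in-vocabulary restoration.

    The token is returned unchanged when it is already known, or when there
    are zero or several matching restorations (ambiguous cases are left alone).
    """
    if token in vocabulary:
        return token
    matches = [c for c in candidates(token) if c in vocabulary]
    return matches[0] if len(matches) == 1 else token


# Toy vocabulary, assumed for illustration only.
vocabulary = {"çok", "güzel", "ürün", "kötü"}

review = "cok guzel urun"
normalized = [normalize(t, vocabulary) for t in review.split()]
print(normalized)
```

The number of candidates doubles with every ambiguous letter, so a sketch like this is only practical for short tokens; a real normalizer would likely need a smarter search or a statistical model.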

Further improvements to the Product Review Sentiment model can be tracked in issue #221.