Open ayush1999 opened 6 years ago
Benchmarking our existing model against standard datasets would be a good place to start.
Sentiment analysis is a task that can be hard to transfer across different domains. Positive movie reviews may use different terminology when compared to positive video game reviews.
Should the emphasis be on making a generalizable model or making a decent model that you can easily augment with your own data in the target domain?
The latter, I would think. Either way, getting a good baseline and test cases would help.
I have a repository where I benchmarked the current system.
The performance is very poor at 52% accuracy (only slightly above random chance).
This may be related to issue #129.
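For anyone who wants to sanity-check a number like this, here is a minimal sketch of comparing model accuracy against a majority-class baseline. It is written in plain Python purely for illustration; the function names and the toy labels are made up and are not taken from the benchmark repository or from this package.

```python
from collections import Counter


def accuracy(predictions, labels):
    """Fraction of predictions that match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def majority_baseline(labels):
    """Accuracy of always predicting the most frequent label."""
    if not labels:
        raise ValueError("cannot compute a baseline on an empty dataset")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Hypothetical toy data: 1 = positive review, 0 = negative review.
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
predictions = [1, 1, 1, 0, 0, 0, 1, 1, 0, 0]

model_acc = accuracy(predictions, labels)
baseline_acc = majority_baseline(labels)
print(f"model: {model_acc:.2f}, majority baseline: {baseline_acc:.2f}")
```

On a balanced two-class dataset the majority baseline is 50%, so a model that does not clearly beat it is contributing very little.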
@aviks In issue #129, you said the proposed change breaks the test cases. Would you suggest fixing the existing model or writing a new one from scratch?
The current sentiment analysis model isn't very good and needs to be changed (as discussed with @aviks). Also, following the discussion in #83, it'd be better to warn the user before skipping words not in the vocabulary.
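To make the warning idea concrete, here is a rough sketch of what warn-then-skip could look like. It is plain Python for illustration only; `encode_tokens` and the toy vocabulary are hypothetical and do not reflect how the package currently handles out-of-vocabulary words.

```python
import warnings


def encode_tokens(tokens, vocab):
    """Map tokens to integer ids, warning about any that are not in vocab.

    Unknown tokens are still skipped, but the caller is told which ones.
    """
    ids = []
    skipped = []
    for token in tokens:
        if token in vocab:
            ids.append(vocab[token])
        else:
            skipped.append(token)
    if skipped:
        unique_skipped = sorted(set(skipped))
        warnings.warn(
            f"Skipping {len(skipped)} out-of-vocabulary word(s): {unique_skipped}",
            UserWarning,
            stacklevel=2,
        )
    return ids


# Hypothetical toy vocabulary.
vocab = {"great": 0, "movie": 1, "boring": 2}
ids = encode_tokens(["great", "gameplay", "movie"], vocab)
print(ids)
```

Emitting one warning per call, rather than one per word, keeps the output readable on long documents while still telling the user that input was dropped.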