mrpozzi closed 7 years ago
Some Reuters corpora have already been labeled. Maybe we can build a model based on this data.
Here is a list of datasets for me to investigate.
Notes:
Also conduct literature reviews on these datasets to see how they are labeled.
Potential papers:
Measures such as accuracy, precision, etc. and their invariance properties. The desirable measure will depend on how we define the problem statement and goal.
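To make the point about invariance concrete, here is a minimal sketch in Python. It assumes that "invariance" means a measure staying the same when one cell of the confusion matrix changes (here, the number of true negatives). That reading is an assumption, not something stated in this issue. The `Confusion` class and the counts are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion-matrix counts (hypothetical helper for illustration)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def accuracy(c: Confusion) -> float:
    # Fraction of all predictions that are correct.
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def precision(c: Confusion) -> float:
    # Fraction of predicted positives that are truly positive.
    # Raises ZeroDivisionError if nothing was predicted positive.
    return c.tp / (c.tp + c.fp)


# Illustrative counts only, not taken from any real dataset.
base = Confusion(tp=40, fp=10, fn=20, tn=30)
# Same classifier behaviour on positives, but 100 extra true negatives.
more_negatives = Confusion(tp=40, fp=10, fn=20, tn=130)

for name, c in [("base", base), ("more_negatives", more_negatives)]:
    print(f"{name}: accuracy={accuracy(c):.2f} precision={precision(c):.2f}")
```

In this sketch precision is unchanged by the extra true negatives while accuracy moves from 0.70 to 0.85, which is one reason the choice of measure should follow from the problem definition, for example how unbalanced the labeled classes turn out to be.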
This entails both a literature review to understand how they have been labeled and some data mining to find already-labeled text as a basis for building a sentiment index.