GlobalMaksimum / sadedegel

A General Purpose NLP library for Turkish
http://sadedegel.ai
MIT License
92 stars 15 forks

Implement Product Review Sentiment Dataset #219

Closed: onatyap closed this issue 3 years ago

onatyap commented 3 years ago

Previous datasets were rejected because of their size, so I found a larger one on Hugging Face: turkish_product_reviews

This dataset has not been benchmarked, but the benchmarked datasets are hybrid: they contain movie reviews, Twitter posts, and hotel reviews, which could be treated as separate datasets.

husnusensoy commented 3 years ago

Thanks for your recommendation. Here are two options for you:

  1. If there is already an academic work that we can benchmark ourselves against, please refer to it so that we can apply its method.
  2. Otherwise, let's assume that this is a uniform dataset and see how well we perform on such a mixed corpus ;)

Either way, we are waiting for your baseline model.

Regards,

dafajon commented 3 years ago

PR #226 is merged.