GlobalMaksimum / sadedegel

A General Purpose NLP library for Turkish
http://sadedegel.ai
MIT License

Food Delivery Reviews Classification Prebuilt Model #290

Open ertugrul-dmr opened 3 years ago

ertugrul-dmr commented 3 years ago

To cover more use cases with our prebuilt function, we could implement a food delivery review classification model. There are decent datasets available on Kaggle here and here (we're going to merge both).

For this, the following steps are going to be taken:

dafajon commented 3 years ago

Can you explain above how you prepared the sentiment_class column from the category points? Please also share the distribution of labels before and after the merge. That way your dataset proposal will have a valid, grounded explanation, since this is not something benchmarked in a paper but a dataset we found and prepared for the food domain.

ertugrul-dmr commented 3 years ago

The original dataset creator suggested averaging the three category scores and labeling a review "Positive" if the mean is greater than or equal to 7, otherwise "Negative". Although that gives decent results, after some trials I decided to use the minimum of the three categories instead. I believe the real "sentiment" of a review follows its lowest-scoring category, and the text usually contains the complaint about that topic. So if a user gives 1, 10, 10, taking the mean yields a positive label, but the text usually expresses negative sentiment about the category scored 1. I got a better F1 score with this approach.
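The two labeling rules described above can be compared with a minimal sketch. The function names are hypothetical, and reusing the threshold of 7 for the minimum-based rule is an assumption; the thread only states that threshold for the mean-based rule.

```python
from statistics import mean

# Threshold stated in the thread for the mean-based rule.
# Reusing it for the minimum-based rule is an assumption.
THRESHOLD = 7


def label_by_mean(scores, threshold=THRESHOLD):
    """Label from the mean of the category scores (dataset creator's suggestion)."""
    return "Positive" if mean(scores) >= threshold else "Negative"


def label_by_min(scores, threshold=THRESHOLD):
    """Label from the lowest category score (the approach proposed in this issue)."""
    return "Positive" if min(scores) >= threshold else "Negative"


# The 1, 10, 10 example from the comment above.
example = (1, 10, 10)
print(label_by_mean(example))  # Positive: the mean is exactly 7
print(label_by_min(example))   # Negative: the lowest score is 1
```

The two rules disagree only on reviews where at least one category falls below the threshold while the mean stays at or above it, which is the case the comment argues should be treated as negative.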

As for the distribution, I'll share the results with you after I double-check the process...
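One way to report the requested label distribution before and after the merge is sketched below. The helper name and the label lists are hypothetical toy placeholders, not values from the actual datasets.

```python
from collections import Counter


def label_distribution(labels):
    """Return the share of each label as a fraction of all labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Toy placeholder labels for illustration only, not real data.
dataset_a = ["Positive", "Positive", "Negative", "Positive"]
dataset_b = ["Negative", "Negative", "Positive", "Negative"]
merged = dataset_a + dataset_b

for name, labels in [("dataset_a", dataset_a), ("dataset_b", dataset_b), ("merged", merged)]:
    print(name, label_distribution(labels))
```

Reporting the shares for each source separately as well as for the merged set makes it visible whether the merge shifts the class balance.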