Food Reviews Classification "Multilabel" Version (Prebuilt Model)

GlobalMaksimum / sadedegel

A General Purpose NLP library for Turkish

http://sadedegel.ai

MIT License

93 stars, 15 forks

Food Reviews Classification "Multilabel" Version (Prebuilt Model) #292

Open ertugrul-dmr opened 2 years ago

ertugrul-dmr commented 2 years ago

To add another use case for our prebuilt models, we could implement a multilabel classification model for food delivery reviews. There are decent datasets available on Kaggle here and here (we're going to merge and preprocess both).

Together, the two datasets contain more than 550K instances after preprocessing and removing duplicates.
The data was collected from Turkish food delivery websites.
It contains food delivery reviews along with scores for several aspects of the service.
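
As a rough sketch of the merge-and-deduplicate step: the column names (`review`, `speed`, `service`, `flavor`), the score values, and the normalization rule below are all assumptions for illustration, and the real Kaggle files may look different.

```python
import re


def normalize(text):
    """Collapse whitespace and lowercase so near-identical reviews compare equal.

    Note: str.lower() is not Turkish-aware (I/İ handling), so a real
    pipeline may want a locale-aware lowercasing step instead.
    """
    return re.sub(r"\s+", " ", text).strip().lower()


def merge_reviews(*datasets):
    """Concatenate datasets, keeping the first occurrence of each normalized review."""
    seen = set()
    merged = []
    for dataset in datasets:
        for row in dataset:
            key = normalize(row["review"])
            if not key or key in seen:
                continue  # skip empty reviews and duplicates
            seen.add(key)
            merged.append(row)
    return merged


# Toy rows standing in for the two Kaggle datasets (hypothetical schema).
dataset_a = [
    {"review": "Yemek çok lezzetliydi", "speed": 9, "service": 8, "flavor": 10},
    {"review": "Sipariş geç geldi", "speed": 2, "service": 5, "flavor": 7},
]
dataset_b = [
    {"review": "yemek çok  lezzetliydi ", "speed": 9, "service": 8, "flavor": 10},
    {"review": "Kurye çok kibardı", "speed": 8, "service": 10, "flavor": 6},
]

merged = merge_reviews(dataset_a, dataset_b)
print(len(merged))  # 3: the first row of dataset_b duplicates one in dataset_a
```

Deduplicating on normalized text rather than the raw string catches copies that differ only in casing or spacing, which seems likely when merging two scrapes of similar sites.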

The main difference between this and #290 is:

#290 predicts the overall binary sentiment of a given text (POSITIVE, NEGATIVE), whereas the multilabel version predicts a binary label for each category ('speed', 'service', 'flavor'), giving more fine-grained results.
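
To make the difference concrete, here is a small illustration of the two target formats. The 1-10 score scale, the threshold of 6, and the use of a mean for the overall label are assumptions made for this example only; how #290 actually derives its labels is not stated here.

```python
CATEGORIES = ("speed", "service", "flavor")


def to_binary(scores, threshold=6):
    """Single overall label, in the spirit of #290 (assumed: mean of the scores)."""
    mean = sum(scores[c] for c in CATEGORIES) / len(CATEGORIES)
    return "POSITIVE" if mean >= threshold else "NEGATIVE"


def to_multilabel(scores, threshold=6):
    """One 0/1 label per category (1 = positive); the threshold is an assumption."""
    return {c: int(scores[c] >= threshold) for c in CATEGORIES}


# Hypothetical review: slow delivery, but good service and flavor.
review_scores = {"speed": 2, "service": 7, "flavor": 9}

print(to_binary(review_scores))      # POSITIVE
print(to_multilabel(review_scores))  # {'speed': 0, 'service': 1, 'flavor': 1}
```

The single label hides the slow delivery, while the per-category labels keep it visible.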

To do this, the following steps will be taken:

Prepare the dataset for sentiment classification (keeping the actual labels this time),
Build/optimize a model and evaluate it (macro F1 score),
Publish the more category-specific prebuilt model as an alternative to the previous binary version.
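
For the evaluation step, a minimal sketch of macro F1 in the multilabel setting, assuming it means "compute F1 on the positive class of each category, then take the unweighted mean across categories". The toy labels and predictions below are made up for illustration.

```python
def f1_for_label(y_true, y_pred):
    """F1 of the positive class for one category, given parallel lists of 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0  # define F1 as 0 when undefined


def macro_f1(y_true, y_pred, categories):
    """Unweighted mean of the per-category F1 scores."""
    per_label = [
        f1_for_label([row[c] for row in y_true], [row[c] for row in y_pred])
        for c in categories
    ]
    return sum(per_label) / len(per_label)


categories = ("speed", "service", "flavor")

# Made-up gold labels and predictions for four reviews.
y_true = [
    {"speed": 1, "service": 1, "flavor": 1},
    {"speed": 0, "service": 1, "flavor": 1},
    {"speed": 1, "service": 0, "flavor": 0},
    {"speed": 0, "service": 0, "flavor": 1},
]
y_pred = [
    {"speed": 1, "service": 1, "flavor": 1},
    {"speed": 1, "service": 1, "flavor": 1},
    {"speed": 1, "service": 0, "flavor": 1},
    {"speed": 0, "service": 1, "flavor": 1},
]

score = macro_f1(y_true, y_pred, categories)
print(f"macro F1: {score:.3f}")
```

Because every category contributes equally to the mean, a model that does well on 'flavor' but poorly on 'speed' is penalized even if 'flavor' labels dominate the data.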