GlobalMaksimum / sadedegel

A General Purpose NLP library for Turkish
http://sadedegel.ai
MIT License

Tweet Sentiment Prebuilt Model #100

Closed (husnusensoy closed 3 years ago)

husnusensoy commented 3 years ago

Can somebody use sadedeGel to create a supervised classifier for any text classification problem in Turkish?

husnusensoy commented 3 years ago

This can be added to the repository as a Binder notebook.

askarbozcan commented 3 years ago

Do you have any datasets in mind? I found this: https://github.com/sercankulcu/twitterdata

Will implement it with tfidf vectors as soon as other issues are fixed.

UPD: Another (better) candidate: https://coltekin.github.io/offensive-turkish/
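For context, the TF-IDF approach mentioned here could look roughly like the sketch below. It is a pure-Python illustration, not sadedeGel's own tfidf implementation: the toy sentences, the whitespace tokenizer, and the helper names (`fit_idf`, `tfidf`, `fit_centroids`, `predict`) are all assumptions made for the example, and a nearest-centroid rule stands in for whatever classifier is eventually chosen.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data; a real run would use one of the datasets linked above.
TRAIN = [
    ("bu film harika", "pos"),
    ("çok güzel bir gün", "pos"),
    ("harika bir deneyim", "pos"),
    ("bu film berbat", "neg"),
    ("çok kötü bir gün", "neg"),
    ("berbat bir deneyim", "neg"),
]


def tokenize(text):
    # Whitespace split stands in for a proper Turkish tokenizer.
    return text.split()


def fit_idf(docs):
    # A common smoothed idf variant: ln((1 + n) / (1 + df)) + 1
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + c)) + 1 for tok, c in df.items()}


def tfidf(doc, idf):
    # L2-normalised tf-idf vector as a sparse dict; unseen tokens are dropped.
    tf = Counter(tok for tok in doc if tok in idf)
    vec = {tok: cnt * idf[tok] for tok, cnt in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {tok: v / norm for tok, v in vec.items()} if norm else {}


def fit_centroids(vectors, labels):
    # Average the tf-idf vectors of each class.
    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter(labels)
    for vec, label in zip(vectors, labels):
        for tok, v in vec.items():
            sums[label][tok] += v
    return {
        label: {tok: v / counts[label] for tok, v in acc.items()}
        for label, acc in sums.items()
    }


def predict(text, idf, centroids):
    # Pick the class whose centroid has the largest dot product with the text.
    vec = tfidf(tokenize(text), idf)

    def score(label):
        return sum(v * centroids[label].get(tok, 0.0) for tok, v in vec.items())

    return max(sorted(centroids), key=score)


docs = [tokenize(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(docs)
centroids = fit_centroids([tfidf(doc, idf) for doc in docs], labels)

print(predict("harika bir film", idf, centroids))
print(predict("berbat bir film", idf, centroids))
```

Swapping the tokenizer and the classifier for stronger components should not change the overall shape: fit the vocabulary statistics on training data only, then reuse them at prediction time.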

husnusensoy commented 3 years ago

Either is fine. We are not aiming to solve a single problem. Let's build several applications and publish our best results per problem. I will keep this issue open and open two more tickets.

husnusensoy commented 3 years ago

@askarbozcan can you explain the issue you have with tfidf? "Will implement it with tfidf vectors as soon as other issues are fixed."

askarbozcan commented 3 years ago

@askarbozcan can you explain the issue you have with tfidf?

I meant as soon as other PRs related to tfidf get merged into develop.

husnusensoy commented 3 years ago

I think you don't need to wait for that. Implement it with the as-is capabilities and make sure the results are reproducible. Once things are updated, we can rerun to obtain new results.
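One way to keep results reproducible across reruns is to fix the seed of every random step, starting with the train/test split. The sketch below is a minimal illustration with the standard library only; the function name `train_test_split`, the default seed, and the placeholder data are assumptions for the example, not part of sadedeGel.

```python
import random


def train_test_split(items, test_ratio=0.25, seed=42):
    # A local RNG keeps the split deterministic without touching global state.
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder ids standing in for tweets.
tweets = list(range(20))

first_train, first_test = train_test_split(tweets)
second_train, second_test = train_test_split(tweets)

# Same seed, same split: any score computed on it can be compared across reruns.
print(first_test == second_test)
```

Recording the seed next to the published score makes it possible to tell whether a later change in results comes from the library update or from a different split.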

husnusensoy commented 3 years ago

@askarbozcan can you come up with a prebuilt model, similar to sadedegel.prebuilt.news_classifier, for Twitter sentiment analysis?

husnusensoy commented 3 years ago

Obviously you can properly package notebook/Basic Tweet Classification with Sadedegel.ipynb
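As a rough idea of what packaging could involve, the sketch below stores model parameters as a file and exposes a single `load()` entry point that returns a ready-to-use classifier. Everything here is hypothetical: the `MODEL` weights, the JSON layout, and the `save`/`load` names are assumptions for illustration and do not describe the actual interface of sadedegel.prebuilt.news_classifier or the contents of the notebook.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical learned weights: positive values lean "pos", negative lean "neg".
MODEL = {
    "name": "tweet_sentiment_demo",
    "version": 1,
    "weights": {"harika": 1.5, "güzel": 1.0, "berbat": -1.5, "kötü": -1.0},
}


def save(model, path):
    # Persist the parameters as UTF-8 JSON so Turkish tokens stay readable.
    Path(path).write_text(json.dumps(model, ensure_ascii=False), encoding="utf-8")


def load(path):
    # Read the artifact once and return a predict function bound to its weights.
    model = json.loads(Path(path).read_text(encoding="utf-8"))
    weights = model["weights"]

    def predict(text):
        score = sum(weights.get(tok, 0.0) for tok in text.split())
        return "pos" if score >= 0 else "neg"

    return predict


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "tweet_sentiment.json"
    save(MODEL, artifact)
    classifier = load(artifact)

print(classifier("bu film harika"))
print(classifier("çok kötü bir gün"))
```

The point of the design is that users never retrain: the training code lives in the notebook, while the package ships only the artifact and the loader.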