smc / malayalam-text-classifier

Malayalam Text Classifier
MIT License

🆘 Datasets for Malayalam Text Classification #1

Open kurianbenoy opened 2 years ago

kurianbenoy commented 2 years ago

Text Classification is the task of assigning a label or class to a given text. Some use cases are sentiment analysis, natural language inference, and assessing grammatical correctness.
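
To make "assigning a label to a given text" concrete, here is a minimal sketch of a word-count naive Bayes classifier using only the Python standard library. The function names `train` and `classify` and the tiny English `toy_examples` list are made up for illustration; they are not part of this repository and not a real dataset. A real Malayalam classifier would also need proper tokenization and far more labelled data.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Collect the counts a naive Bayes classifier needs.

    examples: list of (text, label) pairs (hypothetical toy data below).
    """
    label_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # word frequencies per label
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.split():       # naive whitespace tokenization
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_docs)  # log prior
        # Add-one (Laplace) smoothing so unseen words do not zero the score.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up examples, only to show the input/output shape.
toy_examples = [
    ("team wins the final match", "sports"),
    ("striker scores late goal", "sports"),
    ("market shares fall sharply", "business"),
    ("bank raises interest rates", "business"),
]
model = train(toy_examples)
print(classify("late goal wins match", model))
```

With only a handful of labelled examples the predictions are not meaningful, which is why a larger labelled corpus matters.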

At the moment, very few Malayalam datasets are labelled in any category: only around 10-15 labelled datasets are available.

Datasets found so far are:

It would be a great help if you could find any useful Malayalam text corpora or datasets and post them in a comment here. We are especially looking for a labelled text corpus in the news category for the preliminary analysis.

kurianbenoy commented 2 years ago

Dataset found:

https://huggingface.co/datasets/rajeshradhakrishnan/malayalam_news