sloria / TextBlob

Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
https://textblob.readthedocs.io/
MIT License
9.16k stars, 1.15k forks

MemoryError when training NaiveBayesClassifier #248

Open cmazzoni87 opened 5 years ago

cmazzoni87 commented 5 years ago

I am encountering a memory issue when trying to train my model on a large dataset (140k samples of 200 words each). Is there a way to train the model incrementally?

Akshay2350 commented 5 years ago

First, do some preprocessing and remove unnecessary words from your dataset. By preprocessing I mean removing stopwords, punctuation, etc., so that the samples look more similar to each other. Then, if possible, use online platforms like Kaggle, Google Colab, etc. They are cloud platforms capable of running ML models with higher system requirements. If you still get a memory error, then go for the solutions below.

I know it's not enough, but I don't know of any other way to work with big data.
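
The preprocessing step suggested above (lowercasing, stripping punctuation, dropping stopwords) could look roughly like the sketch below. It uses only the standard library. The `STOPWORDS` set, the function names `preprocess` and `preprocess_dataset`, and the two sample reviews are all made up for illustration; a real project would probably use a fuller stopword list. Whether this shrinks the vocabulary enough to avoid the MemoryError on 140k samples is not something this sketch verifies.

```python
import string

# Hypothetical, deliberately tiny stopword list for illustration only;
# a real project would likely use a fuller list.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "and", "or",
             "of", "to", "in", "it", "this"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text):
    """Lowercase, strip punctuation, and drop stopwords from one document."""
    cleaned = text.lower().translate(_PUNCT_TABLE)
    return " ".join(w for w in cleaned.split() if w not in STOPWORDS)


def preprocess_dataset(samples):
    """Lazily yield (cleaned_text, label) pairs from (text, label) pairs."""
    for text, label in samples:
        yield preprocess(text), label


# Made-up sample data in (text, label) form.
train = [
    ("This is a GREAT movie, and the cast is superb!", "pos"),
    ("The plot was dull; it is a waste of time.", "neg"),
]
cleaned_train = list(preprocess_dataset(train))
```

The result keeps the list-of-`(text, label)`-tuples shape, so it can be passed to `NaiveBayesClassifier` in place of the raw data. A smaller vocabulary means fewer distinct words for the classifier to track, which is the reason this step is worth trying before moving to a bigger machine.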