brightmart / text_classification

all kinds of text classification models and more with deep learning
MIT License
7.87k stars 2.57k forks

Getting the data is a big burden #132

Open rbaral opened 5 years ago

rbaral commented 5 years ago

Thanks for sharing the model. I was interested in running your models with the data you mention, but it was not possible. I spent quite a lot of time trying to get the data. Getting the Baidu app and downloading the data through it was a nightmare. I also tried to preprocess the data as you described, but there are many other dependencies. It seems that even the preprocessing uses some intermediate/temporary files, and these files are only available on the Baidu network. I signed up for the Baidu app, but it does not recognize non-Chinese phone numbers. I tried a lot and gave up. Is it possible to host the data somewhere else?