jiegzhan / multi-class-text-classification-cnn-rnn

Classify Kaggle San Francisco Crime Description into 39 classes. Build the model with CNN, RNN (GRU and LSTM) and Word Embeddings on TensorFlow.
https://www.kaggle.com/c/sf-crime/data
Apache License 2.0
599 stars 262 forks

How to change parameters for much bigger text in Russian? #30

Open glorsh66 opened 7 years ago

glorsh66 commented 7 years ago

Thanks for your code; it's pretty much exactly what I was looking for. But I need to classify longer texts (around 500 words per article), and they will be in Russian. Can you advise how to adapt the code for this?

What do I need to change in the config file? "batch_size": 256, "dropout_keep_prob": 0.5, "embedding_dim": 300, "evaluate_every": 100, "filter_sizes": "3,4,5", "hidden_unit": 300, "l2_reg_lambda": 0.0, "max_pool_size": 4, "non_static": false, "num_epochs": 1, "num_filters": 128

What do I need to change for longer sentences?

jiegzhan commented 7 years ago

You will be able to adjust the training parameters once you understand what they are. http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/
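
For readers landing on this thread, here is a minimal sketch of what adjusting the parameters for longer inputs might look like. It is not taken from the repository: it assumes the common approach of padding or truncating every document to a fixed length, and it assumes unpadded ("VALID") convolutions when computing feature-map lengths. The names `pad_or_truncate`, `conv_output_lengths`, and `MAX_LEN`, and the new values for `batch_size` and `num_epochs`, are hypothetical starting points to tune, not recommendations from the author.

```python
import json

# Parameters quoted in the question above, as a JSON string.
CONFIG_JSON = """
{"batch_size": 256, "dropout_keep_prob": 0.5, "embedding_dim": 300,
 "evaluate_every": 100, "filter_sizes": "3,4,5", "hidden_unit": 300,
 "l2_reg_lambda": 0.0, "max_pool_size": 4, "non_static": false,
 "num_epochs": 1, "num_filters": 128}
"""


def pad_or_truncate(tokens, max_len, pad_token="<PAD>"):
    """Return exactly max_len tokens: cut long inputs, pad short ones."""
    clipped = tokens[:max_len]
    return clipped + [pad_token] * (max_len - len(clipped))


def conv_output_lengths(seq_len, filter_sizes):
    """Feature-map length per filter size for an unpadded 1-D convolution."""
    return {size: seq_len - size + 1 for size in filter_sizes}


config = json.loads(CONFIG_JSON)
filter_sizes = [int(s) for s in config["filter_sizes"].split(",")]

# Hypothetical adjustments for ~500-word articles (assumptions, to be tuned):
MAX_LEN = 500               # fixed document length in tokens
config["batch_size"] = 64   # longer sequences use more memory per example
config["num_epochs"] = 10   # a single epoch is rarely enough on a new dataset

# A short Russian example, split on whitespace for illustration only.
article = "пример короткой статьи на русском языке".split()
padded = pad_or_truncate(article, MAX_LEN)
lengths = conv_output_lengths(MAX_LEN, filter_sizes)

print(len(padded), lengths)
```

Whitespace splitting is only a placeholder here; Russian text would likely benefit from a proper tokenizer and from word embeddings trained on Russian.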

glorsh66 commented 7 years ago

Thx for the link!