richliao / textClassifier

Text classifier for Hierarchical Attention Networks for Document Classification
Apache License 2.0
1.07k stars, 379 forks

Performance on Yelp 2015 (HAN) #46

Open Karlguo opened 4 years ago

Karlguo commented 4 years ago

I cannot reproduce the result reported in the paper. I used the same dataset (download link: http://ir.hit.edu.cn/~dytang/paper/emnlp2015/emnlp-2015-data.7z), but I only get 68.5% accuracy on Yelp 2015, while the paper reports 71%. Is there anything wrong with my parameters?

Here are my parameters:

vocab_size: 49000 (byte-pair encoding with 50000 byte pairs; keeping all tokens that appear at least 5 times)
learning_rate: 0.001
max tokens in a sentence: 48 (over 95% of sentences are shorter than 48 tokens)
max sentences in a document: 32 (over 95% of docs are shorter than 32 sentences)
word_embedding_size: 300 (pre-trained with word2vec)
word_output_size: 128
sentence_output_size: 128
LSTM hidden_dim: 64
LSTM layer_num: 5
dropout_keep_prob: 0.8 (using tf.nn.dropout; dropout is applied after word_output and sentence_output)
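For anyone comparing preprocessing, here is a minimal sketch of how length cutoffs such as 48 tokens and 32 sentences could be derived from a 95% coverage target and then applied to a document. It is not the code used above: `coverage_cutoff`, `pad_document`, and `PAD_ID` are hypothetical names, and it assumes documents are already tokenized into lists of integer token ids.

```python
import math

PAD_ID = 0  # assumption: id 0 is reserved for padding


def coverage_cutoff(lengths, coverage=0.95):
    """Return the smallest observed length L such that at least
    `coverage` of the items have length <= L."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths)
    # Position of the item that brings cumulative coverage up to the target.
    idx = math.ceil(coverage * len(ordered)) - 1
    return ordered[max(idx, 0)]


def pad_document(doc, max_sentences, max_tokens, pad_id=PAD_ID):
    """Truncate or pad a document (a list of token-id lists) so that it has
    exactly `max_sentences` rows of `max_tokens` ids each."""
    rows = []
    for sentence in doc[:max_sentences]:
        clipped = sentence[:max_tokens]
        rows.append(clipped + [pad_id] * (max_tokens - len(clipped)))
    # Add all-padding sentences if the document is too short.
    while len(rows) < max_sentences:
        rows.append([pad_id] * max_tokens)
    return rows


if __name__ == "__main__":
    # Toy corpus: two documents with sentences of varying length.
    corpus = [
        [[5, 6, 7], [8]],
        [[9, 10], [11, 12, 13, 14], [15]],
    ]
    max_tokens = coverage_cutoff([len(s) for d in corpus for s in d])
    max_sentences = coverage_cutoff([len(d) for d in corpus])
    padded = [pad_document(d, max_sentences, max_tokens) for d in corpus]
    print(max_sentences, max_tokens, padded[0])
```

Note that truncation silently drops text from the longest reviews, so when comparing against the paper's number it may be worth checking how much of the corpus actually exceeds the chosen cutoffs.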