cjiang2 / VDCNN

Implementation of Very Deep Convolutional Neural Network for Text Classification

accuracy question when running vdcnn29 #2

Closed wangtao1321 closed 3 years ago

wangtao1321 commented 6 years ago

Hello, while running vdcnn9 on the AG dataset without k-maxpooling, I got the same result as the paper (89.83%). But with vdcnn29 and k-maxpooling, I only got 90.4% accuracy while the paper reports 91.27%. I want to know what accuracy you got with this network. Thanks!

wangtao1321 commented 6 years ago

Another problem is that when I run yelp_review_full_csv or Sogou, an error occurs:

Loading data...
Traceback (most recent call last):
  File "train.py", line 40, in <module>
    train_data, train_label, test_data, test_label = data_helper.load_dataset(FLAGS.database_path)
  File "/home/wangtao/VDCNN/data_helper.py", line 48, in load_dataset
    train_data, train_label = load_csv_file(dataset_path+'train.csv', num_classes)
  File "/home/wangtao/VDCNN/data_helper.py", line 26, in load_csv_file
    text = row['fields'][1].lower()
IndexError: list index out of range

cjiang2 commented 6 years ago
  1. Since yelp_review_full has one less column than ag's news and is also missing class.txt, a small modification to data_helper is needed. Something like this: line 26 in data_helper: text = row['fields'][1].lower() -> text = row['fields'][0].lower() (see the loader sketch after this list).

  2. Tensorflow doesn't provide an official k-maxpooling operation, so it has to be done with a native implementation (hence the k-maxpooling implementation I am currently using). According to the original paper, there should be three types of downsampling: max-pooling, k-maxpooling, and convolutions with stride 2, and all of them should appear between blocks. So my current k-maxpooling implementation is actually not correct, and that's why it's not getting a satisfying result. :(

  3. There are still some other missing components. For example, the paper states that they use a different type of batch normalization, while my current implementation only uses the batch norm function that Tensorflow provides. There is also a chance of about a 1% accuracy difference between frameworks. Anyway, it is usually hard to recreate a whole experiment given different frameworks, implementations and tons of other factors. Hopefully Facebook can release their implementation. :(
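For reference, here is a minimal sketch of a loader that handles both csv layouts (referred to in point 1 above). The function name, the text_columns parameter and the column indices are illustrative assumptions, not the actual code in data_helper.py:

```python
import csv

def load_csv_file(filename, num_classes, text_columns=(1, 2)):
    """Load a character-level text classification csv file.

    AG's News rows look like:        label, title, description -> text_columns=(1, 2)
    Yelp Review Full rows look like: label, review_text        -> text_columns=(1,)
    """
    data, labels = [], []
    with open(filename, encoding='utf-8') as f:
        for row in csv.reader(f):
            label = int(row[0]) - 1                # class labels in the csv are 1-based
            one_hot = [0] * num_classes
            one_hot[label] = 1
            labels.append(one_hot)
            data.append(' '.join(row[i] for i in text_columns).lower())
    return data, labels
```

For yelp_review_full you would then call it as load_csv_file(dataset_path + 'train.csv', 5, text_columns=(1,)).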

wangtao1321 commented 6 years ago

@zonetrooper32, thanks for your reply. I want to know the difference between the native implementation of k-maxpooling and the k-maxpooling used in the VDCNN paper. I supposed that k-maxpooling is exactly a top-k operation. Could you tell me what the differences between them are?

cjiang2 commented 6 years ago

@wangtao1321 Hi, sorry for taking so long to reply; things have been really busy for me recently.

The current implementation does not support k-maxpooling; the old implementation only used max-pooling. However, I'm rewriting the whole implementation to add all three down-sampling methods, including k-maxpooling.

Defining k-maxpooling, according to the paper: "Ki is followed by a k-max pooling layer where k is such that the resolution is halved". So the point is that we extract the k most important features while halving the resolution. This can be done simply with top-k operations, plus a little hacky trick to calculate the proper k, as in the sketch below.
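As a rough sketch of that idea (assuming a TensorFlow 2.x-style API, inputs laid out as [batch, time, channels], and an illustrative helper name rather than the code I will actually commit):

```python
import tensorflow as tf

def k_max_pooling(inputs, k):
    """Keep the k largest activations of each channel, in their original
    temporal order. Choosing k = sequence_length // 2 halves the resolution,
    as described in the paper."""
    x = tf.transpose(inputs, perm=[0, 2, 1])         # [batch, channels, time]
    _, indices = tf.nn.top_k(x, k=k, sorted=False)   # positions of the k largest values
    indices = tf.sort(indices, axis=-1)              # restore the original temporal order
    values = tf.gather(x, indices, batch_dims=2)     # [batch, channels, k]
    return tf.transpose(values, perm=[0, 2, 1])      # [batch, k, channels]
```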

I'll be updating the code once I'm done with my term tests. :(
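In the meantime, here is a sketch of how the three down-sampling options could be switched between convolutional blocks, reusing the k_max_pooling helper above (again only an illustration under the same assumptions, with kernel size 3 and stride 2 as in the paper):

```python
def downsample(inputs, method='maxpool'):
    """Halve the temporal resolution between two convolutional blocks.
    inputs: [batch, time, channels] with a statically known sequence length."""
    if method == 'maxpool':
        # Plain max-pooling with kernel size 3 and stride 2.
        return tf.keras.layers.MaxPool1D(pool_size=3, strides=2, padding='same')(inputs)
    elif method == 'k_maxpool':
        # k chosen so that the resolution is halved.
        return k_max_pooling(inputs, k=inputs.shape[1] // 2)
    elif method == 'conv':
        # Convolution with stride 2, keeping the number of channels.
        return tf.keras.layers.Conv1D(filters=inputs.shape[-1], kernel_size=3,
                                      strides=2, padding='same')(inputs)
    raise ValueError('Unknown down-sampling method: %s' % method)
```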

akhileshkumargangwar commented 5 years ago

@wangtao1321 Hello, I have used the AG dataset but I am getting low accuracy and the model seems overfitted. Are any changes required in the code? Thanks

zhongzhh8 commented 5 years ago

@akhileshkumargangwar I get a constant 25% accuracy on AG_NEWS and do not know how to solve this problem. How did you solve it? Please help me.