ArdalanM / nlp-benchmarks


MemoryError #8

Closed zjfheart closed 5 years ago

zjfheart commented 5 years ago

Hi, I get a MemoryError when running your code on the ag_news dataset.

parameters: {'gpu': True, 'test_batch_size': 512, 'depth': 9, 'batch_size': 128, 'test_interval': 50, 'iterations': 1000, 'chunk_size': 2048, 'shortcut': False, 'lr': 0.01, 'maxlen': 1024, 'dataset': 'ag_news', 'lr_halve_interval': 100, 'model_folder': '/home/jingfeng/blockchain_proj/nlp/nlp-benchmarks-master/models/VDCNN/imdb', 'seed': 1337, 'last_pooling_layer': 'k-max-pooling', 'shuffle': False, 'class_weights': None} dataset: AgNews, n_classes: 4

My machine has 252G of memory and 60G of swap.

ArdalanM commented 5 years ago

Hi guys,

I see that you load 28377303 samples, which is a lot. That is why you get a memory error even though you are running on a strong setup :)
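As a rough back-of-the-envelope check of why that sample count is a problem, the sketch below estimates the memory needed to hold everything in one dense array. It is not taken from the repo: it assumes each sample is stored as a row of `maxlen` = 1024 fixed-width integer ids, and the helper names `dense_array_bytes` and `to_gib` are hypothetical.

```python
# Rough memory estimate for holding the whole dataset as one dense array.
# Assumption (not confirmed by the repo): one row of MAXLEN integer ids per sample.

N_SAMPLES = 28377303  # sample count quoted in the comment above
MAXLEN = 1024         # 'maxlen' from the posted parameters


def dense_array_bytes(n_samples, maxlen, bytes_per_id):
    """Bytes needed for an (n_samples, maxlen) array of fixed-width ids."""
    return n_samples * maxlen * bytes_per_id


def to_gib(n_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    # Compare a few common integer widths for the id array.
    for dtype_name, width in [("int64", 8), ("int32", 4), ("uint8", 1)]:
        size = dense_array_bytes(N_SAMPLES, MAXLEN, width)
        print(f"{dtype_name}: {to_gib(size):.1f} GiB")
```

Under these assumptions the int64 case alone needs about 216 GiB, which is already on the order of the 252G of RAM reported above, before counting any temporary copies made during preprocessing.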

I updated master for vdcnn. It should now run smoothly and faster than before (thanks to lmdb).

  1. Clone the repo

    git clone https://github.com/ArdalanM/nlp-benchmarks.git
  2. Go to vdcnn folder

    cd nlp-benchmarks/src/vdcnn
  3. Train the model

    ./run.sh

Edit the run.sh file to train on the other available datasets:

'ag_news' 'db_pedia' 'yelp_review' 'yelp_polarity' 'amazon_review' 'amazon_polarity' 'sogu_news' 'yahoo_answer' 'imdb'

Let me know if it works for you guys.

zjfheart commented 5 years ago

@ArdalanM Thanks for the updated version. It is much better.

I ran it on ag_news but cannot reach the accuracy reported in the paper; there is normally a 3-6% accuracy gap. The result also depends strongly on the initialization and the learning rate chosen.

I think the original paper may use some magic and tricks that the authors didn't elaborate on.

andreabduque commented 5 years ago

I achieved 0.9 accuracy on ag_news after training for 60 epochs, which is very close to the paper. However, this number of epochs is very different from the 15 epochs the authors report in the paper.

PS: I also had to add a momentum of 0.9 to achieve this, which the code in this repo did not include.
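For readers unfamiliar with the term, here is a minimal pure-Python sketch of the SGD-with-momentum update. It is an illustration only, not the repo's training code: it follows the update convention used by PyTorch's `torch.optim.SGD` (velocity = momentum * velocity + grad, then w = w - lr * velocity), and the function names and the toy objective f(w) = w^2 are made up for the example. The learning rate 0.01 and momentum 0.9 are the values mentioned in this thread.

```python
def sgd_step(w, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD update with momentum on a scalar parameter.

    With momentum=0.0 this reduces to plain SGD: w <- w - lr * grad.
    """
    velocity = momentum * velocity + grad  # accumulate past gradients
    w = w - lr * velocity                  # step along the accumulated direction
    return w, velocity


def minimize(momentum, steps=100, lr=0.01, w0=1.0):
    """Minimize the toy objective f(w) = w**2 starting from w0."""
    w, velocity = w0, 0.0
    for _ in range(steps):
        grad = 2.0 * w  # derivative of w**2
        w, velocity = sgd_step(w, grad, velocity, lr=lr, momentum=momentum)
    return w


if __name__ == "__main__":
    print("plain SGD     :", minimize(momentum=0.0))
    print("momentum = 0.9:", minimize(momentum=0.9))
```

On this toy problem the momentum run ends much closer to the minimum at 0 after the same number of steps, which is in line with the observation that momentum helped reach the reported accuracy. It says nothing about how many epochs the real model needs.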

ArdalanM commented 5 years ago

Thanks @andreabduque.

I added momentum and the choice between the adam and sgd solvers.

I managed to get 90% accuracy with adam.