awslabs / handwritten-text-recognition-for-apache-mxnet

This repository lets you train neural network models for end-to-end full-page handwriting recognition using the Apache MXNet deep learning framework on the IAM Dataset.
Apache License 2.0

Downloading gbw dataset crashes machine #39

Closed naveen-marthala closed 4 years ago

naveen-marthala commented 4 years ago

The code below, from the 'Denoising text output' section of the '0_handwriting_ocr.ipynb' notebook, crashes the system for an unknown reason when run on a machine with 35.35 GB of RAM and 107.77 GB of disk space (a Google Colab TPU session).

ctx_nlp = mx.gpu(3)
language_model, vocab = nlp.model.big_rnn_lm_2048_512(dataset_name='gbw', pretrained=True, ctx=ctx_nlp)
moses_tokenizer = nlp.data.SacreMosesTokenizer()
moses_detokenizer = nlp.data.SacreMosesDetokenizer()

How do I download this dataset without crashing the machine? Also, I don't want to download it again next time, so can I save the dataset somewhere?
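
For reference, a minimal sketch of the same cell with an explicit CPU fallback, assuming a runtime that reports zero GPUs (as a Colab TPU session does); note that the pretrained gbw model is large, so loading it still needs several GB of free host RAM:

import mxnet as mx
import gluonnlp as nlp

# Use the last GPU when one exists; otherwise stay on the CPU instead of
# addressing mx.gpu(3), which does not exist on a GPU-less Colab runtime.
ctx_nlp = mx.gpu(mx.context.num_gpus() - 1) if mx.context.num_gpus() > 0 else mx.cpu()
language_model, vocab = nlp.model.big_rnn_lm_2048_512(dataset_name='gbw', pretrained=True, ctx=ctx_nlp)
moses_tokenizer = nlp.data.SacreMosesTokenizer()
moses_detokenizer = nlp.data.SacreMosesDetokenizer()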

jonomon commented 4 years ago

What version of gluonnlp and mxnet are you using?

I managed to run the code with gluonnlp 0.91 and MXNet 1.6.0.

Also, GluonNLP automatically saves the model, so you won't have to download it again next time.
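
For example, a quick sketch to check both points: print the installed versions and look at the cache directory where GluonNLP stores downloaded models (by default ~/.mxnet/models, overridable via the MXNET_HOME environment variable):

import os
import mxnet as mx
import gluonnlp as nlp

print(mx.__version__)   # expected 1.6.0
print(nlp.__version__)  # expected 0.9.x

# Downloaded parameter and vocab files are cached here, so a second call to
# big_rnn_lm_2048_512 reuses them instead of downloading again.
cache_dir = os.path.expanduser('~/.mxnet/models')
print(os.listdir(cache_dir) if os.path.isdir(cache_dir) else 'cache not created yet')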

naveen-marthala commented 4 years ago

I have installed the packages with the exact versions you mentioned (I assume you missed a dot, since I got an error that there is no 0.91 version of gluonnlp, so I installed 0.9.1):

!pip install mxnet==1.6.0 mxboard leven gluonnlp==0.9.1 tqdm sacremoses

Here is my session crashing again (screenshot), and here are my machine's specifications (screenshot). You may be using a more powerful system, @jonomon. How do I fix this?

naveen-marthala commented 4 years ago

Or, @jonomon, is there any other way I can download the dataset manually and include it in the code?

jonomon commented 4 years ago

@ThomasDelteil

naveen-marthala commented 4 years ago

I downloaded the files from the URLs that are shown when those lines run, and did this:

ctx_nlp = mx.gpu(3)
# language_model, vocab = nlp.model.big_rnn_lm_2048_512(dataset_name='gbw', pretrained=True, ctx=ctx_nlp)
language_model = '/content/drive/.../big_rnn_lm_2048_512_gbw-6bb3e991.zip'
vocab = '/content/drive/.../gbw-ebb1a287.zip'
moses_tokenizer = nlp.data.SacreMosesTokenizer()
moses_detokenizer = nlp.data.SacreMosesDetokenizer()

I don't have any problems with these lines now.

naveen-marthala commented 4 years ago

I now have a new problem with the generator that uses language_model and vocab from the previous lines. This is the error I now get (screenshot). Is this error caused by the file? I couldn't figure out what actually caused it.
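
One likely cause, assuming the error comes from the denoising step that calls the language model: language_model and vocab above are plain path strings, not a GluonNLP model and Vocab object, so anything that uses them downstream will fail. A sketch of loading the pre-downloaded archives instead, by pointing the model zoo's root argument at the folder that holds them (the folder path is hypothetical, and GluonNLP may expect the extracted .params/.vocab files rather than the .zip archives):

import mxnet as mx
import gluonnlp as nlp

ctx_nlp = mx.cpu()  # or an existing GPU context

# Folder containing big_rnn_lm_2048_512_gbw-6bb3e991.* and gbw-ebb1a287.*;
# when the file names and hashes match, GluonNLP reuses them instead of downloading.
model_root = '/content/drive/MyDrive/mxnet_models'  # hypothetical path

language_model, vocab = nlp.model.big_rnn_lm_2048_512(dataset_name='gbw', pretrained=True, ctx=ctx_nlp, root=model_root)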

naveen-marthala commented 4 years ago

@jonomon and @ThomasDelteil, could you please look at my last reply and help me fix it?

jonomon commented 4 years ago

What tokenizer are you using?

Could you provide a Minimal, Reproducible Example?

naveen-marthala commented 4 years ago

I have made no changes in "0_handwriting_ocr.ipynb" except for the lines below, since my machine was crashing whenever I ran them.

ctx_nlp = mx.gpu(3)
# language_model, vocab = nlp.model.big_rnn_lm_2048_512(dataset_name='gbw', pretrained=True, ctx=ctx_nlp)
# pointing to the downloaded files
language_model = '/content/drive/.../big_rnn_lm_2048_512_gbw-6bb3e991.zip'
vocab = '/content/drive/.../gbw-ebb1a287.zip'
moses_tokenizer = nlp.data.SacreMosesTokenizer()
moses_detokenizer = nlp.data.SacreMosesDetokenizer()

The tokenizer I am using is the one that was in the code.
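
For a minimal check that the tokenizer itself works, independent of the language model, a small sketch (the sample sentence is arbitrary, and it assumes the SacreMoses wrappers from gluonnlp 0.9.x):

import gluonnlp as nlp

moses_tokenizer = nlp.data.SacreMosesTokenizer()
moses_detokenizer = nlp.data.SacreMosesDetokenizer()

tokens = moses_tokenizer('This is a sample sentence.')
print(tokens)                                       # list of word and punctuation tokens
print(moses_detokenizer(tokens, return_str=True))   # reassembled sentence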

jbuehler1337 commented 4 years ago

Hey Naveen,

  1. Did you check the number of GPUs? My system only has 1 GPU, so I had to use ctx_nlp = mx.gpu(mx.context.num_gpus()-1).
  2. When you put mx.context.num_gpus() into a Jupyter cell, what is the output? I had problems with the MXNet CUDA version, which led to the system trying to use the CPU instead of the GPU (see the sketch below).
  3. Did you shut down all other notebook kernels before starting this one?
  4. Is the swap of your Linux system big enough in case the RAM gets full? My system needed 38 GB of total RAM to compute it, so I enlarged the swap to 32 GB, giving a theoretical total of 64 GB.

BR Jan
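
To answer questions 2 and 4 without guessing, a small sketch that prints the GPU count and the RAM and swap actually available (it assumes the psutil package, which Colab ships by default):

import mxnet as mx
import psutil

print(mx.context.num_gpus())                # 0 on a Colab TPU runtime
print(psutil.virtual_memory().total / 1e9)  # installed RAM in GB
print(psutil.swap_memory().total / 1e9)     # configured swap in GB
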
naveen-marthala commented 4 years ago

Hey Naveen,

  1. Did you check the number of GPUs? My system only has 1 GPU, so I had to use ctx_nlp = mx.gpu(mx.context.num_gpus()-1).

I am using Google Colab with a TPU.

  2. When you put mx.context.num_gpus() into a Jupyter cell, what is the output? I had problems with the MXNet CUDA version, which led to the system trying to use the CPU instead of the GPU.

The output is 0, but I do have a TPU.

  3. Did you shut down all other notebook kernels before starting this one?

Yes, this is the only notebook I have open.

  4. Is the swap of your Linux system big enough in case the RAM gets full? My system needed 38 GB of total RAM to compute it, so I enlarged the swap to 32 GB, giving a theoretical total of 64 GB.

I only have 35.35 GB of RAM. I couldn't understand the rest of your question.

BR Jan

ghost commented 1 year ago

I downloaded the files from the URLs that are shown when those lines run, and did this:

ctx_nlp = mx.gpu(3)
# language_model, vocab = nlp.model.big_rnn_lm_2048_512(dataset_name='gbw', pretrained=True, ctx=ctx_nlp)
language_model = '/content/drive/.../big_rnn_lm_2048_512_gbw-6bb3e991.zip'
vocab = '/content/drive/.../gbw-ebb1a287.zip'
moses_tokenizer = nlp.data.SacreMosesTokenizer()
moses_detokenizer = nlp.data.SacreMosesDetokenizer()

I don't have any problems with these lines now.

ghost commented 1 year ago

Bro, if you have downloaded the dataset, can you please send a link to it?