mdangschat / ctc-asr

End-to-end trained speech recognition system, based on RNNs and the connectionist temporal classification (CTC) cost function.
MIT License

Configuration for low memory GPU #14

Closed wahyubram82 closed 4 years ago

wahyubram82 commented 4 years ago

I use a laptop with 2 GB of GPU memory (Nvidia MX150).

I am trying to build a new language model, so I have tried source code from DeepSpeech, PyTorch, etc.

To make my laptop capable of handling the process, I configured the other source code with a small batch size and a small number of hidden units (n_hidden). With your code I have already tried reducing the batch size to 1 and num_units_rnn to 1024, but it still runs out of GPU memory...

Do you have any recommendations for the settings?

Command that I use: python3 asr/train.py -- --used_model ds2 --rnn_cell rnn_relu --feature_type mfcc --batch_size 1 --max_epochs 15 --cudnn True --allow_vram_growth True --num_units_rnn 1024 --delete tensorboard --learning_rate 0.00001

mdangschat commented 4 years ago

TL;DR: This project requires tons of memory.

Hi, sadly this project is very demanding on RAM and VRAM. The smallest network I trained that had a WER below ~30% required about 6 GB of VRAM (I can't remember how much RAM was used). For a ~12% WER I had to use a V100 with 16 GB of VRAM.

You could set --allow_vram_growth False. In that case TensorFlow should raise an error right away if it cannot acquire enough VRAM for the configured network.
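For illustration, here is roughly what that option corresponds to in TensorFlow 1.x. This is a minimal sketch, assuming the project forwards --allow_vram_growth to TensorFlow's allow_growth setting; it is not this project's actual code:

import tensorflow as tf

config = tf.ConfigProto()
# allow_growth=True: allocate VRAM incrementally as the graph needs it.
# allow_growth=False: TensorFlow tries to reserve (nearly) all VRAM up
# front, so it fails fast if the device does not have enough.
config.gpu_options.allow_growth = False

with tf.Session(config=config) as sess:
    pass  # build and run the training graph here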

Also:

Configuration

The network architecture and training parameters can be configured by adding the appropriate flags or by directly editing the asr/params.py configuration file. The default configuration requires quite a lot of VRAM; consider reducing the number of units per layer (num_units_dense, num_units_rnn) and the number of RNN layers (num_layers_rnn).
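For example, a lower-memory run might combine those flags like this (a hedged sketch; the values are illustrative guesses to tune, and I have not verified that any configuration fits in 2 GB):

python3 asr/train.py -- --used_model ds2 --batch_size 1 --num_units_rnn 512 --num_units_dense 512 --num_layers_rnn 1 --allow_vram_growth False

Lowering num_units_rnn and num_layers_rnn shrinks the largest weight matrices and activations, which is where most of the VRAM goes; expect a corresponding hit to WER.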

Hope this helps. However, I would look for something else if you can only train on your laptop; training times could exceed a week.