mdangschat / ctc-asr

End-to-end trained speech recognition system, based on RNNs and the connectionist temporal classification (CTC) cost function.
MIT License

Configuration for low memory GPU #14

Closed wahyubram82 closed 4 years ago

wahyubram82 commented 4 years ago

I use a laptop with 2 GB of GPU memory (Nvidia MX150).

I am trying to build a new language model, so I have tried source code from DeepSpeech, PyTorch, etc.

To make my laptop capable of handling the process, I configured the other source code with a small batch size and a small number of hidden units (n_hidden). With your code I have already tried reducing the batch size to 1 and num_units_rnn to 1024, but it still runs out of GPU memory...

Do you have any recommendations for the settings?

Command that I use: python3 asr/train.py -- --used_model ds2 --rnn_cell rnn_relu --feature_type mfcc --batch_size 1 --max_epochs 15 --cudnn True --allow_vram_growth True --num_units_rnn 1024 --delete tensorboard --learning_rate 0.00001

mdangschat commented 4 years ago

TL;DR: This project requires tons of memory.

Hi, sadly this project is very demanding on RAM and VRAM. The smallest network I trained that had a WER below ~30% required about 6 GB of VRAM (I can't remember how much RAM was used). For a ~12% WER I had to use a V100 with 16 GB of VRAM.

You could set --allow_vram_growth False. In that case TensorFlow should raise an error right away if it cannot acquire enough VRAM for the configured network.
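For illustration, here is roughly what that option corresponds to in TensorFlow 1.x. This is a minimal sketch, assuming the project forwards --allow_vram_growth to TensorFlow's allow_growth setting; it is not this project's actual code:

import tensorflow as tf

config = tf.ConfigProto()
# allow_growth=True: allocate VRAM incrementally as the graph needs it.
# allow_growth=False: TensorFlow tries to reserve (nearly) all VRAM up
# front, so it fails fast if the device does not have enough.
config.gpu_options.allow_growth = False

with tf.Session(config=config) as sess:
    pass  # build and run the training graph here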

Also:

Configuration

The network architecture and training parameters can be configured by adding the appropriate flags or by directly editing the asr/params.py configuration file. The default configuration requires quite a lot of VRAM; consider reducing the number of units per layer (num_units_dense, num_units_rnn) and the number of RNN layers (num_layers_rnn).
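For example, a lower-memory run might combine those flags like this (a hedged sketch; the values are illustrative guesses to tune, and I have not verified that any configuration fits in 2 GB):

python3 asr/train.py -- --used_model ds2 --batch_size 1 --num_units_rnn 512 --num_units_dense 512 --num_layers_rnn 1 --allow_vram_growth False

Lowering num_units_rnn and num_layers_rnn shrinks the largest weight matrices and activations, which is where most of the VRAM goes; expect a corresponding hit to WER.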

Hope this helps. However, I would look for something else if you can only train on your laptop; training times could exceed a week.