Rayhane-mamah / Tacotron-2

DeepMind's Tacotron-2 Tensorflow implementation
MIT License

What params should I tweak in order to prevent the model from crashing during training? #485

Open · AliBharwani opened this issue 4 years ago

AliBharwani commented 4 years ago

I'm trying to train this model on an EC2 c5.xlarge instance (4 vCPUs, 8 GB of RAM). After setting up the data and running preprocessing, I start training the Tacotron-2 model. It gets as far as printing "Generated 20 test batches of size 32 in 23.134 sec" and then hangs. At that point any attempt to SSH in from another terminal also freezes, and eventually the training terminal prints "Killed". I'm guessing it's because of the limited resources on the machine. Is there any way to get around this?
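
A plain "Killed" message on Linux usually means the kernel's OOM killer terminated the process, so confirming memory pressure before changing anything is worthwhile. The sketch below is one way to watch the trainer's memory use from the same machine; it relies on the third-party `psutil` package and matches the process by script name, both of which are assumptions rather than anything provided by this repo.

```python
import time

import psutil  # third-party: pip install psutil


def watch_memory(name_substring="train.py", interval=5.0):
    """Periodically print the RSS of any process whose command line contains
    `name_substring`, along with the machine's remaining available RAM."""
    while True:
        avail_gb = psutil.virtual_memory().available / 1e9
        for proc in psutil.process_iter(["pid", "cmdline", "memory_info"]):
            cmdline = " ".join(proc.info["cmdline"] or [])
            mem = proc.info["memory_info"]
            if name_substring in cmdline and mem is not None:
                print(f"pid={proc.info['pid']}  rss={mem.rss / 1e9:.2f} GB  "
                      f"available={avail_gb:.2f} GB")
        time.sleep(interval)


if __name__ == "__main__":
    watch_memory()
```

If available memory falls toward zero right before the process dies, the suggestions below (a larger instance, or a smaller batch size) are the right direction.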

mib32 commented 4 years ago

@AliBharwani Use a GPU instance with more RAM, for example a g2.2xlarge.

MODAK27 commented 4 years ago

Reduce the batch size for Tacotron training in hparams.py from 32 to 16.
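
A minimal sketch of how that change takes effect, assuming the hyperparameter is named `tacotron_batch_size` as in this repo's hparams.py; since the hyperparameters are built with `tf.contrib.training.HParams`, the same `name=value` string can also be passed as an override instead of editing the file, if your copy of train.py exposes an `--hparams` flag.

```python
import tensorflow as tf  # this repo targets TF 1.x, where tf.contrib is available

# Illustrative only: the field name tacotron_batch_size mirrors this repo's
# hparams.py; double-check the exact name in your checkout.
hparams = tf.contrib.training.HParams(tacotron_batch_size=32)

# The same "name=value" format is what a command-line override string would use.
hparams.parse("tacotron_batch_size=16")
print(hparams.tacotron_batch_size)  # -> 16
```

Halving the batch size should roughly halve the per-step activation memory, which is often enough to keep an 8 GB machine from triggering the OOM killer, at the cost of slower and somewhat noisier training.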