Rayhane-mamah / Tacotron-2

DeepMind's Tacotron-2 Tensorflow implementation
MIT License

What params should I tweak in order to prevent the model from crashing during training? #485

Open · AliBharwani opened this issue 4 years ago

AliBharwani commented 4 years ago

I'm trying to train this model on an EC2 c5.xlarge instance (4 vCPUs, 8 GB of RAM). After setting up the data and running preprocessing, I start training the Tacotron-2 model. It gets as far as printing "Generated 20 test batches of size 32 in 23.134 sec" and then hangs. At that point any attempt to SSH in from another terminal also freezes, and eventually the training terminal prints "Killed". I'm guessing it's because of the limited resources on the machine. Is there any way to get around this?
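
A plain "Killed" message on Linux usually means the kernel's OOM killer terminated the process, so confirming memory pressure before changing anything is worthwhile. The sketch below is one way to watch the trainer's memory use from the same machine; it relies on the third-party `psutil` package and matches the process by script name, both of which are assumptions rather than anything provided by this repo.

```python
import time

import psutil  # third-party: pip install psutil


def watch_memory(name_substring="train.py", interval=5.0):
    """Periodically print the RSS of any process whose command line contains
    `name_substring`, along with the machine's remaining available RAM."""
    while True:
        avail_gb = psutil.virtual_memory().available / 1e9
        for proc in psutil.process_iter(["pid", "cmdline", "memory_info"]):
            cmdline = " ".join(proc.info["cmdline"] or [])
            mem = proc.info["memory_info"]
            if name_substring in cmdline and mem is not None:
                print(f"pid={proc.info['pid']}  rss={mem.rss / 1e9:.2f} GB  "
                      f"available={avail_gb:.2f} GB")
        time.sleep(interval)


if __name__ == "__main__":
    watch_memory()
```

If available memory falls toward zero right before the process dies, the suggestions below (a larger instance, or a smaller batch size) are the right direction.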

mib32 commented 4 years ago

@AliBharwani Use a GPU instance with more RAM, for example a g2.2xlarge.

MODAK27 commented 4 years ago

Reduce the batch size for Tacotron training in hparams.py from 32 to 16.
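
A minimal sketch of how that change takes effect, assuming the hyperparameter is named `tacotron_batch_size` as in this repo's hparams.py; since the hyperparameters are built with `tf.contrib.training.HParams`, the same `name=value` string can also be passed as an override instead of editing the file, if your copy of train.py exposes an `--hparams` flag.

```python
import tensorflow as tf  # this repo targets TF 1.x, where tf.contrib is available

# Illustrative only: the field name tacotron_batch_size mirrors this repo's
# hparams.py; double-check the exact name in your checkout.
hparams = tf.contrib.training.HParams(tacotron_batch_size=32)

# The same "name=value" format is what a command-line override string would use.
hparams.parse("tacotron_batch_size=16")
print(hparams.tacotron_batch_size)  # -> 16
```

Halving the batch size should roughly halve the per-step activation memory, which is often enough to keep an 8 GB machine from triggering the OOM killer, at the cost of slower and somewhat noisier training.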