I'm trying to train this model on an EC2 instance running a c5.xlarge (4 vCPUs, 8 GB of RAM). After setting up the data and preprocessing, I try to train the Tacotron-2 model. It gets as far as printing "Generated 20 test batches of size 32 in 23.134 sec" and then hangs. At this point I've tried SSHing in from a different terminal, but that always freezes, and eventually the training terminal prints "Killed". I'm guessing it's probably because of the limited resources on the machine. Is there any way to get around this?
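For context, "Killed" with a frozen machine usually means the kernel's OOM killer terminated the process when RAM ran out (`sudo dmesg | grep -i oom` after the fact should confirm it). Here's a minimal sketch I've been using to check how much memory is actually free right before kicking off training, so I can tell whether I'm anywhere near the 8 GB ceiling. It just parses the kernel's `/proc/meminfo` interface, so it assumes a Linux host; the function name is my own, not from the Tacotron-2 repo:

```python
def mem_available_mb(path="/proc/meminfo"):
    """Return MemAvailable from the Linux kernel's meminfo, in MB."""
    with open(path) as f:
        for line in f:
            # MemAvailable is reported in kB, e.g. "MemAvailable: 6553600 kB"
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) // 1024
    raise RuntimeError("MemAvailable not found in " + path)

if __name__ == "__main__":
    print(f"Available memory: {mem_available_mb()} MB")
```

With only ~8 GB total, test batch generation alone (20 batches × 32 utterances of spectrogram data) could plausibly exhaust that, so I'm wondering if shrinking the batch size or adding swap is the usual workaround.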