#60999 · Closed 4 years ago
You can try reducing `hparams.mcn`, although it shouldn't be a problem if you have a 32 GB GPU; I am able to run on a Colab K80 GPU, too, with batch size 1. Changing `batch_size`, `mcn`, etc. won't work, because you are not able to load the initial model (which is independent of batch size) onto your GPU at all. Please let me know your `n_speakers`, as a distribution is made for each speaker, which could have led to the problem.
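To see why the commenter says loading the model is independent of batch size: the weights themselves are tiny compared to 32 GB. A back-of-the-envelope sketch (the ~28 million parameter count is an assumption based on commonly cited figures for NVIDIA's Tacotron 2, not taken from this thread):

```python
# Rough estimate of the memory needed just to hold Tacotron 2's weights on the
# GPU. ~28M parameters is an assumed ballpark figure; FP16 Run is False in the
# log below, so each parameter is stored as FP32 (4 bytes).
N_PARAMS = 28_000_000   # assumed approximate parameter count
BYTES_PER_PARAM = 4     # FP32

weight_mb = N_PARAMS * BYTES_PER_PARAM / 1024**2
print(f"~{weight_mb:.0f} MB for weights alone")  # on the order of 100 MB
```

If roughly 100 MB cannot be allocated on a 32 GB card, the memory is almost certainly held by another process rather than by this model.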
Sorry to bother you. On a V100-SXM2 32 GB GPU, it always fails like this:

```
python train.py --output_directory=outdir/ --log_directory=logdir/ -c tacotron2_statedict.pt --warm_start
WARNING:tensorflow: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:
FP16 Run: False
Dynamic Loss Scaling: True
Distributed Run: False
cuDNN Enabled: True
cuDNN Benchmark: False
Traceback (most recent call last):
  File "train.py", line 292, in <module>
    args.warm_start, args.n_gpus, args.rank, args.group_name, hparams)
  File "train.py", line 169, in train
    model = load_model(hparams)
  File "train.py", line 74, in load_model
    model = Tacotron2(hparams).cuda()
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 265, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 193, in _apply
    module._apply(fn)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 199, in _apply
    param.data = fn(param.data)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/torch/nn/modules/module.py", line 265, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: CUDA error: out of memory
```
I modified `batch_size=1` in hparams.py, but the error remains. Is this right? How can I fix it, please?
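Since the OOM happens while merely moving the model to the GPU, it is worth confirming that the device actually has free memory when `train.py` starts (another process, or a stale one, may already hold it). A minimal diagnostic sketch, assuming PyTorch 1.10+ for `torch.cuda.mem_get_info`:

```python
# Quick diagnostic: report free vs. total GPU memory before loading the model.
# A sketch only; it degrades gracefully if PyTorch or a CUDA device is absent.
def gpu_memory_report():
    try:
        import torch
    except ImportError:
        return "PyTorch not installed"
    if not torch.cuda.is_available():
        return "No CUDA device visible"
    free, total = torch.cuda.mem_get_info()  # bytes (PyTorch >= 1.10)
    return f"free: {free / 1024**2:.0f} MB / total: {total / 1024**2:.0f} MB"

print(gpu_memory_report())
```

Running `nvidia-smi` before training gives the same information, including which processes currently hold GPU memory.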