NVIDIA / tacotron2

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

CUDA out of memory. #473

Open · zipperblues opened this issue 3 years ago

zipperblues commented 3 years ago

I have Colab Pro, I don't have any other Tacotron or Colab notebooks in my Drive, and I'm only uploading 30 wavs. I have lowered the batch size from 30 to 10 and the epoch count from 500 to 50, and the audio files are mono at 22050 Hz. What is going on? What am I doing wrong? I cannot get it to train no matter how many times I try.

Here is my error log.

```
FP16 Run: False
Dynamic Loss Scaling: True
Distributed Run: False
cuDNN Enabled: True
cuDNN Benchmark: False

RuntimeError                              Traceback (most recent call last)
<ipython-input> in <module>()
      5 print('cuDNN Benchmark:', hparams.cudnn_benchmark)
      6 train(output_directory, log_directory, checkpoint_path,
----> 7       warm_start, n_gpus, rank, group_name, hparams, log_directory2)

9 frames
<ipython-input> in train(output_directory, log_directory, checkpoint_path, warm_start, n_gpus, rank, group_name, hparams, log_directory2)
    238     torch.cuda.manual_seed(hparams.seed)
    239 
--> 240     model = load_model(hparams)
    241     learning_rate = hparams.learning_rate
    242     optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate,

<ipython-input> in load_model(hparams)
    120 
    121 def load_model(hparams):
--> 122     model = Tacotron2(hparams).cuda()
    123     if hparams.fp16_run:
    124         model.decoder.attention_layer.score_mask_value = finfo('float16').min

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in cuda(self, device)
    489             Module: self
    490         """
--> 491         return self._apply(lambda t: t.cuda(device))
    492 
    493     def xpu(self: T, device: Optional[Union[int, device]] = None) -> T:

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    385     def _apply(self, fn):
    386         for module in self.children():
--> 387             module._apply(fn)
    388 
    389     def compute_should_use_set_data(tensor, tensor_applied):

[the _apply frame above repeats four more times as the call recurses into child modules]

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    407             # `with torch.no_grad():`
    408             with torch.no_grad():
--> 409                 param_applied = fn(param)
    410                 should_use_set_data = compute_should_use_set_data(param, param_applied)
    411                 if should_use_set_data:

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in <lambda>(t)
    489             Module: self
    490         """
--> 491         return self._apply(lambda t: t.cuda(device))
    492 
    493     def xpu(self: T, device: Optional[Union[int, device]] = None) -> T:

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 15.90 GiB total capacity; 14.89 GiB already allocated; 39.75 MiB free; 14.98 GiB reserved in total by PyTorch)
```

The GPU is "GPU 0: Tesla P100-PCIE-16GB (UUID: GPU-ff8db16d-fbed-114c-94f5-91b30a9cf305)", but I have tried others as well and still have the same problem.
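
One thing I notice in the log: 14.89 GiB is already allocated before the model ever reaches the GPU, since the failure happens at `Tacotron2(hparams).cuda()`. Here is a minimal sketch (standard `torch.cuda` calls only) that can be run in a fresh cell to see what is holding memory and to drop PyTorch's cached blocks:

```python
import torch

# Check how much of GPU 0 is in use before the model is loaded
# (all of these are standard torch.cuda calls).
total = torch.cuda.get_device_properties(0).total_memory
allocated = torch.cuda.memory_allocated(0)  # memory held by live tensors
reserved = torch.cuda.memory_reserved(0)    # memory held by the caching allocator

print(f"total:     {total / 1024**3:.2f} GiB")
print(f"allocated: {allocated / 1024**3:.2f} GiB")
print(f"reserved:  {reserved / 1024**3:.2f} GiB")

# Release cached blocks back to the driver. This does NOT free memory
# that live tensors still reference; if 'allocated' stays high, the only
# sure fix in Colab is to restart the runtime.
torch.cuda.empty_cache()
```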
abaddon-moriarty commented 3 years ago

When that happens it's usually because the batch size is too large for the GPU memory available, so try lowering the batch size in train.py until it fits. It might need to be very low depending on how much memory your GPU has. At least that's what worked for me, hope it helps.
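
For example, if you are on the stock NVIDIA/tacotron2 code, `batch_size` is defined in hparams.py and train.py accepts comma-separated name=value overrides, so you shouldn't even need to edit the file. A minimal sketch (names per the stock repo; adjust if your fork differs):

```python
# Sketch, assuming the stock NVIDIA/tacotron2 hparams.py, where
# create_hparams() defines batch_size and accepts "name=value" overrides.
from hparams import create_hparams

hparams = create_hparams("batch_size=8")  # try 8, then 4, if OOM persists
print(hparams.batch_size)

# Equivalent from the command line (via train.py's --hparams flag):
#   python train.py --output_directory=outdir --log_directory=logdir \
#       --hparams=batch_size=8
```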