pytorch / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
https://pytorch.org

Tacotron PyTorch model to ONNX not exporting? #19951

Closed MuruganR96 closed 5 years ago

MuruganR96 commented 5 years ago

RuntimeError: Expected object of backend CPU but got backend CUDA for argument #3 'index'

RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

RuntimeError: $ Torch: not enough memory: you tried to allocate 2442GB. Buy new RAM! at /pytorch/aten/src/TH/THGeneral.cpp:201

I am trying to export a pretrained PyTorch Tacotron model to ONNX. The export throws these three kinds of errors.

To Reproduce


Traceback (most recent call last):
  File "export_model.py", line 48, in <module>
    torch.onnx.export(model, y, "tts.onnx")
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/onnx/__init__.py", line 27, in export
    return utils.export(*args, **kwargs)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/onnx/utils.py", line 104, in export
    operator_export_type=operator_export_type)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/onnx/utils.py", line 281, in _export
    example_outputs, propagate)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/onnx/utils.py", line 224, in _model_to_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args, training)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/onnx/utils.py", line 192, in _trace_and_get_graph_from_model
    trace, torch_out = torch.jit.get_trace_graph(model, args, _force_outplace=True)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/jit/__init__.py", line 197, in get_trace_graph
    return LegacyTracedModule(f, _force_outplace)(*args, **kwargs)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/jit/__init__.py", line 252, in forward
    out = self.inner(*trace_inputs)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 487, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/admin1/speech_recognition/text2speech/new_one/TTS/models/tacotron.py", line 35, in forward
    inputs = self.embedding(characters)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 487, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 118, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/functional.py", line 1454, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected object of backend CPU but got backend CUDA for argument #3 'index'
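This first error means the embedding's weight tensor and the index tensor passed to it live on different devices. A minimal sketch (generic `nn.Embedding`, not the actual Tacotron code) reproducing the constraint and the fix, which is to move the model and the input to the same device before tracing:

```python
import torch
import torch.nn as nn

# The embedding weights and the index tensor must be on the same device.
embedding = nn.Embedding(10, 4)        # weights on CPU
indices = torch.tensor([[1, 2, 3]])    # CPU indices -> lookup works
out = embedding(indices)
assert out.shape == (1, 3, 4)

# Move *both* to the same device (CPU here if no GPU is available);
# moving only one of them triggers the backend-mismatch RuntimeError.
device = "cuda" if torch.cuda.is_available() else "cpu"
embedding = embedding.to(device)
indices = indices.to(device)
out = embedding(indices)
```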

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 260, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 187, in _apply
    module._apply(fn)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 187, in _apply
    module._apply(fn)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 187, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 117, in _apply
    self.flatten_parameters()
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 113, in flatten_parameters
    self.batch_first, bool(self.bidirectional))
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

Traceback (most recent call last):
  File "export_model.py", line 48, in <module>
    torch.onnx.export(model, y, "tts.onnx")
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/onnx/__init__.py", line 27, in export
    return utils.export(*args, **kwargs)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/onnx/utils.py", line 104, in export
    operator_export_type=operator_export_type)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/onnx/utils.py", line 281, in _export
    example_outputs, propagate)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/onnx/utils.py", line 224, in _model_to_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args, training)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/onnx/utils.py", line 192, in _trace_and_get_graph_from_model
    trace, torch_out = torch.jit.get_trace_graph(model, args, _force_outplace=True)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/jit/__init__.py", line 197, in get_trace_graph
    return LegacyTracedModule(f, _force_outplace)(*args, **kwargs)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/jit/__init__.py", line 252, in forward
    out = self.inner(*trace_inputs)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 487, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/admin1/speech_recognition/text2speech/new_one/TTS/models/tacotron.py", line 35, in forward
    inputs = self.embedding(characters)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 487, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 118, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/home/admin1/speech_recognition/text2speech/new_one/env/lib/python3.6/site-packages/torch/nn/functional.py", line 1454, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: $ Torch: not enough memory: you tried to allocate 2442GB. Buy new RAM! at /pytorch/aten/src/TH/THGeneral.cpp:201
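The "2442GB" figure is consistent with `nn.Embedding` looking up one embedding vector per *element* of its index tensor, so a 5-D dummy input multiplies out enormously. A sketch of the arithmetic, assuming an embedding width of 256 (`CONFIG.embedding_size` in the script below):

```python
# Output of embedding(dummy_input) has shape (*dummy_input.shape, embedding_size).
numel = 61 * 256 * 1025 * 80 * 2           # elements in the 5-D dummy input
embedding_size = 256                       # assumed embedding width
bytes_needed = numel * embedding_size * 4  # float32 output tensor
print(bytes_needed / 1024 ** 3)            # → 2442.3828125 (GB)
```

This suggests the dummy input's shape, not the machine's RAM, is the problem.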

Expected behavior

import os
import torch
import torch.onnx
from torch.autograd import Variable

from TTS.models.tacotron import Tacotron
from TTS.layers import *
from TTS.utils.data import *
from TTS.utils.audio import AudioProcessor
from TTS.utils.generic_utils import load_config
from TTS.utils.text import text_to_sequence
from TTS.utils.synthesis import synthesis
from TTS.utils.visual import visualize
from utils.text.symbols import symbols, phonemes

os.environ["CUDA_VISIBLE_DEVICES"] = "0, 1"

MODEL_PATH = 'best_model.pth.tar'
CONFIG_PATH = '/config.json'
OUT_FOLDER = '/test'
CONFIG = load_config(CONFIG_PATH)
use_cuda = True

CONFIG.audio["preemphasis"] = 0.97
ap = AudioProcessor(**CONFIG.audio)

cp = torch.load(MODEL_PATH)
print(cp['step'])

num_chars = len(phonemes) if CONFIG.use_phonemes else len(symbols)
model = Tacotron(num_chars, CONFIG.embedding_size, CONFIG.audio['num_freq'],
                 CONFIG.audio['num_mels'], CONFIG.r, attn_windowing=False)
model.load_state_dict(cp['model'])
model.eval()

x = Variable(torch.randn(61, 256, 1025, 80, 2))
y = x.long()
y = y.cuda()  # dummy input
torch.onnx.export(model, y, "tts.onnx")

=========================================================

I followed the PyTorch instructions for exporting a PyTorch model to ONNX.

How can I resolve this issue? Is it:

1) a dummy-input shape problem, or
2) the wrong way to export a model from PyTorch to ONNX, or
3) a memory-allocation issue?

If it is a memory-allocation issue, my system configuration is: 4 GPUs with 24 GB each, 190 GB of RAM, and 80 CPU cores.

Please help me resolve this. How can I export an ONNX model using PyTorch?

Environment

Collecting environment information...
PyTorch version: 1.0.1.post2
Is debug build: No
CUDA used to build PyTorch: 10.0.130

OS: Ubuntu 18.04.2 LTS
GCC version: (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
CMake version: version 3.12.2

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: 
GPU 0: TITAN RTX
GPU 1: TITAN RTX
GPU 2: TITAN RTX
GPU 3: TITAN RTX

Nvidia driver version: 415.27
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.4.2

Versions of relevant libraries:
[pip3] numpy==1.14.3
[pip3] torch==1.0.1.post2
[pip3] torchvision==0.2.2.post3
[pip3] torchviz==0.0.1
[conda] Could not collect
gchanan commented 5 years ago

This doesn't appear to be a bug; please use https://discuss.pytorch.org for help. Please feel free to reopen this issue if this does not apply.

MuruganR96 commented 5 years ago

Thank you sir.