NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
Apache License 2.0

ONNX Export NoneType error #224

Closed carlfm01 closed 4 years ago

carlfm01 commented 4 years ago

Hello again, I'm trying to export quartznet15x5 v2 to ONNX with master (f946aca).

With the following command:

!python export_jasper_to_onnx.py --config quartznet15x5.yaml  \
--nn_encoder JasperEncoder-STEP-247400.pt --nn_decoder JasperDecoderForCTC-STEP-247400.pt  \
--onnx_encoder encoder.onnx --onnx_decoder decoder.onnx

Failing with:

Loading config file...
Determining model shape...
  Num encoder input features: 64
  Num decoder input features: 1024
Initializing models...
Loading checkpoints...
Exporting encoder...
2019-12-15 06:57:12,846 - WARNING - Module is JasperEncoder. We are removinginput and output length ports since they are not needed for deployment
2019-12-15 06:57:12,847 - WARNING - Turned off 0 masked convolutions
2019-12-15 06:57:12,848 - ERROR - ERROR: module export failed for JasperEncoder with exception 'NoneType' object has no attribute 'to'
Exporting decoder...
graph(%encoder_output : Float(1, 1024, 128),
      %1 : Float(29),
      %2 : Float(29, 1024, 1)):
  %3 : Float(1, 29, 128) = onnx::Conv[dilations=[1], group=1, kernel_shape=[1], pads=[0, 0], strides=[1]](%encoder_output, %2, %1), scope: JasperDecoderForCTC/Sequential[decoder_layers]/Conv1d[0] # /usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py:202:0
  %4 : Float(1, 128, 29) = onnx::Transpose[perm=[0, 2, 1]](%3), scope: JasperDecoderForCTC # /usr/local/lib/python3.6/dist-packages/nemo_asr/jasper.py:207:0
  %output : Float(1, 128, 29) = onnx::LogSoftmax[axis=2](%4), scope: JasperDecoderForCTC # /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:1317:0
  return (%output)

/usr/local/lib/python3.6/dist-packages/torch/onnx/utils.py:772: UserWarning: No names were found for specified dynamic axes of provided input.Automatically generated names will be applied to each dynamic axes of input encoder_output
  'Automatically generated names will be applied to each dynamic axes of input {}'.format(key))
/usr/local/lib/python3.6/dist-packages/torch/onnx/utils.py:772: UserWarning: No names were found for specified dynamic axes of provided input.Automatically generated names will be applied to each dynamic axes of input output
  'Automatically generated names will be applied to each dynamic axes of input {}'.format(key))
Export completed successfully.

Is ONNX export compatible with the latest model using master?

okuchaiev commented 4 years ago

Where did you get the quartznet15x5 checkpoint from? The ones we have published are compatible with the latest stable version of NeMo (0.8.2), but 0.8.2 does not have the export feature (which is part of 0.9). We are going to publish 0.9 (very) soon together with updated checkpoints. I'll revisit this issue then.

carlfm01 commented 4 years ago

Where did you get the quartznet15x5 checkpoint from?

From: https://ngc.nvidia.com/catalog/models/nvidia:quartznet15x5/version (there's a tag for v2)

okuchaiev commented 4 years ago

We had a silly bug when trying to turn off masked convolutions. See https://github.com/NVIDIA/NeMo/pull/232
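For readers unfamiliar with this class of error: "'NoneType' object has no attribute 'to'" typically means that some step handed back None where a module was expected, and a later `.to(...)` call was made on that result. The sketch below reproduces the pattern in plain Python. It is an illustration only, not the code changed in the pull request; `Encoder`, `disable_masks_buggy`, `disable_masks_fixed`, and `try_export` are hypothetical names invented for this example.

```python
class Encoder:
    """Stand-in for a neural module; hypothetical, not a NeMo class."""

    def __init__(self):
        self.device = "cpu"
        self.use_mask = True

    def to(self, device):
        # Like torch.nn.Module.to, return the module so calls can be chained.
        self.device = device
        return self


def disable_masks_buggy(module):
    # Mutates the module in place but forgets to return it,
    # so the caller receives None.
    module.use_mask = False


def disable_masks_fixed(module):
    # Same mutation, but the module is returned to the caller.
    module.use_mask = False
    return module


def try_export(prepare):
    """Mimic an export step: prepare a module, then move it to a device."""
    try:
        module = prepare(Encoder())
        module.to("cpu")
        return "ok"
    except AttributeError as exc:
        return str(exc)


buggy_result = try_export(disable_masks_buggy)
fixed_result = try_export(disable_masks_fixed)
print(buggy_result)
print(fixed_result)
```

With the buggy helper the export step fails with the same kind of AttributeError as in the log above; with the fixed helper it completes.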

okuchaiev commented 4 years ago

Closing this issue since https://github.com/NVIDIA/NeMo/pull/232 was merged to master. Please re-open if the bug persists

ggrunin commented 4 years ago

@okuchaiev Same error with the latest v0.9 checkpoint: https://ngc.nvidia.com/catalog/models/nvidia:quartznet_15x5_ls_sp and v0.9 image nvcr.io/nvidia/nemo:v0.9

okuchaiev commented 4 years ago

@ggrunin Yes, 0.9 (which you get from pip) has this issue. You can apply the fix from https://github.com/NVIDIA/NeMo/pull/232 (or just take this GitHub commit)

ggrunin commented 4 years ago

@okuchaiev Thanks for the prompt reply. I tried using the latest code and am now getting: ERROR - ERROR: module export failed for JasperEncoder with exception number of output names provided (2) exceeded number of outputs (1)
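This message indicates that more output names were passed to the exporter than the module actually returns. The export log earlier in the thread says the input and output length ports are removed from JasperEncoder for deployment, so one plausible reading (an assumption, not confirmed in this thread) is that the encoder returns a single tensor while two names are still supplied. The sketch below shows a pre-export check in plain Python; `count_outputs`, `check_output_names`, and the port names are hypothetical and not part of NeMo or PyTorch.

```python
def count_outputs(result):
    """Return how many outputs a forward call produced."""
    if isinstance(result, (tuple, list)):
        return len(result)
    return 1


def check_output_names(result, output_names):
    """Raise early if more names are given than outputs exist."""
    n_out = count_outputs(result)
    if len(output_names) > n_out:
        raise ValueError(
            "number of output names provided ({}) exceeded "
            "number of outputs ({})".format(len(output_names), n_out)
        )
    return list(output_names)


# Hypothetical port names, chosen only for illustration.
names = ["outputs", "encoded_lengths"]

# A forward pass that returns both the encoding and its lengths passes.
two_ok = check_output_names(("encoded", "lengths"), names)

# A forward pass that returns only the encoding fails the check.
try:
    check_output_names("encoded", names)
    single_error = None
except ValueError as exc:
    single_error = str(exc)

print(two_ok)
print(single_error)
```

Running a check like this on a sample forward pass before exporting shows whether the name list or the module's return value needs to change.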

yingxingdechibang commented 4 years ago

@carlfm01 Hello, I used NeMo 0.9 and get exactly the same error as you: "module export failed for JasperEncoder with exception 'NoneType' object has no attribute 'to'". How did you solve the problem in the end? Thanks~

carlfm01 commented 4 years ago

Hello @yingxingdechibang, using master for the export should work.

Or try the exact commit: https://github.com/NVIDIA/NeMo/tree/f69487170a434329f9e7560bc093695ab9b2e2e6

okuchaiev commented 4 years ago

@ggrunin Do you still have this issue? Does it happen for you on the latest master?

ggrunin commented 4 years ago

@okuchaiev I rebuilt the image yesterday with the current master branch and am still getting: ERROR: module export failed for JasperEncoder with exception number of output names provided (2) exceeded number of outputs (1). I'm using the examples/asr/configs/quartznet15x5.yaml configuration and the latest quartznet15x5 checkpoints (1/08/2020) from NVIDIA cloud (https://ngc.nvidia.com/catalog/models/nvidia:quartznet_15x5_ls_sp).

okuchaiev commented 4 years ago

Closing, as this is related to an old version.