microsoft / ELL

Embedded Learning Library
https://microsoft.github.io/ELL

onnx_import.py : AttributeError: 'NoneType' object has no attribute 'output_shapes' #246

Closed hchanon closed 3 years ago

hchanon commented 4 years ago

Hi,

I tried to run the Keyword spotter training tutorial straight from https://github.com/microsoft/ELL/tree/master/docs/tutorials/Training-audio-keyword-spotter-with-pytorch.

I get the following error when I reach the ONNX import step, and GRU128KeywordSpotter.onnx is not created:

C:\test\ellmodeltraining>python C:\MLFrameworks\repos\MicrosoftELL\tools\importers\onnx\onnx_import.py GRU128KeywordSpotter.onnx
MainThread [2020-05-22 15:55:31,404] Pre-processing...
MainThread [2020-05-22 15:55:31,404] loading the ONNX model from: GRU128KeywordSpotter.onnx
MainThread [2020-05-22 15:55:31,406] Loaded ONNX model in 0.001 seconds.
MainThread [2020-05-22 15:55:31,406] ONNX IR_version 6
MainThread [2020-05-22 15:55:31,406] ONNX Graph producer: pytorch version 1.5
MainThread [2020-05-22 15:55:31,407] ONNX Graph total len: 1
MainThread [2020-05-22 15:55:31,407] Input input.1 Inputs [] [] Outputs: ['input.1'] [((1, 1, 80), 'channel_row_column')] Attributes: {}
MainThread [2020-05-22 15:55:31,407] Shape Shape_0 Inputs ['input.1'] [((1, 1, 80), 'channel_row_column')] Outputs: ['11'] [((1, 1, 80), 'channel_row_column')] Attributes: {}
MainThread [2020-05-22 15:55:31,407] Constant Constant_1 Inputs [] [] Outputs: ['12'] [((1,), 'channel')] Attributes: {'tensor': '...'}
MainThread [2020-05-22 15:55:31,407] Constant 11 Inputs [] [] Outputs: ['11'] [((3,), 'channel')] Attributes: {'tensor': '...'}
MainThread [2020-05-22 15:55:31,408] Gather Gather_2 Inputs ['11', '12'] [((1, 1, 80), 'channel_row_column'), ((1,), 'channel')] Outputs: ['13'] [((1, 80), 'row_column')] Attributes: {'axis': 0}
MainThread [2020-05-22 15:55:31,408] Unsqueeze Unsqueeze_3 Inputs ['13'] [((1, 80), 'row_column')] Outputs: ['17'] [((1, 1, 80), 'channel_row_column')] Attributes: {'axes': [0]}
Traceback (most recent call last):
  File "C:\MLFrameworks\repos\MicrosoftELL\tools\importers\onnx\onnx_import.py", line 95, in <module>
    main()
  File "C:\MLFrameworks\repos\MicrosoftELL\tools\importers\onnx\onnx_import.py", line 91, in main
    convert(args.input, args.output_directory, args.zip_ell_model, args.step_interval, args.lag_threshold)
  File "C:\MLFrameworks\repos\MicrosoftELL\tools\importers\onnx\onnx_import.py", line 42, in convert
    lag_threshold_msec=lag_threshold)
  File "C:\MLFrameworks\repos\MicrosoftELL\tools\importers\onnx\onnx_to_ell.py", line 32, in convert_onnx_to_ell
    importer_model = converter.load_model(path)
  File "C:\MLFrameworks\repos\MicrosoftELL\tools\importers\onnx\lib\onnx_converters.py", line 2004, in load_model
    return self.set_graph(graph)
  File "C:\MLFrameworks\repos\MicrosoftELL\tools\importers\onnx\lib\onnx_converters.py", line 2026, in set_graph
    node = self.get_converter(onnx_node).convert(onnx_node)
  File "C:\MLFrameworks\repos\MicrosoftELL\tools\importers\onnx\lib\onnx_converters.py", line 1321, in convert
    node = super().convert(node)
  File "C:\MLFrameworks\repos\MicrosoftELL\tools\importers\onnx\lib\onnx_converters.py", line 118, in convert
    node.output_shapes = self.get_output_shapes()
  File "C:\MLFrameworks\repos\MicrosoftELL\tools\importers\onnx\lib\onnx_converters.py", line 1353, in get_output_shapes
    input_shapes += [n.output_shapes[0]]
AttributeError: 'NoneType' object has no attribute 'output_shapes'

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Note that I tried the whole procedure under Windows and under Ubuntu 18 LTS, getting the same results.
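
For readers unfamiliar with this kind of failure: the last frame of the traceback reads `output_shapes` from an object that turned out to be `None`, which typically happens when a lookup for an input node finds nothing. The sketch below illustrates that pattern only. The `Node` class, the `converted` dictionary, and `get_input_shapes` are hypothetical stand-ins, not the actual code in onnx_converters.py, and the sketch does not claim to identify the real cause in ELL.

```python
# Hypothetical stand-ins; these are NOT the real ELL importer classes.
class Node:
    def __init__(self, name, output_shapes):
        self.name = name
        self.output_shapes = output_shapes


def get_input_shapes(input_names, converted):
    """Collect the first output shape of each input node, failing loudly on a missing one."""
    shapes = []
    for name in input_names:
        node = converted.get(name)  # dict.get returns None for an unknown name
        if node is None:
            # Without this check, node.output_shapes raises the AttributeError seen above.
            raise KeyError(f"input '{name}' has not been converted; its operator may be unsupported")
        shapes.append(node.output_shapes[0])
    return shapes


converted = {"input.1": Node("input.1", [(1, 1, 80)])}
print(get_input_shapes(["input.1"], converted))
try:
    get_input_shapes(["input.1", "19"], converted)
except KeyError as err:
    print("conversion stopped:", err)
```

A guard like this does not fix the conversion, but it names the missing input instead of failing on `None`.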

Also, I did get a warning after the training step:

C:\test\ellmodeltraining>python train_classifier.py --architecture GRU --num_layers 2 --dataset . --use_gpu --outdir .
Loading .\testing_list.npz...
Loaded dataset testing_list.npz and found sample rate 16000, audio_size 512, input_size 80, window_size 40 and shift 40
Loading .\training_list.npz...
Loaded dataset training_list.npz and found sample rate 16000, audio_size 512, input_size 80, window_size 40 and shift 40
Loading .\validation_list.npz...
Loaded dataset validation_list.npz and found sample rate 16000, audio_size 512, input_size 80, window_size 40 and shift 40
Training model GRU128KeywordSpotter.pt
Training 2 layer GRU 128 using 46017 rows of featurized training input...
RMSprop (
Parameter Group 0
    alpha: 0
    centered: False
    eps: 1e-08
    lr: 0.001
    momentum: 0
    weight_decay: 1e-05
)
Epoch 0, Loss 1.863, Validation Accuracy 44.340, Learning Rate 0.001
Epoch 1, Loss 1.077, Validation Accuracy 73.880, Learning Rate 0.001
Epoch 2, Loss 0.547, Validation Accuracy 83.181, Learning Rate 0.001
...
Epoch 28, Loss 0.177, Validation Accuracy 90.109, Learning Rate 0.001
Epoch 29, Loss 0.040, Validation Accuracy 90.802, Learning Rate 0.001
Trained in 240.88 seconds
Training accuracy = 98.357 %
Evaluating GRU keyword spotter using 6835 rows of featurized test audio...
Saving evaluation results in '.\results.txt'
Testing accuracy = 91.775 %
saving onnx file: GRU128KeywordSpotter.onnx

C:\anaconda3\lib\site-packages\torch\onnx\symbolic_opset9.py:1577: UserWarning: Exporting a model to ONNX with a batch_size other than 1, with a variable length with GRU can cause an error when running the ONNX model with a different batch size. Make sure to save the model with a batch size of 1, or define the initial states (h0/c0) as inputs of the model.
  "or define the initial states (h0/c0) as inputs of the model. ")

graph(%input.1 : Float(1, 1, 80),
      %hidden2keyword.weight : Float(31, 128),
      %hidden2keyword.bias : Float(31),
      %83 : Long(1),
      %84 : Long(1),
      %102 : Float(1, 384, 80),
      %103 : Float(1, 384, 128),
      %104 : Float(1, 768),
      %105 : Long(1),
      %106 : Long(1),
      %124 : Float(1, 384, 128),
      %125 : Float(1, 384, 128),
      %126 : Float(1, 768)):
  %11 : Tensor = onnx::Shape(%input.1)
  %12 : Tensor = onnx::Constant[value={1}]()
  %13 : Long() = onnx::Gather[axis=0](%11, %12) # C:\anaconda3\lib\site-packages\torch\nn\modules\rnn.py:710:0
  %17 : Tensor = onnx::Unsqueezeaxes=[0]
  %19 : Tensor = onnx::Concat[axis=0](%83, %17, %84)
  %20 : Float(1, 1, 128) = onnx::ConstantOfShapevalue={0} # C:\anaconda3\lib\site-packages\torch\nn\modules\rnn.py:718:0
  %21 : Tensor? = prim::Constant()
  %42 : Tensor, %43 : Float(1, 1, 128) = onnx::GRU[hidden_size=128, linear_before_reset=1](%input.1, %102, %103, %104, %21, %20) # C:\anaconda3\lib\site-packages\torch\nn\modules\rnn.py:727:0
  %44 : Float(1, 1, 128) = onnx::Squeezeaxes=[1] # C:\anaconda3\lib\site-packages\torch\nn\modules\rnn.py:727:0
  %45 : Tensor = onnx::Shape(%44)
  %46 : Tensor = onnx::Constant[value={1}]()
  %47 : Long() = onnx::Gather[axis=0](%45, %46) # C:\anaconda3\lib\site-packages\torch\nn\modules\rnn.py:710:0
  %51 : Tensor = onnx::Unsqueezeaxes=[0]
  %53 : Tensor = onnx::Concat[axis=0](%105, %51, %106)
  %54 : Float(1, 1, 128) = onnx::ConstantOfShapevalue={0} # C:\anaconda3\lib\site-packages\torch\nn\modules\rnn.py:718:0
  %55 : Tensor? = prim::Constant()
  %76 : Tensor, %77 : Float(1, 1, 128) = onnx::GRU[hidden_size=128, linear_before_reset=1](%44, %124, %125, %126, %55, %54) # C:\anaconda3\lib\site-packages\torch\nn\modules\rnn.py:727:0
  %78 : Float(1, 1, 128) = onnx::Squeezeaxes=[1] # C:\anaconda3\lib\site-packages\torch\nn\modules\rnn.py:727:0
  %79 : Tensor = onnx::Sliceaxes=[0], ends=[9223372036854775807], starts=[-1]
  %80 : Float(1, 128) = onnx::Squeezeaxes=[0] # train_classifier.py:459:0
  %81 : Float(1, 31) = onnx::Gemm[alpha=1., beta=1., transB=1](%80, %hidden2keyword.weight, %hidden2keyword.bias) # C:\anaconda3\lib\site-packages\torch\nn\functional.py:1610:0
  %82 : Float(1, 31) = onnx::LogSoftmaxaxis=1 # C:\anaconda3\lib\site-packages\torch\nn\functional.py:1535:0
  return (%82)

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Could it be a PyTorch or ONNX version issue?

My module versions (freshly updated) are: torch = 1.5.0, onnx = 1.1.1, numpy = 1.14.3, Python = 3.6.5
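
When reporting versions for an issue like this, it can help to read them from the environment rather than from memory. The following is a minimal sketch, assuming Python 3.8 or later (where `importlib.metadata` is in the standard library, so it would not run on the 3.6.5 interpreter above). The `installed_versions` helper is a hypothetical name introduced only for illustration.

```python
import platform
from importlib import metadata


def installed_versions(packages=("torch", "onnx", "numpy")):
    """Return the interpreter version plus the installed version of each package (None if absent)."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return versions


for name, version in installed_versions().items():
    print(f"{name} = {version}")
```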

hchanon commented 4 years ago

Hi,

I spent a little time on the issue, and here is what I learned. It seems that the ONNX library and format have evolved a lot since this ELL project was last updated. My understanding is that the ONNX-to-ELL converter scripts would need an update to support the current opset. With the latest Python package updates, the scripts produce an ONNX file that uses opset 9, while it appears the ONNX-to-ELL converter was designed to handle opset 6. I did try "torch.onnx.export(self, dummy_input, name, verbose=True, opset_version=7)" but still ran into issues. Sadly, the latest ONNX package does not support opset 6 or lower. From there, it might be possible to downgrade the Python packages, ... or the ONNX-to-ELL converter could be updated. I made some progress on that but stopped after figuring out that the amount of work required was bigger than what I could invest in this project.

If you plan to invest time in this project, be aware of this situation.
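
For anyone checking which opset an exported file actually uses, here is a minimal sketch. It assumes the `onnx` package is installed for the file-loading part; `opset_versions` and `model_opsets` are hypothetical helper names, and the `example` list is a stand-in for the `opset_import` field of a loaded model, not data read from GRU128KeywordSpotter.onnx.

```python
from types import SimpleNamespace


def opset_versions(opset_imports):
    """Map each operator-set domain to its version; the default ONNX domain is stored as ''."""
    return {(entry.domain or "ai.onnx"): entry.version for entry in opset_imports}


def model_opsets(path):
    """Load an ONNX file and report the opsets it declares."""
    import onnx  # imported lazily so opset_versions works without onnx installed

    model = onnx.load(path)
    return opset_versions(model.opset_import)


# Stand-in for model.opset_import, assumed here to declare opset 9 in the default domain.
example = [SimpleNamespace(domain="", version=9)]
print(opset_versions(example))
```

Calling `model_opsets("GRU128KeywordSpotter.onnx")` on the exported file should show whether it matches the opset the converter expects.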

lovettchris commented 3 years ago

This is fixed by this commit.