Closed JBloodless closed 1 year ago
Hi, exporting the full model directly to ONNX is currently not supported. You would need to rewrite the model so that it does only real-valued processing, or use the three sub-models, which can be exported to ONNX.
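As a rough illustration of what "real-valued processing" means, the sketch below carries a complex number as a separate real and imaginary part and multiplies two of them using only real arithmetic. The helper name `complex_mul_real` is hypothetical and not part of DeepFilterNet; which complex operations the full model actually uses is not stated in this thread.

```python
def complex_mul_real(a, b):
    """Multiply two complex numbers given as (real, imag) pairs of floats.

    Uses only real arithmetic:
    (ar + j*ai) * (br + j*bi) = (ar*br - ai*bi) + j*(ar*bi + ai*br)
    """
    ar, ai = a
    br, bi = b
    return (ar * br - ai * bi, ar * bi + ai * br)


# Compare against Python's built-in complex type.
x = (1.5, -2.0)
y = (0.25, 4.0)
re, im = complex_mul_real(x, y)
ref = complex(*x) * complex(*y)
assert abs(re - ref.real) < 1e-12 and abs(im - ref.imag) < 1e-12
```

The same idea applies to tensors if the real and imaginary parts are kept in a separate axis, so that the exported graph contains only real-valued operations.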
Ok, got it, thanks
With --export_full=False
I'm getting
Traceback (most recent call last):
File "/data/code_jb/deepfilter2_git/DeepFilterNet/df/export.py", line 331, in <module>
main(args)
File "/data/code_jb/deepfilter2_git/DeepFilterNet/df/export.py", line 296, in main
export(
File "/home/i.beskrovnyy/miniconda3/envs/df/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/data/code_jb/deepfilter2_git/DeepFilterNet/df/export.py", line 193, in export
e0, e1, e2, e3, emb, c0, lsnr = export_impl(
File "/data/code_jb/deepfilter2_git/DeepFilterNet/df/export.py", line 100, in export_impl
model = torch.jit.script(model, example_inputs=[tuple(a for a in inputs)])
File "/home/i.beskrovnyy/miniconda3/envs/df/lib/python3.10/site-packages/torch/jit/_script.py", line 1286, in script
return torch.jit._recursive.create_script_module(
File "/home/i.beskrovnyy/miniconda3/envs/df/lib/python3.10/site-packages/torch/jit/_recursive.py", line 458, in create_script_module
return create_script_module_impl(nn_module, concrete_type, stubs_fn)
File "/home/i.beskrovnyy/miniconda3/envs/df/lib/python3.10/site-packages/torch/jit/_recursive.py", line 524, in create_script_module_impl
create_methods_and_properties_from_stubs(concrete_type, method_stubs, property_stubs)
File "/home/i.beskrovnyy/miniconda3/envs/df/lib/python3.10/site-packages/torch/jit/_recursive.py", line 375, in create_methods_and_properties_from_stubs
concrete_type._create_methods_and_properties(property_defs, property_rcbs, method_defs, method_rcbs, method_defaults)
RuntimeError: Unsupported value kind: Tensor
even with the default model.
How exactly do you call export.py? What args do you use?
python export.py -m DeepFilterNet2 /data/checkpoints_ivan/df2_ll_onnx --export_full False
If I change the model to the path of my retrained model, the error is the same.
If I set jit to False everywhere in export_impl, then I get mismatched elements for all encoder outputs:
2022-11-10 14:47:12 | WARNING | DF | Elements not close for e0:
Not equal to tolerance rtol=1e-06, atol=1e-05
Mismatched elements: 136057 / 204800 (66.4%)
Max absolute difference: 1.6712724
Max relative difference: 31671.553
x: array([[[0. , 0. , 0. , ..., 0. , 0. ,
0. ],
[0. , 0. , 0. , ..., 0.01213 , 0. ,...
y: array([[[2.185955e-02, 0.000000e+00, 0.000000e+00, ..., 0.000000e+00,
0.000000e+00, 0.000000e+00],
[0.000000e+00, 0.000000e+00, 0.000000e+00, ..., 0.000000e+00,...
2022-11-10 14:47:12 | WARNING | DF | Elements not close for e1:
Not equal to tolerance rtol=1e-06, atol=1e-05
Mismatched elements: 73908 / 102400 (72.2%)
Max absolute difference: 1.3576229
Max relative difference: 38430.625
x: array([[[0.056033, 0.017943, 0.017943, ..., 0.036259, 0. ,
0. ],
[0. , 0.735084, 0. , ..., 0.11884 , 0. ,...
y: array([[[0. , 0. , 0. , ..., 0. , 0. ,
0.269067],
[0. , 0. , 0.00842 , ..., 0. , 0. ,...
2022-11-10 14:47:12 | WARNING | DF | Elements not close for e2:
Not equal to tolerance rtol=1e-06, atol=1e-05
Mismatched elements: 38405 / 51200 (75%)
Max absolute difference: 0.8311711
Max relative difference: 22806.361
x: array([[[0. , 0.071877, 0.014084, ..., 0.151685, 0.097902,
0.085669],
[0. , 0. , 0. , ..., 0.201829, 0.064229,...
y: array([[[0. , 0. , 0.14058 , ..., 0. , 0. ,
0. ],
[0. , 0. , 0.125281, ..., 0. , 0.030466,...
2022-11-10 14:47:12 | WARNING | DF | Elements not close for e3:
Not equal to tolerance rtol=1e-06, atol=1e-05
Mismatched elements: 38806 / 51200 (75.8%)
Max absolute difference: 4.9555507
Max relative difference: 7891.4126
x: array([[[0. , 0.274603, 0.610629, ..., 0. , 0.735358,
0. ],
[0. , 0. , 0.508879, ..., 0. , 0.493207,...
y: array([[[0. , 0. , 0. , ..., 0. , 0. ,
0. ],
[1.235723, 1.584068, 1.066446, ..., 2.749812, 2.478191,...
2022-11-10 14:47:12 | WARNING | DF | Elements not close for emb:
Not equal to tolerance rtol=1e-06, atol=1e-05
Mismatched elements: 25501 / 25600 (99.6%)
Max absolute difference: 1.7051635
Max relative difference: 18425.525
x: array([[-0.530415, 0.087793, -0.143854, ..., 0.054233, 0.745872,
0.538596],
[ 0.158454, 0.067803, 0.154594, ..., 0.080428, 0.929604,...
y: array([[ 0.207319, 0.010258, -0.216485, ..., 0.028774, 0.081924,
-0.667745],
[-0.235693, 0.010745, 0.713871, ..., 0.066224, 0.5237 ,...
2022-11-10 14:47:12 | WARNING | DF | Elements not close for lsnr:
Not equal to tolerance rtol=1e-06, atol=1e-05
Mismatched elements: 100 / 100 (100%)
Max absolute difference: 9.594181
Max relative difference: 0.7053432
x: array([ -4.007965, -9.974195, -13.913005, -13.674997, -13.577035,
-14.126597, -14.221871, -12.456777, -13.813079, -14.568176,
-13.951439, -13.464954, -12.111427, -13.188027, -13.694571,...
y: array([-13.602146, -14.349119, -14.787824, -14.694757, -14.748948,
-14.843171, -14.871448, -14.788654, -14.894756, -14.918068,
-14.885553, -14.82298 , -14.579944, -14.797973, -14.873879,...
but, strangely, not in the erb_decoder, which also had jit=True by default.
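The warnings above appear to follow the output format of `numpy.testing.assert_allclose`, which treats two elements as close when `|actual - desired| <= atol + rtol * |desired|`. A minimal pure-Python sketch of that per-element test, using the tolerances from the log (the function name `count_mismatches` and the sample values are made up for illustration):

```python
def count_mismatches(actual, desired, rtol=1e-6, atol=1e-5):
    """Count elements that fail |a - d| <= atol + rtol * |d|."""
    if len(actual) != len(desired):
        raise ValueError("inputs must have the same length")
    return sum(
        1
        for a, d in zip(actual, desired)
        if abs(a - d) > atol + rtol * abs(d)
    )


# Made-up sample values, not taken from the log.
torch_out = [0.0, 0.01213, 1.0, -14.5]
onnx_out = [0.0, 0.01213, 1.000001, -13.6]
bad = count_mismatches(torch_out, onnx_out)
print(f"Mismatched elements: {bad} / {len(torch_out)}")
```

With tolerances this tight, mismatch rates of 66% to 100% and absolute differences above 1 point to genuinely different outputs rather than to floating-point noise.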
I cannot reproduce your issues. For me it runs fine:
$ python DeepFilterNet/df/scripts/export.py -m DeepFilterNet2 /tmp/export
Namespace(model_base_dir='DeepFilterNet2', pf=False, output_dir=None, log_level='INFO', epoch='best', version=False, export_dir='/tmp/export', check=True, simplify=False, opset=12)
2022-11-17 09:32:28 | INFO | DF | Running on torch 1.14.0.dev20221026
2022-11-17 09:32:28 | INFO | DF | Running on host T480s
2022-11-17 09:32:28 | INFO | DF | Git commit: 2ae7883, branch: main
2022-11-17 09:32:28 | INFO | DF | Loading model settings of DeepFilterNet2
2022-11-17 09:32:28 | INFO | DF | Using DeepFilterNet2 model at /home/hendrik/.cache/DeepFilterNet/DeepFilterNet2
2022-11-17 09:32:28 | INFO | DF | Initializing model `deepfilternet2`
2022-11-17 09:32:28 | INFO | DF | Found checkpoint /home/hendrik/.cache/DeepFilterNet/DeepFilterNet2/checkpoints/model_96.ckpt.best with epoch 96
2022-11-17 09:32:28 | INFO | DF | Running on device cpu
2022-11-17 09:32:28 | INFO | DF | Model loaded
2022-11-17 09:32:29 | INFO | DF | Exporting model 'enc' to /tmp/export
2022-11-17 09:32:29 | INFO | DF | Input shapes: {'feat_erb': torch.Size([1, 1, 100, 32]), 'feat_spec': torch.Size([1, 2, 100, 96])}
2022-11-17 09:32:29 | INFO | DF | Output shapes: {'e0': torch.Size([1, 64, 100, 32]), 'e1': torch.Size([1, 64, 100, 16]), 'e2': torch.Size([1, 64, 100, 8]), 'e3': torch.Size([1, 64, 100, 8]), 'emb': torch.Size([1, 100, 256]), 'c0': torch.Size([1, 64, 100, 96]), 'lsnr': torch.Size([1, 100, 1])}
2022-11-17 09:32:29 | INFO | DF | Dynamic axis: {'feat_erb': {2: 'S'}, 'feat_spec': {2: 'S'}, 'e0': {2: 'S'}, 'e1': {2: 'S'}, 'e2': {2: 'S'}, 'e3': {2: 'S'}, 'emb': {1: 'S'}, 'c0': {2: 'S'}, 'lsnr': {1: 'S'}}
/home/hendrik/mambaforge/envs/df/lib/python3.10/site-packages/torch/onnx/utils.py:823: UserWarning: no signature found for <torch.ScriptMethod object at 0x7f91f96d8450>, skipping _decide_input_format
warnings.warn(f"{e}, skipping _decide_input_format")
/home/hendrik/mambaforge/envs/df/lib/python3.10/site-packages/torch/onnx/_internal/jit_utils.py:258: UserWarning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied. (Triggered internally at /opt/conda/conda-bld/pytorch_1666768144987/work/torch/csrc/jit/passes/onnx/constant_fold.cpp:179.)
_C._jit_pass_onnx_node_shape_type_inference(node, params_dict, opset_version)
/home/hendrik/mambaforge/envs/df/lib/python3.10/site-packages/torch/onnx/symbolic_opset9.py:4377: UserWarning: Exporting a model to ONNX with a batch_size other than 1, with a variable length with GRU can cause an error when running the ONNX model with a different batch size. Make sure to save the model with a batch size of 1, or define the initial states (h0/c0) as inputs of the model.
warnings.warn(
/home/hendrik/mambaforge/envs/df/lib/python3.10/site-packages/torch/onnx/_internal/jit_utils.py:258: UserWarning: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (Triggered internally at /opt/conda/conda-bld/pytorch_1666768144987/work/torch/csrc/jit/passes/onnx/shape_type_inference.cpp:1891.)
_C._jit_pass_onnx_node_shape_type_inference(node, params_dict, opset_version)
/home/hendrik/mambaforge/envs/df/lib/python3.10/site-packages/torch/onnx/utils.py:687: UserWarning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied. (Triggered internally at /opt/conda/conda-bld/pytorch_1666768144987/work/torch/csrc/jit/passes/onnx/constant_fold.cpp:179.)
_C._jit_pass_onnx_graph_shape_type_inference(
/home/hendrik/mambaforge/envs/df/lib/python3.10/site-packages/torch/onnx/utils.py:687: UserWarning: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (Triggered internally at /opt/conda/conda-bld/pytorch_1666768144987/work/torch/csrc/jit/passes/onnx/shape_type_inference.cpp:1891.)
_C._jit_pass_onnx_graph_shape_type_inference(
/home/hendrik/mambaforge/envs/df/lib/python3.10/site-packages/torch/onnx/utils.py:1178: UserWarning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied. (Triggered internally at /opt/conda/conda-bld/pytorch_1666768144987/work/torch/csrc/jit/passes/onnx/constant_fold.cpp:179.)
_C._jit_pass_onnx_graph_shape_type_inference(
/home/hendrik/mambaforge/envs/df/lib/python3.10/site-packages/torch/onnx/utils.py:1178: UserWarning: The shape inference of prim::Constant type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function. (Triggered internally at /opt/conda/conda-bld/pytorch_1666768144987/work/torch/csrc/jit/passes/onnx/shape_type_inference.cpp:1891.)
_C._jit_pass_onnx_graph_shape_type_inference(
2022-11-17 09:32:29 | INFO | DF | Exporting model 'erb_dec' to /tmp/export
2022-11-17 09:32:29 | INFO | DF | Input shapes: {'emb': torch.Size([1, 100, 256]), 'e3': torch.Size([1, 64, 100, 8]), 'e2': torch.Size([1, 64, 100, 8]), 'e1': torch.Size([1, 64, 100, 16]), 'e0': torch.Size([1, 64, 100, 32])}
2022-11-17 09:32:29 | INFO | DF | Output shapes: {'m': torch.Size([1, 100, 32])}
2022-11-17 09:32:30 | INFO | DF | Dynamic axis: {'emb': {1: 'S'}, 'e3': {2: 'S'}, 'e2': {2: 'S'}, 'e1': {2: 'S'}, 'e0': {2: 'S'}, 'm': {2: 'S'}}
/home/hendrik/mambaforge/envs/df/lib/python3.10/site-packages/torch/onnx/utils.py:823: UserWarning: no signature found for <torch.ScriptMethod object at 0x7f91fa1920c0>, skipping _decide_input_format
warnings.warn(f"{e}, skipping _decide_input_format")
2022-11-17 09:32:30 | INFO | DF | Exporting model 'df_dec' to /tmp/export
2022-11-17 09:32:30 | INFO | DF | Input shapes: {'emb': torch.Size([1, 100, 256]), 'c0': torch.Size([1, 64, 100, 96])}
2022-11-17 09:32:30 | WARNING | DF | Number of tensors (2) does not match provided names: ['coefs']
2022-11-17 09:32:30 | INFO | DF | Output shapes: {'coefs': torch.Size([1, 100, 96, 10])}
2022-11-17 09:32:30 | INFO | DF | Dynamic axis: {'emb': {1: 'S'}, 'c0': {2: 'S'}, 'coefs': {1: 'S'}}
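The log above shows how the three sub-models fit together: `enc` produces `e0` to `e3`, `emb`, `c0` and `lsnr`; `erb_dec` consumes `emb` and `e0` to `e3`; `df_dec` consumes `emb` and `c0`. A small sketch that checks this wiring using only the shapes printed in the log (the dictionary layout and the helper `check_wiring` are illustrative, not part of the export script):

```python
# Shapes copied from the export log above.
ENC_OUT = {
    "e0": (1, 64, 100, 32),
    "e1": (1, 64, 100, 16),
    "e2": (1, 64, 100, 8),
    "e3": (1, 64, 100, 8),
    "emb": (1, 100, 256),
    "c0": (1, 64, 100, 96),
    "lsnr": (1, 100, 1),
}
ERB_DEC_IN = {
    "emb": (1, 100, 256),
    "e3": (1, 64, 100, 8),
    "e2": (1, 64, 100, 8),
    "e1": (1, 64, 100, 16),
    "e0": (1, 64, 100, 32),
}
DF_DEC_IN = {"emb": (1, 100, 256), "c0": (1, 64, 100, 96)}


def check_wiring(producer_out, consumer_in):
    """Return input names that the producer lacks or provides with another shape."""
    return [
        name
        for name, shape in consumer_in.items()
        if producer_out.get(name) != shape
    ]


assert check_wiring(ENC_OUT, ERB_DEC_IN) == []
assert check_wiring(ENC_OUT, DF_DEC_IN) == []
```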
For some reason I'm getting the same error even on a clean project without any of my modifications. I'll try to isolate it, because for now the failure point, model = torch.jit.script(model, example_inputs=[tuple(a for a in inputs)]), is too ambiguous.
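One way to narrow down such an error is to try compiling each part of the model separately and record which ones fail. The sketch below is generic and takes the compile function as an argument so that it runs without torch; the helper name `find_failing_parts` and the stand-in compile function are hypothetical.

```python
def find_failing_parts(named_parts, compile_fn):
    """Try compile_fn on each (name, part) pair and collect failures.

    Returns a list of (name, first line of the error message).
    """
    failures = []
    for name, part in named_parts:
        try:
            compile_fn(part)
        except Exception as exc:  # record every failure, whatever its type
            lines = str(exc).splitlines()
            failures.append((name, lines[0] if lines else type(exc).__name__))
    return failures


# Stand-in for a real compiler: rejects anything that is not a string.
def fake_compile(part):
    if not isinstance(part, str):
        raise RuntimeError("Unsupported value kind: " + type(part).__name__)
    return part


parts = [("part_a", "ok"), ("part_b", 3.0), ("part_c", "ok")]
print(find_failing_parts(parts, fake_compile))
```

With PyTorch, one could pass `model.named_children()` and `torch.jit.script` in place of the stand-ins.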
For now, I cloned the latest version of DeepFilterNet and made a fresh conda environment with only the install from the poetry lock file, and I'm still getting this error:
(/data/conda/df_exp) i.beskrovnyy@vmsdn1-hosting117:/data/code_jb/backup/DeepFilterNet/DeepFilterNet$ python df/scripts/export.py -m DeepFilterNet2 /tmp/export
Namespace(model_base_dir='DeepFilterNet2', pf=False, output_dir=None, log_level='INFO', epoch='best', version=False, export_dir='/tmp/export', check=True, simplify=False, opset=12)
2022-11-17 16:26:14 | INFO | DF | Running on torch 1.13.0
2022-11-17 16:26:14 | INFO | DF | Running on host vmsdn1-hosting117
2022-11-17 16:26:14 | INFO | DF | Git commit: 2ae7883, branch: main
2022-11-17 16:26:14 | INFO | DF | Loading model settings of DeepFilterNet2
2022-11-17 16:26:14 | INFO | DF | Using DeepFilterNet2 model at /home/i.beskrovnyy/.cache/DeepFilterNet/DeepFilterNet2
2022-11-17 16:26:14 | INFO | DF | Initializing model `deepfilternet2`
2022-11-17 16:26:16 | INFO | DF | Found checkpoint /home/i.beskrovnyy/.cache/DeepFilterNet/DeepFilterNet2/checkpoints/model_96.ckpt.best with epoch 96
2022-11-17 16:26:16 | INFO | DF | Running on device cuda:0
2022-11-17 16:26:16 | INFO | DF | Model loaded
2022-11-17 16:26:17 | INFO | DF | Exporting model 'enc' to /tmp/export
2022-11-17 16:26:17 | INFO | DF | Input shapes: {'feat_erb': torch.Size([1, 1, 100, 32]), 'feat_spec': torch.Size([1, 2, 100, 96])}
2022-11-17 16:26:17 | INFO | DF | Output shapes: {'e0': torch.Size([1, 64, 100, 32]), 'e1': torch.Size([1, 64, 100, 16]), 'e2': torch.Size([1, 64, 100, 8]), 'e3': torch.Size([1, 64, 100, 8]), 'emb': torch.Size([1, 100, 256]), 'c0': torch.Size([1, 64, 100, 96]), 'lsnr': torch.Size([1, 100, 1])}
/data/conda/df_exp/lib/python3.10/site-packages/torch/jit/_script.py:1280: UserWarning: Warning: monkeytype is not installed. Please install https://github.com/Instagram/MonkeyType to enable Profile-Directed Typing in TorchScript. Refer to https://github.com/Instagram/MonkeyType/blob/master/README.rst to install MonkeyType.
warnings.warn("Warning: monkeytype is not installed. Please install https://github.com/Instagram/MonkeyType "
Traceback (most recent call last):
File "/data/code_jb/backup/DeepFilterNet/DeepFilterNet/df/scripts/export.py", line 336, in <module>
main(args)
File "/data/code_jb/backup/DeepFilterNet/DeepFilterNet/df/scripts/export.py", line 302, in main
export(
File "/data/conda/df_exp/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/data/code_jb/backup/DeepFilterNet/DeepFilterNet/df/scripts/export.py", line 193, in export
e0, e1, e2, e3, emb, c0, lsnr = export_impl(
File "/data/code_jb/backup/DeepFilterNet/DeepFilterNet/df/scripts/export.py", line 98, in export_impl
model = torch.jit.script(model, example_inputs=[tuple(a for a in inputs)])
File "/data/conda/df_exp/lib/python3.10/site-packages/torch/jit/_script.py", line 1286, in script
return torch.jit._recursive.create_script_module(
File "/data/conda/df_exp/lib/python3.10/site-packages/torch/jit/_recursive.py", line 476, in create_script_module
return create_script_module_impl(nn_module, concrete_type, stubs_fn)
File "/data/conda/df_exp/lib/python3.10/site-packages/torch/jit/_recursive.py", line 542, in create_script_module_impl
create_methods_and_properties_from_stubs(concrete_type, method_stubs, property_stubs)
File "/data/conda/df_exp/lib/python3.10/site-packages/torch/jit/_recursive.py", line 393, in create_methods_and_properties_from_stubs
concrete_type._create_methods_and_properties(property_defs, property_rcbs, method_defs, method_rcbs, method_defaults)
RuntimeError: Unsupported value kind: Tensor
Now I know the issue:
UserWarning: Warning: monkeytype is not installed. Please install https://github.com/Instagram/MonkeyType to enable Profile-Directed Typing in TorchScript. Refer to https://github.com/Instagram/MonkeyType/blob/master/README.rst to install MonkeyType.
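Because `torch.jit.script` is called with `example_inputs` here, and the warning says MonkeyType is needed for profile-directed typing, an export script could check for the package up front instead of failing later with a less obvious error. A minimal sketch, assuming the package is importable as `monkeytype` (the helper `has_module` is hypothetical):

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None


if not has_module("monkeytype"):
    print("MonkeyType is not installed; try: pip install MonkeyType")
```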
Oh wow, I didn't think that was a big deal. Now at least the clean install works.
Yeah, and now I'm able to export my own custom model. Sorry for my inattentiveness and thanks for your time.
Hi, I have installed monkeytype. However, I still have the same problem.
Hi, I've been trying to use the export.py script to convert a retrained model to a single ONNX file, but there seems to be a version mismatch between torch and the ONNX operations. After some modifications I've almost made it work, but now I'm stuck with an error. Am I doing something wrong, or were the latest DeepFilterNet2 modifications not tested with this converter? Or maybe I should use specific versions of onnxruntime/torch?