gabrielmittag / NISQA

NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
MIT License
663 stars 117 forks

Script quits unexpectedly (without errors) when trying to export model to ONNX #23

Closed miccio-dk closed 2 years ago

miccio-dk commented 2 years ago

First of all, thanks for this invaluable resource :)

As for my issue: I would like to export the NISQA v2 model to ONNX so that I can use it for data evaluation within my TensorFlow-based environment without introducing PyTorch as a dependency (similarly to Microsoft's DNSMOS P.835). However, when attempting to run torch.onnx.export(), the script exits unexpectedly without throwing any error or raising an exception. This happens no matter which opset version I pick. Do you know if the issue is related to certain operations happening within the model?

Here is the code that I used. I just replaced the contents of export_dim with the snippet below and ran the prediction script as usual. If you're interested in further debugging the issue, I can upload my fork of the repo with the full script.

  x, y, (idx, n_wins) = ds[0]
  x = x.unsqueeze(0)
  x.requires_grad = True
  n_wins = torch.from_numpy(n_wins).unsqueeze(0)
  # n_wins.requires_grad = True
  model.eval()
  torch.onnx.export(
      model,                     # model being run
      (x, n_wins),               # model inputs (a tuple for multiple inputs)
      "nisqa.onnx",              # where to save the model (file path or file-like object)
      export_params=True,        # store the trained parameter weights inside the model file
      opset_version=14,          # the ONNX opset version to target
      do_constant_folding=True,  # whether to apply constant folding for optimization
  )
  print('Done')  # !!! this line never gets executed because the script quits unexpectedly !!!

Thanks in advance

miccio-dk commented 2 years ago

Update: when trying to export the model from Linux (or WSL in my case), I get the following, more verbose errors:

Device: cpu
Model architecture: NISQA_DIM
Loaded pretrained model from weights/nisqa.tar
/home/username/miniconda3/envs/nisqa/lib/python3.9/site-packages/torch/onnx/utils.py:1294: UserWarning: Provided key output for dynamic axes is not a valid input/output name
  warnings.warn("Provided key {} for dynamic axes is not a valid input/output name".format(key))
WARNING: The shape inference of prim::PackPadded type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::PackPadded type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
/home/username/miniconda3/envs/nisqa/lib/python3.9/site-packages/torch/onnx/symbolic_helper.py:258: UserWarning: ONNX export failed on adaptive_max_pool2d because input size not accessible not supported
  warnings.warn("ONNX export failed on " + op + " because " + msg + " not supported")
/home/username/miniconda3/envs/nisqa/lib/python3.9/site-packages/torch/onnx/symbolic_helper.py:716: UserWarning: allowzero=0 by default. In order to honor zero value in shape use allowzero=1
  warnings.warn("allowzero=0 by default. In order to honor zero value in shape use allowzero=1")
WARNING: The shape inference of prim::PadPacked type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
WARNING: The shape inference of prim::PadPacked type is missing, so it may result in wrong shape inference for the exported graph. Please consider adding it in symbolic function.
Segmentation fault

PS: you can access my script here: https://github.com/miccio-dk/NISQA Notice that run_export.py behaves exactly like run_predict.py except I'm forcing CPU.

miccio-dk commented 2 years ago

Other update: this issue is currently tracked on pytorch repo: https://github.com/pytorch/pytorch/issues/75383

gabrielmittag commented 2 years ago

Thanks, hopefully someone will be able to help in the PyTorch repo. I haven't converted this particular model to ONNX before, but a workaround you could try is removing the packed-sequence parts altogether, if those are causing the errors. That's actually what I used to do with other models, because ONNX didn't support packed sequences back then.
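A minimal sketch of that workaround: run the LSTM over the padded batch directly and mask the padded time steps afterwards, instead of going through pack_padded_sequence / pad_packed_sequence (the prim::PackPadded / prim::PadPacked ops in the warnings above). The helper name and shapes here are illustrative, not NISQA's actual code, and note the hidden states at padded steps are not numerically identical to the packed-sequence version, only the masked outputs are comparable:

```python
import torch
import torch.nn as nn


def lstm_without_packing(lstm: nn.LSTM, x: torch.Tensor, lengths: torch.Tensor) -> torch.Tensor:
    """Run an LSTM on a padded batch and zero out outputs past each sequence length."""
    out, _ = lstm(x)                                   # (batch, time, hidden)
    t = torch.arange(x.size(1), device=x.device)       # (time,)
    mask = (t.unsqueeze(0) < lengths.unsqueeze(1))     # (batch, time) boolean
    return out * mask.unsqueeze(-1)                    # broadcast over hidden dim


lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(2, 5, 8)              # batch of 2, padded to 5 steps
lengths = torch.tensor([5, 3])        # second sequence has 2 padded steps
y = lstm_without_packing(lstm, x, lengths)
```

Because the graph then contains only plain LSTM and elementwise ops, it traces and exports cleanly.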

gabrielmittag commented 2 years ago

Closing this, as it is not directly related to the model but rather to PyTorch. BTW - if you want to export it to ONNX, you probably need to use the model without adaptive pooling layers and without packed sequences. Then it should work.
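For the adaptive pooling part, one option (a sketch, assuming the input spatial size is fixed and known at export time, which is what the adaptive_max_pool2d warning above complains is missing) is to emulate it with a plain max_pool2d, which ONNX does support. When the input size divides the target size evenly, this is exactly equivalent to adaptive_max_pool2d; the 48x15 -> 6x3 sizes here are just an example:

```python
import torch
import torch.nn.functional as F


def fixed_max_pool2d(x: torch.Tensor, out_h: int, out_w: int) -> torch.Tensor:
    """Emulate adaptive_max_pool2d with a fixed-kernel max_pool2d.

    Only valid when the spatial dims divide the target dims evenly.
    """
    h, w = x.shape[-2], x.shape[-1]
    assert h % out_h == 0 and w % out_w == 0, "input size must divide output size"
    return F.max_pool2d(x, kernel_size=(h // out_h, w // out_w))


x = torch.randn(1, 4, 48, 15)
y = fixed_max_pool2d(x, 6, 3)         # kernel (8, 5), same result as adaptive pooling here
```

For non-divisible sizes the two ops differ, so the model would need retraining or a padded input size that divides evenly.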

JBloodless commented 1 year ago

@gabrielmittag do I need to retrain the model after removing packed sequences, or is there some kind of workaround? I'm stuck on a similar problem; for me, only the packed sequences are the issue.

wuzhuohong commented 10 months ago

(quotes miccio-dk's earlier comment above, including the full error log ending in the segmentation fault)

I wonder if you have solved this problem. I'm running into the same one.