microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

onnxruntime on CUDA fails to run 3D UNet model due to 'ConvTranspose_31 Input X must be 3- or 4-dimensional. X: {1,384,16,16,16}' #7756

Open rAum opened 3 years ago

rAum commented 3 years ago

Describe the bug I've installed onnxruntime-gpu version 1.7.0 from pip. I load a trained model (it's a UNet from the MONAI library) converted to ONNX, which works just fine on CPU. However, when I execute it on a GPU (RTX 3090) in a Docker container (based on nvidia/cuda:11.2.2-cudnn8-runtime-ubuntu20.04), the model fails with this error:

  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 188, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running ConvTranspose node. Name:'ConvTranspose_31' Status Message: Input X must be 3- or 4-dimensional. X: {1,384,16,16,16}
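For clarity, the shape in the error message is a rank-5 tensor, which is why the CUDA ConvTranspose kernel's 3-/4-dimension check fires. A minimal sketch (shapes taken directly from the error message; NumPy stands in for the actual session input):

```python
import numpy as np

# Dummy input with the exact shape from the error message:
# (batch, channels, and three spatial dims for a volumetric CT scan).
x = np.zeros((1, 384, 16, 16, 16), dtype=np.float32)

# The CUDA ConvTranspose implementation here only accepts 3-D (N, C, L)
# or 4-D (N, C, H, W) inputs, so a 5-D volumetric tensor is rejected.
print(x.ndim)  # 5
```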

I've seen there was a PR adding support for 3D ConvTranspose on GPU some time ago (https://github.com/microsoft/onnxruntime/pull/6794), which I think should be included in the 1.7.0 release, so I believe it should work... Can someone help me understand what is going on, and is it possible to fix it somehow?

Thanks in advance for your help.

Urgency We have to deliver inference by the end of June.

System information

To Reproduce Due to an NDA and the data being medical records (volumetric CT scans), I cannot provide reproduction data. However, we build on the MONAI 3D UNet model, and there are a few public CT scans available to serve as sample data. See https://docs.monai.io/en/latest/_modules/monai/networks/nets/unet.html, and the model from this tutorial could probably reproduce the issue: https://github.com/Project-MONAI/tutorials/blob/master/3d_segmentation/brats_segmentation_3d.ipynb
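In lieu of the NDA-covered model, a single 3-D transposed convolution should be enough to reproduce the failing op: exporting one `torch.nn.ConvTranspose3d` layer to ONNX yields a graph containing the same ConvTranspose node. A hedged sketch (channel counts and kernel settings are illustrative, not taken from the real UNet):

```python
import torch
import torch.nn as nn

# Hypothetical minimal repro: one 3-D transposed convolution, analogous to
# the upsampling stage inside a MONAI UNet. Channel counts are made up.
up = nn.ConvTranspose3d(in_channels=384, out_channels=128,
                        kernel_size=2, stride=2)

x = torch.zeros(1, 384, 16, 16, 16)  # (N, C, D, H, W) volumetric input
y = up(x)
print(tuple(y.shape))  # (1, 128, 32, 32, 32)

# Exporting just this module should produce an ONNX model whose
# ConvTranspose node the CUDA provider rejects in 1.7.0:
# torch.onnx.export(up, x, "convtranspose3d.onnx", opset_version=11)
```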

Expected behavior ConvTranspose works with 5-dimensional tensors (batch, C, H, W, D) on GPU (CUDA). Currently it works only on CPU.

Screenshots N/A

Additional context N/A

jywu-msft commented 3 years ago

That PR 6794 just missed the 1.7 release. Try with 1.8 which was just released recently.

rAum commented 3 years ago

I already tried that yesterday: I installed version 1.8 and reran the model inference. Instead of the InvalidArgument error, it crashes with a segmentation fault, without any errors or stack trace printed.
