BowenBao opened 1 year ago
Don't have bandwidth to investigate the ORT CUDA kernel numeric mismatch. Will unblock the bench by running on CPU.
Hi, I wanted to know if there are updates regarding this issue. I faced it too on my side and couldn't find a way to solve it.
I'm exporting a standard nnUNet model (https://github.com/MIC-DKFZ/nnUNet/tree/master/nnunetv2) to ONNX. When I run inference on CPU everything works perfectly. However, I get this error when I run it on GPU:

```
RUNTIME_EXCEPTION : Exception during initialization: /onnxruntime_src/onnxruntime/core/providers/cuda/nn/batch_norm.h:43 onnxruntime::cuda::BatchNorm
```
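The CUDA execution provider refuses to initialize `BatchNormalization` nodes that declare training mode at opset 14 or higher. Until the exporter fix lands, a common workaround is to rewrite that attribute to 0 in the exported graph, which is only safe when the model is genuinely used for inference. A minimal sketch, using a simplified dict representation of graph nodes as a stand-in for the real `onnx` protobuf objects (the actual patch would iterate `model.graph.node` and edit the node's attributes):

```python
def force_inference_batch_norm(nodes):
    """Set training_mode=0 on every BatchNormalization node.

    `nodes` is a simplified stand-in for an ONNX graph: a list of dicts
    with "op_type" and "attributes" keys (the real fix edits the protobuf).
    Returns the number of nodes patched.
    """
    patched = 0
    for node in nodes:
        if node["op_type"] == "BatchNormalization":
            attrs = node.setdefault("attributes", {})
            if attrs.get("training_mode", 0) != 0:
                attrs["training_mode"] = 0  # CUDA EP only supports inference mode
                patched += 1
    return patched

graph = [
    {"op_type": "Conv", "attributes": {}},
    {"op_type": "BatchNormalization", "attributes": {"training_mode": 1}},
]
print(force_inference_batch_norm(graph))  # -> 1
```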
Hi @cabinader, we are working on a fix from the exporter side. This is the tracking issue: https://github.com/microsoft/onnxscript/issues/1262
Please note that this fix will only be available in the new dynamo-based ONNX exporter, via the torch.onnx.dynamo_export API.
Thanks @BowenBao for your answer. Just one last question: do you have an estimated timeline in mind for the release of the new ONNX exporter? Many thanks!
Pull request in pytorch https://github.com/pytorch/pytorch/pull/120866.
It should be available in the next pytorch release (2.3), planned around April/May, and sooner in the nightly builds.
Is this resolved?
From the bench:

```
[ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /onnxruntime_src/onnxruntime/core/providers/cuda/nn/batch_norm.h:43 onnxruntime::cuda::BatchNorm::BatchNorm(const onnxruntime::OpKernelInfo&) [with T = float] !(is_training_mode && opset >= 14) was false. Training mode does not support BN opset 14 (or higher) yet.
```
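The failing check in `batch_norm.h` is the kernel constructor guard `!(is_training_mode && opset >= 14)`: the CUDA kernel bails out whenever a BatchNormalization node declares training mode at opset 14 or higher. A small Python mimic of that condition, for illustration only (the real check is C++ inside onnxruntime):

```python
def cuda_batch_norm_supported(is_training_mode: bool, opset: int) -> bool:
    # Mirrors the ORT enforce !(is_training_mode && opset >= 14):
    # inference mode is always fine; training mode only below opset 14.
    return not (is_training_mode and opset >= 14)

for training, opset in [(False, 14), (True, 13), (True, 14)]:
    print(training, opset, cuda_batch_norm_supported(training, opset))
# -> False 14 True
# -> True 13 True
# -> True 14 False
```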
The bench is run in eval mode. This is the same issue as https://github.com/pytorch/pytorch/issues/75252 from the old exporter. We need to revisit how the `training` attribute should be handled, specifically for BatchNorm and InstanceNorm.
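The heart of pytorch/pytorch#75252 is that the exporter derived the BatchNormalization `training_mode` attribute from the live `module.training` flag, so a model that was never switched to eval (or had one stray submodule left in training mode) produced a graph the CUDA EP rejects. A hedged sketch of a safer policy, using a hypothetical helper rather than the actual exporter code: key the attribute off the export mode, consulting the module flag only when the user explicitly asked for a training graph:

```python
def resolve_training_mode(export_for_training: bool, module_training: bool) -> int:
    # Hypothetical policy sketch: inference exports always emit
    # training_mode=0; only an explicit training export consults the
    # module's own flag.
    if not export_for_training:
        return 0
    return 1 if module_training else 0

# A stray submodule left in training mode no longer poisons an eval export:
print(resolve_training_mode(export_for_training=False, module_training=True))  # -> 0
```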