microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Batch inference produces NaN #8766

Open powermano opened 3 years ago

powermano commented 3 years ago

Describe the bug InstanceNormalization does not seem to support batched inference: with a batch of inputs, the outputs are NaN.
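For reference, here is a minimal NumPy sketch of what InstanceNormalization computes on NCHW input, assuming the ONNX operator definition (per-sample, per-channel mean and variance, default epsilon of 1e-5). The function name `instance_norm_reference` and the random input are illustrative only and are not taken from the model in this issue. Because the statistics never mix samples, a batched call should give the same values as a single-image call and should not introduce NaN by itself.

```python
import numpy as np

def instance_norm_reference(x, scale, bias, eps=1e-5):
    """NumPy reference for InstanceNormalization on NCHW input (illustrative)."""
    # Statistics are taken over the spatial axes only, so every
    # (sample, channel) pair is normalized independently of the batch.
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return x_hat * scale.reshape(1, -1, 1, 1) + bias.reshape(1, -1, 1, 1)

# Random stand-in data with the same shape as the batch in this report.
rng = np.random.default_rng(0)
x = rng.standard_normal((64, 3, 112, 112)).astype(np.float32)
scale = np.ones(3, dtype=np.float32)
bias = np.zeros(3, dtype=np.float32)

batched = instance_norm_reference(x, scale, bias)
single = instance_norm_reference(x[:1], scale, bias)

# The first sample of the batched result should match the single-sample result.
print(np.isnan(batched).any(), np.allclose(batched[:1], single, atol=1e-5))
```

If the reference behaves this way but the runtime output does not, that points at the execution provider's kernel rather than at the operator's definition.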

Urgency

System information

To Reproduce Run the script below, replacing the model path with any ONNX model that contains an InstanceNormalization node.

import datetime

import cv2
import numpy as np
import onnx
import onnxruntime

print("onnx-runtime version:", onnxruntime.__version__)
print("onnx version:", onnx.__version__)

model = "./instacenorm_test.onnx"
img = cv2.imread("test.jpg")

img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = np.transpose(img, (2, 0, 1))
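# Batched input: stack 64 copies of the same image into shape (64, 3, H, W).
img = np.array([img for _ in range(64)])
print(img.shape)
# For the single-image run, use this line instead of the stacking above:
# img = np.expand_dims(img, axis=0)
img = img.astype(np.float32)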

use_gpu = True
ctx = 0
if use_gpu:
    session = onnxruntime.InferenceSession(model)
    session.set_providers(['CUDAExecutionProvider'], [{'device_id': ctx}])
else:
    sessionOptions = onnxruntime.SessionOptions()
    sessionOptions.intra_op_num_threads = 1
    sessionOptions.inter_op_num_threads = 1
    session = onnxruntime.InferenceSession(model, sess_options=sessionOptions)

input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name
ta = datetime.datetime.now()
result = session.run([output_name], {input_name: img})
tb = datetime.datetime.now()

print(result[0][0][0:10], tb-ta)

Expected behavior

Batched inference returns results without NaN.
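One way to check this expectation is to compare a single batched call against per-sample calls and to list which batch entries contain NaN. The sketch below is only an illustration: `nan_sample_indices`, `batch_matches_single`, and `fake_run` are hypothetical names, and `run_fn` stands in for a wrapper around `session.run` such as `lambda x: session.run([output_name], {input_name: x})[0]`.

```python
import numpy as np

def nan_sample_indices(output):
    """Return indices of batch entries whose output contains any NaN."""
    flat = output.reshape(output.shape[0], -1)
    return np.flatnonzero(np.isnan(flat).any(axis=1))

def batch_matches_single(run_fn, batch, atol=1e-4):
    """Compare one batched call against N single-sample calls.

    run_fn is a hypothetical callable mapping an (N, C, H, W) array
    to an (N, ...) array of outputs.
    """
    batched = run_fn(batch)
    singles = np.concatenate(
        [run_fn(batch[i:i + 1]) for i in range(batch.shape[0])]
    )
    # NaN never compares equal here, so any NaN makes the check fail.
    return np.allclose(batched, singles, atol=atol)

# Stand-in for the model so the sketch runs without ONNX Runtime installed.
def fake_run(x):
    return x.mean(axis=(2, 3))

batch = np.random.default_rng(0).standard_normal((4, 3, 8, 8)).astype(np.float32)
print(nan_sample_indices(fake_run(batch)), batch_matches_single(fake_run, batch))
```

Running the same two helpers with the CPU and CUDA providers in turn would show whether the NaN values are specific to one provider.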

Screenshots Single-image inference:

$ /opt/anaconda3/bin/python onnx_run.py 
onnx-runtime version: 1.6.0
onnx version: 1.6.0
[-0.44730666 -0.08703414 -0.19790478 -0.27072993  0.11931454  0.19091064
 -0.56726605  0.18068509  0.43425757  0.11039752] 0:00:01.421296

Batch inference:

$ /opt/anaconda3/bin/python onnx_run.py
onnx-runtime version: 1.6.0
onnx version: 1.6.0
(64, 3, 112, 112)
[nan nan nan nan nan nan nan nan nan nan] 0:00:01.136510

hariharans29 commented 3 years ago

Can you try upgrading to ORT 1.8.x?

stale[bot] commented 2 years ago

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.