Closed VinuthaRaghavendra closed 4 years ago
Prediction from onnx is same for all images
Can you try OnnxRuntime to run inference on the ONNX model instead of WinMLTools? This will tell if the difference is caused by inference engine.
Also, is there a link to the frozen TF model you are using, and the TF2ONNX command line you used to convert the model?
https://github.com/microsoft/onnxruntime
# Compute the prediction with ONNX Runtime
import onnxruntime as rt
import numpy
sess = rt.InferenceSession("rf_iris.onnx")
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name
pred_onx = sess.run([label_name], {input_name: X_test.astype(numpy.float32)})[0]
I tried the OnnxRuntime Python API as well; the predictions for all images are still the same.
Using code:
import numpy as np
import onnxruntime as rt
from PIL import Image, ImageDraw

sess = rt.InferenceSession("tiny_yolov2/model.onnx")
input_name = sess.get_inputs()[0].name
img = Image.open('test.jpg')
img = img.resize((832, 832))  # for tiny_yolov2
X = np.asarray(img)
X = X.transpose(2, 0, 1)
X = X.reshape(1, 3, 832, 832)
out = sess.run(None, {input_name: X.astype(np.float32)})
out = out[0][0]
numClasses = 2
anchors = [1.08, 1.19, 3.42, 4.41, 6.63, 11.38, 9.42, 5.11, 16.62, 10.52]

def sigmoid(x, derivative=False):
    return x * (1 - x) if derivative else 1 / (1 + np.exp(-x))

def softmax(x):
    scoreMatExp = np.exp(np.asarray(x))
    return scoreMatExp / scoreMatExp.sum(0)

clut = [(0,0,0),(255,0,0),(255,0,255),(0,0,255),(0,255,0),(0,255,128),
        (128,255,0),(128,128,0),(0,128,255),(128,0,128),
        (255,0,128),(128,0,255),(255,128,128),(128,255,128),(255,255,0),
        (255,128,128),(128,128,255),(255,128,128),(128,255,128)]
label = ["aeroplane", "tvmonitor"]
draw = ImageDraw.Draw(img)
for cy in range(0, 13):
    for cx in range(0, 13):
        for b in range(0, 5):
            # Each anchor box owns (numClasses + 5) consecutive channels.
            channel = b * (numClasses + 5)
            tx = out[channel    ][cy][cx]
            ty = out[channel + 1][cy][cx]
            tw = out[channel + 2][cy][cx]
            th = out[channel + 3][cy][cx]
            tc = out[channel + 4][cy][cx]
            # Decode the box centre and size in pixels (32 px per grid cell).
            x = (float(cx) + sigmoid(tx)) * 32
            y = (float(cy) + sigmoid(ty)) * 32
            w = np.exp(tw) * 32 * anchors[2 * b]
            h = np.exp(th) * 32 * anchors[2 * b + 1]
            confidence = sigmoid(tc)
            classes = np.zeros(numClasses)
            for c in range(0, numClasses):
                classes[c] = out[channel + 5 + c][cy][cx]
            classes = softmax(classes)
            detectedClass = classes.argmax()
            if 0.5 < classes[detectedClass] * confidence:
                color = clut[detectedClass]
                x = x - w / 2
                y = y - h / 2
                draw.line((x,     y,     x + w, y    ), fill=color)
                draw.line((x,     y,     x,     y + h), fill=color)
                draw.line((x + w, y,     x + w, y + h), fill=color)
                draw.line((x,     y + h, x + w, y + h), fill=color)
img.save("result.png")
Try this command line instead to generate the model. You should see different scores with different inputs (see below).
python -m tf2onnx.convert --input tiny-yolo-voc-3c.pb --inputs input:0 --outputs output:0 --opset 11 --output my.onnx
import sys
import onnxruntime as rt
import numpy as np
myfile = sys.argv[1]
sess = rt.InferenceSession(myfile)
input_name = sess.get_inputs()[0].name
output_name = sess.get_outputs()[0].name
shape = (1, 832, 832, 3)
sample = 255 * np.random.random(shape).astype(np.float32)
scores = sess.run([output_name], {input_name: sample})[0]
print ('scores1=\n',scores[0][0][0][:5])
sample = 255 * np.random.random(shape).astype(np.float32)
scores = sess.run([output_name], {input_name: sample })[0]
print ('scores2=\n',scores[0][0][0][:5])
scores1=
[-0.04436302 -0.02913633 0.09463985 0.34088922 -3.1391654 ]
scores2=
[-0.04425232 -0.02944296 0.09664 0.34297347 -3.1432114 ]
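To judge whether two runs really produce different outputs, it can help to quantify the gap rather than compare the printed numbers by eye. A minimal sketch using the five values printed above; the helper name `max_abs_diff` is introduced here for illustration and is not part of ONNX Runtime:

```python
import numpy as np

# First five output values from the two runs printed above.
scores1 = np.array([-0.04436302, -0.02913633, 0.09463985, 0.34088922, -3.1391654])
scores2 = np.array([-0.04425232, -0.02944296, 0.09664, 0.34297347, -3.1432114])

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two arrays."""
    return float(np.max(np.abs(np.asarray(a) - np.asarray(b))))

diff = max_abs_diff(scores1, scores2)
identical = bool(np.array_equal(scores1, scores2))
print("max abs diff:", diff, "identical:", identical)
```

The two runs are not identical, although the gap is small. That may simply reflect that both samples are uniform random noise with similar statistics, so a check on real, visibly different images is likely to be more informative.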
But now the predictions from the two graphs do not match. Can you please help?
One more doubt: the frozen graph exported from darkflow tiny-yolo-voc.weights gives accurate results when converted to .onnx, but the frozen graph exported from checkpoints and converted to .onnx gives completely mismatched predictions (the confidence score is always 99).
But frozen graph exported from checkpoints and converted to .onnx has complete mismatch
Tf2onnx has a --checkpoint parameter to load models from checkpoint format (instead of --input, which loads a frozen model). Did you use that argument to convert from checkpoint format, or something else?
If tf2onnx is able to convert a model, the model is typically accurate -- otherwise you'll see a glaring conversion error if it cannot handle it.
With the scores being equal to 99, one suspicion is that the input data is wrong -- can you double check the NCHW vs NHWC memory layout of your input vector? Another possibility is that the pixel values are out of range of valid values, which can produce extreme results for some models.
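Both suspicions can be checked before calling `sess.run`. The sketch below is a heuristic only: `describe_input` is a hypothetical helper (not an ONNX Runtime function), it assumes images have 1 or 3 channels, and the [0, 1] scaling at the end is an assumption about how the model was trained.

```python
import numpy as np

def describe_input(x):
    """Guess the layout of a 4-D image batch and report its value range.

    Heuristic only: assumes 1 or 3 channels and spatial dims larger than 3.
    """
    x = np.asarray(x)
    if x.ndim != 4:
        raise ValueError("expected a 4-D batch, got shape {}".format(x.shape))
    if x.shape[1] in (1, 3) and x.shape[3] not in (1, 3):
        layout = "NCHW"
    elif x.shape[3] in (1, 3) and x.shape[1] not in (1, 3):
        layout = "NHWC"
    else:
        layout = "ambiguous"
    return layout, float(x.min()), float(x.max())

# Hypothetical input: one 832x832 RGB image with raw 0..255 pixel values.
batch = np.random.randint(0, 256, size=(1, 832, 832, 3)).astype(np.float32)
layout, lo, hi = describe_input(batch)
print(layout, lo, hi)

# If the model was trained on [0, 1] inputs (an assumption; check the
# training pipeline), scale before running inference.
scaled = batch / 255.0
```

Comparing the reported layout with the shape returned by `sess.get_inputs()[0].shape` shows whether the array needs to be transposed first.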
Another option, have a look at "Method 2" here https://github.com/onnx/tensorflow-onnx/issues/729#issuecomment-552058866 . This will convert the model and also compare the outputs of TF and Onnx models for accuracy in an easy-to-reproduce way.
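Independently of the linked method, the comparison itself is simple once both frameworks have produced an output for the same input. A minimal sketch, assuming the two outputs are already available as arrays; `compare_outputs` and its tolerances are illustrative choices, not values taken from tf2onnx:

```python
import numpy as np

def compare_outputs(tf_out, onnx_out, rtol=1e-3, atol=1e-5):
    """Return (match, max_abs_diff) for two output arrays of equal shape."""
    tf_out = np.asarray(tf_out, dtype=np.float32)
    onnx_out = np.asarray(onnx_out, dtype=np.float32)
    if tf_out.shape != onnx_out.shape:
        raise ValueError(
            "shape mismatch: {} vs {}".format(tf_out.shape, onnx_out.shape))
    match = bool(np.allclose(tf_out, onnx_out, rtol=rtol, atol=atol))
    max_diff = float(np.max(np.abs(tf_out - onnx_out)))
    return match, max_diff

# Stand-in arrays; in practice these would come from the TF session and
# from onnxruntime's sess.run on the same input.
reference = np.linspace(-1.0, 1.0, 12, dtype=np.float32).reshape(1, 3, 4)
close = reference + 1e-6
far = reference + 0.5
print(compare_outputs(reference, close))
print(compare_outputs(reference, far))
```

A shape mismatch raised here is itself a useful hint, since it often points to an NCHW/NHWC difference between the two models.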
I have tried using --checkpoint parameter to freeze graph, but I get ValueError: Input 0 of node 0-convolutional_/cond_1/AssingMovingAvg/Switch was passed float from 0-convolutional/moving_mean:0 incompatible with expected float_ref
I have used the --inputs-as-nchw parameter to convert the input, and also applied normalization (the NaN error was resolved).
Can you please briefly explain how to convert NHWC to NCHW in Method 2?
For 1, "float_ref" usually occurs when there's a Variable in the checkpoint model file, and because Variables are not supported in Onnx, it's a halting condition for tf2onnx. You generally cannot proceed with the conversion unless you modify your TF model to avoid Variables in the frozen model.
For 3, there's no direct way to specify inputs_as_nchw in Method #2 (maybe that's useful to have in future), but you can follow the code path and pass it as an argument into the function process_tf_graph().
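A different route, which leaves the converted model untouched, is to reorder the input array yourself before feeding it. This is not the `inputs_as_nchw` argument described above, just plain NumPy; the helper names are made up for this sketch.

```python
import numpy as np

def nhwc_to_nchw(x):
    """Reorder a batch from (N, H, W, C) to (N, C, H, W)."""
    return np.ascontiguousarray(np.transpose(x, (0, 3, 1, 2)))

def nchw_to_nhwc(x):
    """Reorder a batch from (N, C, H, W) to (N, H, W, C)."""
    return np.ascontiguousarray(np.transpose(x, (0, 2, 3, 1)))

# Small stand-in batch: 1 image, 2x2 pixels, 3 channels.
nhwc = np.arange(12, dtype=np.float32).reshape(1, 2, 2, 3)
nchw = nhwc_to_nchw(nhwc)

print(nchw.shape)                      # (1, 3, 2, 2)
# A plain reshape keeps the memory order and scrambles the pixels.
wrong = nhwc.reshape(1, 3, 2, 2)
print(np.array_equal(nchw, wrong))     # False
```

The last two lines show why `reshape` alone is not a layout conversion: it produces the right shape but assigns values to the wrong channels.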
The drive.google.com link points to a top level folder called Inception_V2, but the folder is empty. Maybe the model needs to be shared as well?
The model converts successfully for me using the master branch, opset 11, and TF ver 1.14.
Can you compare your run command with the one below?
python -m tf2onnx.convert --graphdef frozen_inference_graph.pb --output model.onnx \
--fold_const --opset 11 --verbose \
--inputs image_tensor:0 \
--outputs num_detections:0,detection_boxes:0,detection_scores:0,detection_classes:0
...
2020-05-04 11:22:44.945899: I tensorflow/tools/graph_transforms/transform_graph.cc:317] Applying fold_old_batch_norms
2020-05-04 11:22:46,490 - INFO - tf2onnx: inputs: ['image_tensor:0']
2020-05-04 11:22:46,491 - INFO - tf2onnx: outputs: ['num_detections:0', 'detection_boxes:0', 'detection_scores:0', 'detection_classes:0']
2020-05-04 11:22:46,906 - INFO - tf2onnx.tfonnx: Using tensorflow=1.14.0, onnx=1.6.0, tf2onnx=1.6.0/82f805
2020-05-04 11:22:46,906 - INFO - tf2onnx.tfonnx: Using opset <onnx, 11>
...
2020-05-04 11:23:12,389 - INFO - tf2onnx.optimizer: After optimization: Cast -76 (265->189), Const -561 (993->432), Div -4 (29->25), Flatten -1 (2->1), Gather +2 (41->43), Identity -75 (76->1), Less -5 (26->21), Mul -3 (59->56), ReduceMean -1 (2->1), Shape -6 (54->48), Slice -7 (102->95), Squeeze -21 (129->108), Transpose -168 (192->24), Unsqueeze -50 (125->75)
2020-05-04 11:23:12,666 - INFO - tf2onnx:
2020-05-04 11:23:12,667 - INFO - tf2onnx: Successfully converted TensorFlow model frozen_inference_graph.pb to ONNX
2020-05-04 11:23:12,815 - INFO - tf2onnx: ONNX model is saved at model.onnx
But using the same command I get a ValueError:
File "/home/falconeye-ai/Documents/tensorflow-onnx-master/tf2onnx/onnx_opset/nn.py", line 451, in version_11
pads = node.inputs[1].get_tensor_value()
File "/home/falconeye-ai/Documents/tensorflow-onnx-master/tf2onnx/graph.py", line 260, in get_tensor_value
raise ValueError("get tensor value: {} must be Const".format(self.name))
ValueError: get tensor value: SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/map/while/PadOrClipBoxList/stack_2_Concat2432 must be Const
Traceback (most recent call last):
File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/falconeye-ai/Documents/tensorflow-onnx-master/tf2onnx/convert.py", line 161, in
Are you still using Python 3.5?
Python 3.5 is not supported by tensorflow-onnx. Can you re-try with Python 3.6 or 3.7?
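A quick way to confirm which interpreter is actually running the conversion is sketched below. The minimum version is taken from this thread only; newer tensorflow-onnx releases may have different requirements, and `is_supported` is a name made up for this example.

```python
import sys

# Minimum version named in this thread; newer releases may differ.
MIN_SUPPORTED = (3, 6)

def is_supported(version_info):
    """True if the (major, minor, ...) tuple is at least MIN_SUPPORTED."""
    return tuple(version_info[:2]) >= MIN_SUPPORTED

print(sys.version.split()[0], "supported:", is_supported(sys.version_info))
```

Running it with the same `python` executable used for `python -m tf2onnx.convert` avoids confusion when several interpreters are installed.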
Assuming this is no longer an issue, since the model above converted successfully in a test run. Reopen if needed.
Describe the bug
Prediction from ONNX is the same for all images (when a darkflow model is converted to ONNX).
To Reproduce
Darkflow (https://github.com/thtrieu/darkflow) is used for training.
Checkpoints are converted to ONNX using tf2onnx.
Predictions in C# are done using https://docs.microsoft.com/en-us/windows/ai/windows-ml/convert-model-winmltools