Closed Antoine-Prieur closed 2 years ago
Hi, @Antoine-Prieur. Was the visualization okay? In my testing, the result is not NaN.
Hello, thanks for your response,
I just tried, and the visualization is not okay either: it just shows some zeros with onnxruntime, but the right prediction with PyTorch.
Could it be because I don't have a GPU on the machine where I'm trying to convert the model? I did a lot of tests on my side, CPU only, with different opsets, torch versions, etc., and they always gave NaN values with the exact same messages during the conversion. Some of the operations in the graph may not be supported on CPU.
Did the error only show for SATRN, or for other models as well? SATRN uses no operator that only supports GPU. In fact, I ran the script you gave above successfully, and it used only the CPU.
I tried to convert CRNN, and it worked well. I also did it with a few detection models, which worked fine too.
I've just done a fresh install, following the exact versions in the installation guide, and I still have the same NaN issue when I inspect the given tensor; the visualization gives me zeros (it probably replaces NaN with zeros).
My guess is that I'm missing a CUDA/cuDNN dependency somewhere. I'm going to try the same thing on a GPU cluster to see if it works.
I tried on two other machines (with GPUs this time), and I always get the exact same problem. I also installed the dependencies to support onnx-optimizer (to remove the warning `WARNING - Can not optimize model, please build torchscipt extension.` that I had earlier). I made sure to use the exact same config file as satrn-config, and used the weights for satrn-small found in the text recognition models documentation.
Maybe I'm missing something in the setup of the project. I first created the Python venv (using conda) following exactly the installation guide for Linux. I installed onnxruntime, downloaded the Linux prebuilt binary package, and exported the paths to `ONNXRUNTIME_DIR` and `LD_LIBRARY_PATH`. To build with ort and ort optimization support, I used this command:

```shell
cmake -DCMAKE_CXX_COMPILER=g++-7 -DTorch_DIR=${Torch_DIR} -DMMDEPLOY_TARGET_BACKENDS="ort;torchscript" -DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR}
```

(with the `Torch_DIR` variable set with `export Torch_DIR=$(python -c "import torch;print(torch.utils.cmake_prefix_path + '/Torch')")`). Then I installed the project with `pip install -e .`, and MMOCR and its dependencies with:

```shell
pip install mmocr==0.4.1
pip install mmdet==2.20.0
```

I finally executed the script I sent earlier.
Okay, I will follow your steps later and check if there is any possible bug.
Hi, @Antoine-Prieur. The bug got fixed in the latest mmdeploy.
Hello, thank you very much for your time, I tried the fix and it works well. Have a good day !
@AllentDan @Antoine-Prieur did you try to run inference with a batch? I tried, and I think the result is not correct: from the 2nd input onward, decoding always gives the same word, "the".
```python
from onnxruntime import InferenceSession
import numpy as np

onnx_model = InferenceSession("models/textrecog/satrn_small/end2end.onnx")
inp = np.random.randn(3, 32, 100)
inps = np.array([inp for i in range(5)])  # batch of 5 identical inputs
test = onnx_model.run(input_feed={"input": inps.astype(np.float32)}, output_names=["output"])
ocr = test[0]

dictionary = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!\"#$%&'()*+,-./:;<=>?@[\\]_`~"
for i in range(ocr.shape[0]):
    # greedy decoding: pick the highest-scoring class at each time step
    max_indices = []
    for outer in range(ocr.shape[1]):
        character_index = -1
        character_value = 0  # note: assumes scores are non-negative
        for inner in range(ocr.shape[2]):
            value = ocr[i][outer][inner]
            if value > character_value:
                character_value = value
                character_index = inner
        max_indices.append(character_index)
    recognized = ""
    for max_index in max_indices:
        if max_index == len(dictionary):
            continue  # <UNK>
        if max_index == len(dictionary) + 1:
            break  # <EOS>
        recognized += dictionary[max_index]
    print("--->>>>>> recognized", recognized)
```
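The triple loop above can be collapsed with NumPy's `argmax`. A minimal sketch under the same assumptions (same dictionary, `len(dictionary)` as the `<UNK>` index and `len(dictionary) + 1` as `<EOS>` — these token positions are my assumption, not confirmed from the model config):

```python
import numpy as np

DICTIONARY = ("0123456789abcdefghijklmnopqrstuvwxyz"
              "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
              "!\"#$%&'()*+,-./:;<=>?@[\\]_`~")
UNK = len(DICTIONARY)      # assumed index of the <UNK> token
EOS = len(DICTIONARY) + 1  # assumed index of the <EOS> token

def greedy_decode(logits):
    """Decode a (batch, seq_len, num_classes) score array into strings."""
    results = []
    for sample in np.argmax(logits, axis=-1):  # (seq_len,) best index per step
        chars = []
        for idx in sample:
            if idx == EOS:
                break          # stop at end-of-sequence
            if idx != UNK:
                chars.append(DICTIONARY[idx])
        results.append("".join(chars))
    return results

# toy batch of 2 samples that each spell "ab" and then emit <EOS>
num_classes = len(DICTIONARY) + 2
logits = np.zeros((2, 4, num_classes), dtype=np.float32)
logits[:, 0, DICTIONARY.index("a")] = 1.0
logits[:, 1, DICTIONARY.index("b")] = 1.0
logits[:, 2, EOS] = 1.0
print(greedy_decode(logits))  # → ['ab', 'ab']
```

Unlike the explicit loop, `argmax` also behaves correctly when scores can be negative (e.g. raw logits), since it needs no initial "best value".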
The result I got:
--->>>>>> recognized newsgroups:
--->>>>>> recognized the
--->>>>>> recognized the
--->>>>>> recognized the
--->>>>>> recognized the
Hello @Phelan164, have you seen this issue: https://github.com/open-mmlab/mmdeploy/issues/791 ? I had a similar problem before; there was an issue with the triu function, and SATRN uses it. It's now fixed on master, but the release is not out yet.
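For context, one plausible way a mis-exported triu can produce NaNs: transformer decoders like SATRN's typically build a causal attention mask with `triu` and fill masked positions with `-inf` before softmax. If the exported mask is wrong and an entire row gets masked, the softmax becomes 0/0. This is an illustrative NumPy sketch of that failure mode, not the actual mmdeploy code:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically-stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

scores = np.random.randn(4, 4).astype(np.float32)

# correct causal mask: each position attends to itself and earlier positions
good_mask = np.triu(np.ones((4, 4), dtype=bool), k=1)  # True above the diagonal
good = softmax(np.where(good_mask, -np.inf, scores))
print(np.isnan(good).any())  # False: every row keeps at least one finite score

# broken mask: everything masked, so whole rows become -inf
bad_mask = np.ones((4, 4), dtype=bool)
bad = softmax(np.where(bad_mask, -np.inf, scores))
print(np.isnan(bad).any())   # True: exp(-inf) / sum(exp(-inf)) is 0/0
```

A NaN produced this deep in the decoder then propagates through every subsequent layer, which matches the all-NaN output tensor described above.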
@Antoine-Prieur thanks for your answer. So to use the new triu function, I need to build the new SDK (CPU + ONNXRuntime) following this for now?
Have you solved this problem? I have the same problem.
Hello, first, thanks for your work!
Describe the problem
I wanted to ask whether SATRN is compatible with ONNX, as the documentation says. I've tried to convert different versions of SATRN to ONNX; the conversion seems to work (with a few warnings), but when I test the model, it always gives me a tensor containing only NaN values. I've tried with the full version of the model, trained with a custom config, and also with the small version using the weights given in the MMOCR documentation and the default config.
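A quick way to confirm this failure mode is to count NaNs in the raw output before any visualization silently renders them as zeros. A minimal sketch (`report_nans` is a hypothetical helper, and the toy tensor below just stands in for the real ONNX output):

```python
import numpy as np

def report_nans(name, arr):
    """Count NaNs in a tensor, so a NaN->0 rendering step can't hide them."""
    arr = np.asarray(arr)
    n_nan = int(np.isnan(arr).sum())
    print(f"{name}: shape={arr.shape}, NaN count={n_nan}/{arr.size}")
    return n_nan

# toy tensor standing in for the all-NaN ONNX output described above
out = np.full((1, 25, 94), np.nan, dtype=np.float32)
report_nans("onnx output", out)  # reports that every value is NaN
```

Comparing this count between the PyTorch output and the onnxruntime output for the same input makes it unambiguous which backend introduces the NaNs.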
Reproduction
I used the following command to convert the model:
which gave me the following output:
And the following code to test the model:
The variable `test` contains:
Environment
Thanks a lot in advance!