Closed OscarPedaVendere closed 1 year ago
Hi @OscarPedaVendere, I replicated the issue using your model and got the same error. I also ran your ONNX model with benchmark_app and received this error:
RuntimeError: While validating ONNX node '<Node(Conv): res3a_branch2a>':
Check 'window_dilated_dim <= data_padded_dilated_dim' failed at C:\j\workspace\private-ci\ie\build-windows-vs2019@3\b\repos\openvino\src\core\shape_inference\include\convolution_shape_inference.hpp:217:
While validating node 'v1::Convolution Convolution_460 (re_lu_4/Relu:0[0]:f32{1,64,1,1}, res3a_branch2a_W_new[0]:f32{128,64,3,3}) -> (dynamic...)' with friendly_name 'Convolution_460':
Window after dilation has dimension (dim: 3) larger than the data shape after padding (dim: 2) at axis 0.
I checked your ONNX model using Netron and it looks crumpled:
Can you share the source of the original model? We will take a look at this to see if this MO error can be solved and if this model is supported.
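For readers following the error above: the failing check compares, per spatial axis, the kernel extent after dilation with the input extent after padding. Below is a minimal Python sketch of that arithmetic using the shapes from the log (spatial size 1, 3x3 kernel). The function names are made up for illustration, and the dilation of 1 and total padding of 1 are assumptions inferred from the reported dimensions 3 and 2, not values read from the model.

```python
def dilated_window(kernel: int, dilation: int = 1) -> int:
    """Extent covered by a kernel along one axis once dilation gaps are included."""
    return dilation * (kernel - 1) + 1


def padded_extent(data: int, pad_begin: int = 0, pad_end: int = 0) -> int:
    """Input extent along one axis after padding is added on both sides."""
    return data + pad_begin + pad_end


def conv_fits(data: int, kernel: int, dilation: int = 1,
              pad_begin: int = 0, pad_end: int = 0) -> bool:
    """True when the dilated window fits inside the padded input."""
    return dilated_window(kernel, dilation) <= padded_extent(data, pad_begin, pad_end)


# Shapes from the log: input f32{1,64,1,1}, weights f32{128,64,3,3}.
# Assumed: dilation 1 and a total padding of 1 (the log reports a padded dim of 2).
print(dilated_window(3))                          # 3
print(padded_extent(1, pad_begin=0, pad_end=1))   # 2
print(conv_fits(1, 3, pad_begin=0, pad_end=1))    # False, hence the error
```

In other words, the input to res3a_branch2a has already shrunk to 1x1 by the time the 3x3 convolution is reached, which points at the shapes upstream of that node rather than at the convolution itself.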
@zulkifli-halim please create a jira ticket once you've received the model and assign it to me
Hi. Thank you in advance for your support. I don't know why the model looks crumpled to you; I can open it successfully with an ONNX viewer web app: link to the image
However, here are two of the scripts that build the network (the network structure itself and the model builder class):
base_model.py.txt model_builder.py.txt
Thank you again for your support.
Hi @tomdol, I have created a JIRA ticket for this case.
Ref: 94180
Thank you very much. Just let me know if you have any updates on this any time soon.
Sorry, any updates on this?
Hey @OscarPedaVendere, sorry you have to wait. A fix is planned, but unfortunately it is in the queue because of our current roadmap and schedule. We will do this for sure. Stay tuned.
Ok thank you for your patience and your work. Looking forward to it :)
I've tested the model using onnxruntime and it failed with:
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from /home/mbencer/models/engine_file.onnx failed:This is an invalid model. In Node, ("tf_op_layer_Sum/Sum_reduce_min", ReduceSum, "", -1) : ("image_input": tensor(float),) -> ("tf_op_layer_Sum/Sum:0",) , Error Unrecognized attribute: axes for operator ReduceSum
also onnx checker:
import onnx
path = "/home/mbencer/models/engine_file.onnx"
onnx.checker.check_model(path)
failed for this model with:
File "/home/mbencer/venv/ov/lib/python3.8/site-packages/onnx/checker.py", line 97, in check_model
C.check_model_path(model)
onnx.onnx_cpp2py_export.checker.ValidationError: Unrecognized attribute: axes for operator ReduceSum==> Context: Bad node spec for node. Name: tf_op_layer_Sum/Sum_reduce_min OpType: ReduceSum
The cause is that axes is passed to ReduceSum as an attribute (which was the convention up to opset 11 - https://onnx.ai/onnx/operators/onnx__ReduceSum.html#reducesum-11), while the model is produced in opset 13 (where ReduceSum takes axes as an input - https://onnx.ai/onnx/operators/onnx__ReduceSum.html#reducesum-13).
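The mismatch described above can be spotted without running the full checker. This is a rough sketch, assuming the `onnx` Python package is installed. The helper names are hypothetical, and the helpers only read fields that an ONNX ModelProto exposes (`opset_import`, `graph.node`, `op_type`, `attribute`).

```python
def default_opset(model) -> int:
    """Return the opset version declared for the default ONNX domain (0 if absent)."""
    for entry in model.opset_import:
        if entry.domain in ("", "ai.onnx"):
            return entry.version
    return 0


def reducesum_with_axes_attribute(model):
    """Names of ReduceSum nodes that still carry `axes` as an attribute."""
    return [
        node.name
        for node in model.graph.node
        if node.op_type == "ReduceSum"
        and any(attr.name == "axes" for attr in node.attribute)
    ]


def report(path: str) -> None:
    """Load a model from disk and print whether it mixes opset >= 13 with attribute-style axes."""
    import onnx  # imported here so the helpers above work on any ModelProto-like object

    model = onnx.load(path)
    opset = default_opset(model)
    offenders = reducesum_with_axes_attribute(model)
    if opset >= 13 and offenders:
        print(f"opset {opset}: ReduceSum nodes with attribute-style axes: {offenders}")
    else:
        print(f"opset {opset}: no attribute-style ReduceSum axes problem found")
```

Calling `report("engine_file.onnx")` on a model like the one in this thread would be expected to list tf_op_layer_Sum/Sum_reduce_min, the node named in the checker error.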
@OscarPedaVendere Could you provide the script used for the TensorFlow-to-ONNX conversion?
@OscarPedaVendere The second problem (and, based on my experiments, the last one) is the HardSigmoid activation function used by LSTM:
We can add support for the new activation function to our LSTMSequence core op (currently the supported activation functions are: sigmoid, tanh, tanh), but it may not be quick (it's a new feature), so please consider whether another activation function is applicable here (at least as a temporary solution).
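To see which activations an exported model actually asks for, the LSTM nodes can be inspected directly. The sketch below is illustrative only: the function names are hypothetical, and it relies on the ONNX LSTM operator storing its activation names in a string-list attribute called `activations`.

```python
def lstm_activations(model):
    """Map each LSTM node name to its declared activation names.

    An empty list means the node does not set the attribute and
    therefore relies on the ONNX defaults.
    """
    found = {}
    for node in model.graph.node:
        if node.op_type != "LSTM":
            continue
        names = []
        for attr in node.attribute:
            if attr.name == "activations":
                # String-list attributes are stored as bytes in the protobuf.
                names = [raw.decode("utf-8") for raw in attr.strings]
        found[node.name] = names
    return found


def uses_hard_sigmoid(model) -> bool:
    """True if any LSTM node in the model lists HardSigmoid as an activation."""
    return any("HardSigmoid" in acts for acts in lstm_activations(model).values())


# Usage (requires the onnx package):
#   import onnx
#   model = onnx.load("model.onnx")
#   print(lstm_activations(model), uses_hard_sigmoid(model))
```

Running this after every export gives a quick yes/no answer on whether the HardSigmoid variant slipped back in.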
@mbencer Thank you for replying. I guess the LSTM problem can be solved one way or another, even though it could be a major issue, but I'll try to figure it out myself.
Both problems seem to come from the keras2onnx library, which is what I use to export the model. As for HardSigmoid, I don't think (but I'll check more carefully soon) that I specified a HardSigmoid function for the LSTM export. I guess it's the default behaviour?
The ReduceSum problem has the same cause. Here's part of the code devoted to the export: onnxexport.txt
I don't know how to fix this. Maybe by changing the keras2onnx library version, hoping that this doesn't cause too many dependency problems.
Am I being thorough enough with these replies? Thank you in advance
@OscarPedaVendere Thank you for the response. I think I have all the information I need (at least for now). I'll try to export the model from Keras on my side (and also check whether the version matters).
@OscarPedaVendere Could you also provide the model saved from Keras? For your model_builder.py script I don't have the parameters from the experiment_spec arguments, nor models.backbones and models.base_model.
@mbencer Thank you for your replies.
This model is part of a larger library that would not make sense to export as a whole. I've created a zip and ran a quick check that everything needed is there. It is not feasible for me at the moment to extract and check the whole library, but I guess this should be all right anyway. Here's the zip.
The experiment_spec is a class that you can initialize with the collections.namedtuple() function after reading the contents in the specs/arabic_spec.txt file. That should be it; let me know if you have some errors while loading the spec.
Hi @OscarPedaVendere, I've reproduced the conversion on my side and I think I have a solution. When I explicitly set the target opset version to 12, like:
keras_to_onnx(eval_model, "model.onnx", target_opset=12)
everything works. I've tested it with pip install tensorflow==1.15.5 keras==2.2.4 keras2onnx==1.7.0 onnx using Python 3.7.
In that opset version, axes can be passed to ReduceSum as an attribute, and LSTM is created with Sigmoid instead of the unsupported HardSigmoid.
Confirmed with direct inference via benchmark_app (./benchmark_app -m model.onnx --shape [1,3,48,96]) and with Model Optimizer (mo -w model.onnx --input_shape [1,3,48,96] --input image_input --output tf_op_layer_ArgMax).
Please let me know if this solution works for you.
@mbencer
OMG thank you so much! Thank you thank you. It works!
So all I had to do is (follow the main train CJ) set the target opset to 12... but I didn't know it was an option. Thank you so much.
So now that I have generated the IR I can use it for inference on dlstreamer, right? Are the weights of the model included as well, as I exported it from a checkpoint? If not, how do I train the model in openvino?
OpenVINO and DL-Streamer can use ONNX files directly for inference - just provide the path and filename of the ONNX file (as in the example above: ./benchmark_app -m model.onnx --shape [1,3,48,96]).
Using the Model Optimizer (MO), the ONNX file can be converted to IR format, which consists of an XML file (network topology) and a BIN file (weights).
So yes, you can use the IR files (or the ONNX file) with DL-Streamer; you also need to provide a model-proc JSON file.
Yes, the weights are saved in the produced ONNX model and also in the IR format (where *.xml contains the topology and *.bin the weights).
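Since an IR model is a pair of files, a missing or misplaced .bin is an easy mistake when copying the model to another machine. A small sketch of a sanity check follows, assuming the .bin sits next to the .xml with the same base name (which is how Model Optimizer writes them). The function name `ir_files` is hypothetical.

```python
from pathlib import Path


def ir_files(xml_path):
    """Return (xml, bin) paths of an IR model, raising if either file is missing."""
    xml = Path(xml_path)
    if xml.suffix != ".xml":
        raise ValueError(f"expected a .xml topology file, got: {xml.name}")
    # Assumption: the weights file shares the base name of the topology file.
    weights = xml.with_suffix(".bin")
    missing = [str(p) for p in (xml, weights) if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing IR file(s): {', '.join(missing)}")
    return xml, weights


# Usage:
#   xml, weights = ir_files("model.xml")
#   print(f"topology: {xml}, weights: {weights} ({weights.stat().st_size} bytes)")
```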
@OscarPedaVendere Please confirm if everything (especially this DL-Streamer part) works for you and if we can close the issue.
Thank you @mbencer @brmarkus for the info. The Model Optimizer part does work, so I am able to export an IR correctly from my custom ONNX model. The problem now is that DLStreamer perhaps doesn't accept it? It says Unsupported activation function; I don't know whether that has to do with the model itself or with my setup. I tried the OpenVINO samples and they all work. Should I close this issue and open one in the dlstreamer GitHub repository instead?
Output is this: log_output.txt
I am running the model via gst_launch on an image (I can't build a proper pipeline right now). The proper pipeline should be: RTSP input -> decode -> detect -> detect -> classify -> multiimagesink. My test pipeline is: image input -> convert to video -> classify -> fakesink. That should work too, but it doesn't.
The classifier is my converted IR model. (I don't know whether I have to use gvaclassify or gvadetect right now, but I tried both and it makes no difference.)
Thank you in advance for the patience, support and help you gave me in this thread.
Also, on a different machine, openvino_2022.2.0.7713 C++ benchmark_app built on Ubuntu 20.04 gives me Unsupported activation function.
./benchmark_app -m ./engine_file.onnx -shape [1,3,48,96]
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO: OpenVINO Runtime version ......... 2022.2.0
[ INFO ] Build ........... 2022.2.0-7713-af16ea1d79a-releases/2022/2
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] openvino_intel_cpu_plugin version ......... 2022.2.0
[ INFO ] Build ........... 2022.2.0-7713-af16ea1d79a-releases/2022/2
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[ WARNING ] Performance hint was not explicitly specified in command line. Device(CPU) performance hint will be set to THROUGHPUT.
[Step 4/11] Reading network files
[ INFO ] Loading network files
[ INFO ] Read network took 202.52 ms
[ INFO ] Original network I/O parameters:
Network inputs:
image_input (node: image_input) : f32 / [...]
Network outputs:
tf_op_layer_ArgMax (node: tf_op_layer_ArgMax) : i64 / [...]
tf_op_layer_Max (node: tf_op_layer_Max) : f32 / [...]
[Step 5/11] Resizing network to match image sizes and given batch
[ WARNING ] image_input: layout is not set explicitly, so it is defaulted to NCHW. It is STRONGLY recommended to set layout manually to avoid further issues.
[ INFO ] Reshaping network: 'image_input': {1,3,48,96}
[ INFO ] Reshape network took 5.85 ms
[Step 6/11] Configuring input of the model
[ INFO ] Network batch size: 1
Network inputs:
image_input (node: image_input) : u8 / [N,C,H,W]
Network outputs:
tf_op_layer_ArgMax (node: tf_op_layer_ArgMax) : i64 / [...]
tf_op_layer_Max (node: tf_op_layer_Max) : f32 / [...]
[Step 7/11] Loading the model to the device
[ ERROR ] Unsupported activation function
@OscarPedaVendere Are you sure that you are using the updated version of the model (exported with target opset 12)? I've tested the model both on the current master and on the af16ea1d79 version (from your message) and it works for me.
Could you upload the model from the current conversion?
Yes, I can confirm that the model exported with target opset 12 is processed correctly by the Model Optimizer, but it fails with benchmark_app on both the af16ea1d79 OpenVINO build and the dlstreamer/dlstreamer:devel Docker image on the fitlet2.
I successfully generated the .xml and .bin files but was not able to use them for inference with benchmark_app or with the GVADetect GStreamer plugin in dlstreamer.
Python 3.6.9 keras2onnx==1.7.0 tensorflow==1.15.5 keras==2.2.4
@OscarPedaVendere I can confirm that your exported model uses HardSigmoid as the activation function, while my version uses Sigmoid. I believe that is the result of the keras==2.2.4 used in my case. Could you try again with this keras version?
@mbencer I've checked, and the exported model does have the HardSigmoid activation function. However, I am exporting the model with target_opset=12 and I have the same library versions as you, except for onnx, which is 1.8.0, and Python, which is 3.6.9. I'm referring to my local export environment, which is a whole library inside a Docker container.
I've tried changing the onnx, keras and tensorflow versions, but that didn't help either. The model is still processed correctly by the Model Optimizer, though.
I don't know why, even with target opset 12, it exports the HardSigmoid function, and I still can't get it to export Sigmoid. Any suggestions on this?
Thank you in advance
@OscarPedaVendere I've also converted your model in a Docker environment, with the following config:
FROM python:3.7
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
    apt install -y software-properties-common
RUN apt update && apt install -y \
    git \
    build-essential \
    cmake \
    libtbb2 \
    gdb \
    python3-pip && \
    pip3 install --upgrade pip
RUN pip3 install -U tensorflow==1.15.5 keras==2.2.4 keras2onnx==1.7.0 onnx
COPY openvino_bugfix openvino_bugfix
ENV TF_KERAS=1
RUN python3 openvino_bugfix/model_builder.py
@OscarPedaVendere Please let me know if this configuration works for you
@mbencer I'm sorry, I've tried your config and also another method that exports during training of the model, but it still doesn't work... Even with target_opset=12 it doesn't export this *****ing HardSigmoid function as a simple Sigmoid.
I don't know how to better replicate your setup, but I think a Dockerfile like yours is hard to get wrong. Could you provide more details? Is there any chance that HardSigmoid will be supported in the future anyway?
Thanks in advance
Hi @OscarPedaVendere, I've checked the script again and you are right: HardSigmoid is still generated with this Dockerfile (I had a mess in my environment) - sorry for that. It should be:
RUN pip3 install -U numpy==1.18.5 tensorflow==2.2.0 keras==2.4.0 keras2onnx==1.7.0 onnx==1.12.0
I am uploading all the code to be sure that everything is the same (be aware, it's just a draft version) - export_onnx_model.zip
Please let me know if now it works correctly for you ;)
@OscarPedaVendere Can we close the ticket?
@mbencer Yes, now it works! Thank you so much for following up with me over these months! I can confirm it works with both Model Optimizer and benchmark_app. The activation is finally Sigmoid, and the output layer is a softmax in place of [tf_op_layer_Max, tf_op_layer_ArgMax], but I'll figure out by myself how to adapt the model to the OpenVINO environment.
Detailed description
Hello. I'm running dlstreamer/dlstreamer on a fitlet2 device (https://fit-iot.com/web/products/fitlet2/) running Ubuntu 22.04. My goal is to run my custom model, exported from TensorFlow (Keras backend) to ONNX format, and integrate it into the dlstreamer framework so that I can compile custom C++ code that uses GStreamer plugins and runs inference on this custom model. The model is exported directly from TensorFlow 1.15.5 running on an Nvidia GPU after training for about 62 epochs. There is a problem when parsing a certain convolution layer: the error says that the window after dilation is larger than the data shape after padding. Could you provide some help? What should I do now? Should I keep the model and change the export settings, or should I use another export technique such as a frozen model? Thank you in advance.
Steps to reproduce
Here is the link to the model: Model