onnx / onnxmltools

ONNXMLTools enables conversion of models to ONNX
https://onnx.ai
Apache License 2.0

Conversion from caffe to onnx through coreml produces paddings with -3.4e+38 value. #402

Open bwery opened 4 years ago

bwery commented 4 years ago

Hello,

I am trying to convert Caffe models to ONNX using the path that is documented here: converting to Core ML first, then converting the Core ML model to ONNX.

In the ONNX file for some models, padding layers are created whose padding value is set to a constant of -3.4e+38. This padding value is visible when examining the ONNX file with Netron, while I do not see anything similar in the Core ML model. That is why I think it is produced during the ONNX conversion, but I may be wrong.
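For reference, the suspicious constant matches the most negative finite float32 value, which converters commonly use as the pad fill for max pooling so that padded cells can never win the max comparison. A quick check (assuming NumPy is available):

```python
import numpy as np

# -3.4e+38 is roughly the most negative finite float32 (-FLT_MAX).
# Converters commonly use it as the pad fill for max pooling so that
# padded cells can never win the max comparison.
print(np.finfo(np.float32).min)  # -3.4028235e+38
```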

I don't know if this is a bug, a limitation, or something I am doing wrong.

The network I used is the AlexNet network provided in DIGITS, with very few modifications. It is difficult to attach because it has a lot of parameters.

I discovered this error because of the errors generated when I try to use the model with TensorRT, which does not accept non-zero padding values.

Maybe there is a way to force this padding to 0, but I cannot find how to do it.
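One possible workaround is to post-process the converted model and zero out the fill value of every Pad node. This is only a sketch, not part of onnxmltools: the function name and the threshold test are mine, and it assumes the opset < 11 form of Pad, where the fill value is a float attribute named "value".

```python
# Hypothetical post-processing workaround (not onnxmltools code):
# rewrite every Pad node's constant fill value to 0 after conversion,
# so that strict consumers such as TensorRT will accept the model.
def zero_pad_values(model):
    """Zero the fill value of each Pad node in an onnx.ModelProto.

    Returns the number of attributes patched.
    """
    patched = 0
    for node in model.graph.node:
        if node.op_type != "Pad":
            continue
        for attr in node.attribute:
            # -3.4e+38 is roughly -FLT_MAX; anything that negative is
            # the max-pooling sentinel, not a value anyone asked for.
            if attr.name == "value" and attr.f < -3.0e38:
                attr.f = 0.0
                patched += 1
    return patched
```

After conversion, something like `model = onnx.load("model.onnx")`, `zero_pad_values(model)`, `onnx.save(model, "patched.onnx")` should then produce a file that TensorRT accepts.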

My environment: Ubuntu 18.04 in an LXC container, Python 3.6.9, a Caffe model produced with the latest BVLC release, coremltools 3.4, onnxmltools 1.7.0.

I tried to generate with opset=7 and opset=10.

bwery commented 4 years ago

I have just created a small network derived from the faulty one by removing some layers and reducing their sizes to very small values.

This network is of course not very useful, but it demonstrates the error.

In the attached file you can find the Caffe, Core ML, and ONNX files.

Classification.zip

bwery commented 4 years ago

Hello again.

Thank you for taking this into account.

I have tried the master branch. The padding size array is now filled properly with zeroes. Nevertheless, the associated padding value is still -3.4e+38.

In principle this should not cause any trouble, except that TensorRT, which is my target, rejects the Pad operator if the value is not zero... so the generated file still cannot be loaded as is.

As a trial (just a trial, because this is probably not the right solution), I removed the conditional expression on line 252 of convert/coreml/operator_converters/neural_network/Pool.py, making "padded_value" always 0.

This solves the issue with TensorRT.

Maybe this should be considered a TensorRT issue rather than an onnxmltools one, but changing the behaviour to emit a padding value of 0 when the padding sizes are all 0 would avoid a lot of headaches!
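A minimal sketch of that suggested behaviour (hypothetical names, not actual onnxmltools code): keep the -FLT_MAX fill only when a max pool really adds padding, and emit 0 otherwise.

```python
FLT_MIN_SENTINEL = -3.402823e+38  # roughly -FLT_MAX, the identity for max

def pick_pad_value(pads, is_max_pool):
    # With no actual padding, the fill value is never read, so 0.0 is
    # safe and keeps strict consumers such as TensorRT happy.
    if not is_max_pool or all(p == 0 for p in pads):
        return 0.0
    return FLT_MIN_SENTINEL
```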

Note that my current target is TensorRT 7.0.0.11 for Windows.