onnx / onnx-tensorflow

Tensorflow Backend for ONNX

Problem with PReLU conversion, bad parameter mapping #482

Open nhaduong opened 5 years ago

nhaduong commented 5 years ago

Describe the bug

I'm trying to convert EdgeNets pretrained models to TensorFlow (with ONNX as an intermediary) for use in Unity. EdgeNets uses PyTorch's PReLU activation function, which should map to tf.keras.layers.PReLU.

The converter maps the PReLU parameters incorrectly, resulting in the error:

ValueError: Dimensions must be equal, but are 32 and 112 for 'mul' (op: 'Mul') with input shapes: [32], [1,32,112,112]
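
The shapes in the message suggest that a per-channel slope of shape [32] is being multiplied against an NCHW activation of shape [1, 32, 112, 112]. TensorFlow follows NumPy-style broadcasting, which aligns shapes from the trailing dimension, so the 32 is compared with the width (112) rather than with the channel axis. Below is a minimal sketch of that rule in plain Python. `broadcast_shape` is a hypothetical helper written for illustration, not part of onnx-tf or TensorFlow.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast result of two shapes, aligning from the last axis.

    Hypothetical helper mirroring NumPy-style broadcasting rules.
    """
    result = []
    # Walk both shapes from the trailing dimension; missing dims count as 1.
    for da, db in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if da != db and da != 1 and db != 1:
            raise ValueError(f"Dimensions must be equal, but are {da} and {db}")
        result.append(max(da, db))
    return tuple(reversed(result))


x_shape = (1, 32, 112, 112)  # NCHW activation, as in the error message

# A slope of shape (32,) lines up with the width axis (112) and fails.
try:
    broadcast_shape((32,), x_shape)
    failed = False
except ValueError as err:
    failed = True
    message = str(err)

# Reshaped to (32, 1, 1), the slope lines up with the channel axis.
fixed = broadcast_shape((32, 1, 1), x_shape)
```

Under these assumptions, the slope has to be given trailing singleton dimensions (or be otherwise aligned with axis 1) before it can be multiplied with the activation.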

To Reproduce

Run the following:

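import numpy as np
import onnx
from onnx_tf.backend import prepare

onnx_model = onnx.load("espnetv2_s_0.5_imsize_224x224_imagenet.onnx")  # load the ONNX model
# Assumption: a random NCHW batch matching the 224x224 size in the file name;
# the original report did not show how `input` was defined.
input = np.random.randn(1, 3, 224, 224).astype(np.float32)
output = prepare(onnx_model).run(input)  # fails inside prepare() with the traceback below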
/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/common/handler_helper.py:37: UserWarning: Unknown op ConstantFill in domain `ai.onnx`.
  handler.ONNX_OP, handler.DOMAIN or "ai.onnx"))
/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/common/handler_helper.py:37: UserWarning: Unknown op ImageScaler in domain `ai.onnx`.
  handler.ONNX_OP, handler.DOMAIN or "ai.onnx"))
/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/common/handler_helper.py:34: UserWarning: Fail to get since_version of IsInf in domain `` with max_inclusive_version=9. Set to 1.
  handler.ONNX_OP, handler.DOMAIN, version))
/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/common/handler_helper.py:34: UserWarning: Fail to get since_version of Mod in domain `` with max_inclusive_version=9. Set to 1.
  handler.ONNX_OP, handler.DOMAIN, version))
/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/common/handler_helper.py:34: UserWarning: Fail to get since_version of ThresholdedRelu in domain `` with max_inclusive_version=9. Set to 1.
  handler.ONNX_OP, handler.DOMAIN, version))
2019-08-14 16:41:28.824186: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Traceback (most recent call last):
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1864, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimensions must be equal, but are 16 and 112 for 'mul' (op: 'Mul') with input shapes: [16], [1,16,112,112].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/backend.py", line 55, in prepare
    return cls.onnx_model_to_tensorflow_rep(model, strict)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/backend.py", line 75, in onnx_model_to_tensorflow_rep
    return cls._onnx_graph_to_tensorflow_rep(model.graph, opset_import, strict)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/backend.py", line 129, in _onnx_graph_to_tensorflow_rep
    onnx_node, tensor_dict, handlers, opset=opset, strict=strict)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/backend.py", line 224, in _onnx_node_to_tensorflow_op
    return handler.handle(node, tensor_dict=tensor_dict, strict=strict)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/handlers/handler.py", line 59, in handle
    return ver_handle(node, **kwargs)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/handlers/backend/p_relu.py", line 38, in version_9
    return cls._common(node, **kwargs)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/handlers/backend/p_relu.py", line 21, in _common
    neg = slope * (x - abs(x)) * 0.5
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 884, in binary_op_wrapper
    return func(x, y, name=name)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 1180, in _mul_dispatch
    return gen_math_ops.mul(x, y, name=name)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 6490, in mul
    "Mul", x=x, y=y, name=name)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3616, in create_op
    op_def=op_def)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2027, in __init__
    control_input_ops)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1867, in _create_c_op
    raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 16 and 112 for 'mul' (op: 'Mul') with input shapes: [16], [1,16,112,112].

ONNX model file

Any of the ONNX models in the repo reproduces the problem; here is a direct link to one: https://github.com/sacmehta/EdgeNets/blob/master/onnx_models/classification/espnetv2/espnetv2_s_0.5_imsize_224x224_imagenet.onnx

fumihwh commented 5 years ago

@nhaduong Please try changing line 19 of onnx_tf/handlers/backend/p_relu.py to: slope = BroadcastMixin.explicit_broadcast([x, tensor_dict[node.inputs[1]]], 1)
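
For context, the suggested change appears to align the slope with axis 1 (the channel axis of an NCHW tensor) before the multiplication shown in the traceback. The following is a rough sketch of the intended arithmetic in plain Python, assuming one slope value per channel and assuming the positive part is relu(x). `prelu_nchw` is a hypothetical name; the real handler operates on TensorFlow tensors.

```python
def prelu_nchw(x, slope):
    """Apply PReLU to a nested list shaped [N][C][H][W].

    slope holds one value per channel, i.e. it is indexed along axis 1.
    Hypothetical helper for illustration only.
    """
    out = []
    for sample in x:
        channels = []
        for c, plane in enumerate(sample):
            # pos + neg, where pos = relu(x) (assumed) and
            # neg = slope * (x - abs(x)) * 0.5 as in the traceback.
            channels.append([[max(v, 0.0) + slope[c] * (v - abs(v)) * 0.5
                              for v in row] for row in plane])
        out.append(channels)
    return out


# One sample, two channels, 1x2 spatial size.
x = [[[[-2.0, 4.0]], [[-8.0, 6.0]]]]
slope = [0.5, 0.25]
y = prelu_nchw(x, slope)
```

Each negative input is scaled by the slope of its own channel, while positive inputs pass through unchanged.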

nhaduong commented 5 years ago

@fumihwh Thank you for the response. There is now a new error: ValueError: strides > 1 not supported in conjunction with dilation_rate > 1

>>> output = prepare(onnx_model).run(input)
/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/common/handler_helper.py:37: UserWarning: Unknown op ConstantFill in domain `ai.onnx`.
  handler.ONNX_OP, handler.DOMAIN or "ai.onnx"))
/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/common/handler_helper.py:37: UserWarning: Unknown op ImageScaler in domain `ai.onnx`.
  handler.ONNX_OP, handler.DOMAIN or "ai.onnx"))
/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/common/handler_helper.py:34: UserWarning: Fail to get since_version of IsInf in domain `` with max_inclusive_version=9. Set to 1.
  handler.ONNX_OP, handler.DOMAIN, version))
/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/common/handler_helper.py:34: UserWarning: Fail to get since_version of Mod in domain `` with max_inclusive_version=9. Set to 1.
  handler.ONNX_OP, handler.DOMAIN, version))
/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/common/handler_helper.py:34: UserWarning: Fail to get since_version of ThresholdedRelu in domain `` with max_inclusive_version=9. Set to 1.
  handler.ONNX_OP, handler.DOMAIN, version))
2019-08-16 09:05:30.791347: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/backend.py", line 55, in prepare
    return cls.onnx_model_to_tensorflow_rep(model, strict)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/backend.py", line 75, in onnx_model_to_tensorflow_rep
    return cls._onnx_graph_to_tensorflow_rep(model.graph, opset_import, strict)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/backend.py", line 129, in _onnx_graph_to_tensorflow_rep
    onnx_node, tensor_dict, handlers, opset=opset, strict=strict)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/backend.py", line 224, in _onnx_node_to_tensorflow_op
    return handler.handle(node, tensor_dict=tensor_dict, strict=strict)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/handlers/handler.py", line 59, in handle
    return ver_handle(node, **kwargs)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/handlers/backend/conv.py", line 11, in version_1
    return cls.conv(node, kwargs["tensor_dict"])
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/handlers/backend/conv_mixin.py", line 150, in conv
    for (x, weight) in zip(xs, weight_groups)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/onnx_tf/handlers/backend/conv_mixin.py", line 150, in <listcomp>
    for (x, weight) in zip(xs, weight_groups)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 894, in convolution
    name=name)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 987, in convolution_internal
    data_format=data_format)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 1053, in __init__
    num_spatial_dims, strides, dilation_rate)
  File "/anaconda3/envs/tf-onnx/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 756, in _get_strides_and_dilation_rate
    "strides > 1 not supported in conjunction with dilation_rate > 1")
ValueError: strides > 1 not supported in conjunction with dilation_rate > 1
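
This second error is raised by TensorFlow's tf.nn.convolution (see the last frames of the traceback), which in this version rejects a convolution that has both a stride and a dilation rate greater than 1. One possible workaround, offered only as an illustration and not as what onnx-tf does, is to run the dilated convolution with stride 1 and then subsample the output. The plain-Python 1-D sketch below checks that equivalence for the unpadded case. `dilated_conv1d` is a hypothetical helper.

```python
def dilated_conv1d(x, w, stride=1, dilation=1):
    """Unpadded 1-D cross-correlation with stride and dilation.

    Hypothetical helper for illustration; not a TensorFlow or onnx-tf API.
    """
    span = (len(w) - 1) * dilation + 1  # receptive field of the dilated kernel
    return [
        sum(w[k] * x[start + k * dilation] for k in range(len(w)))
        for start in range(0, len(x) - span + 1, stride)
    ]


x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
w = [1, 2, 3]

# Direct computation with stride 2 and dilation 2.
direct = dilated_conv1d(x, w, stride=2, dilation=2)

# Same result: stride 1 with dilation 2, then keep every second output.
subsampled = dilated_conv1d(x, w, stride=1, dilation=2)[::2]
```

The subsampling route computes more outputs than it keeps, and padded convolutions need extra care so that the retained positions match.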

tfygg commented 5 years ago

Quoting @nhaduong: "@fumihwh Thank you for the response. There is now a new error: ValueError: strides > 1 not supported in conjunction with dilation_rate > 1"

I tested it and it works for me; maybe you should check your code.