Open derekelkins opened 2 years ago
I do have the same problem:
ValueError: Dimensions must be equal, but are 4096 and 128 for '{{node Add_101}} = Add[T=DT_FLOAT](concat_22/concat, Add_101/y)' with input shapes: [?,4,?,4096], [128].
This happens when some layers contain a reshape operation. At first I successfully replaced the reshape operation with split + concat, and that worked, but after some alterations to the model the problem came back.
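The mismatch in that message can be checked outside TensorFlow. The following is a minimal NumPy sketch, assuming arbitrary sizes (2 and 3) for the unknown `?` dimensions. The helper name `broadcast_compatible` is made up for illustration. It only tests whether the two shapes from the error can be broadcast together, and says nothing about where the `[128]` operand comes from in the model.

```python
import numpy as np

# Shapes from the error message; 2 and 3 stand in for the unknown "?" dims.
lhs_shape = (2, 4, 3, 4096)
rhs_shape = (128,)


def broadcast_compatible(a, b):
    """Return True if shapes a and b can be broadcast together."""
    try:
        np.broadcast_shapes(a, b)  # available in NumPy 1.20 and later
        return True
    except ValueError:
        return False


# Broadcasting compares trailing dimensions first: 4096 vs 128 do not match.
as_reported = broadcast_compatible(lhs_shape, rhs_shape)
# If the last dimension were 128, the same addition would be accepted.
last_dim_128 = broadcast_compatible((2, 4, 3, 128), rhs_shape)

print(as_reported, last_dim_128)  # prints: False True
```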
I tested the conversion with my own tool, onnx2tf, which I created as an alternative to onnx-tensorflow, and it successfully converted `QLinearConv` with a 1-D bias. I am only interested in model conversion, not in checking for accuracy degradation.
https://github.com/PINTO0309/onnx2tf
onnx2tf -i qlinear_conv_tensor_test.onnx
Converted tflite https://github.com/PINTO0309/onnx2tf/releases/tag/1.1.20
Describe the bug
When `QLinearConv` is used with a 1-D bias, it is expecting the wrong shape. Specifically, given a 1x3x224x224 `x` input (NxCxHxW), a 64x3x7x7 `w` input (MxCxkHxkW) and a 64 `B` input (M) (with strides=[2,2] producing a 1x64x112x112 output), it fails with one of the following two assertions depending on whether `w_scale` is a scalar or a 1-D tensor (of length 64):
`w_scale` a scalar:
`w_scale` a 1-D tensor:
To Reproduce
Run either of the attached models with a 1x3x224x224 input, e.g. with
ONNX model file
qlinear_conv_test_cases.zip
Python, ONNX, ONNX-TF, Tensorflow version
tensorflow-cpu 2.8.0 (on MacOSX)
Additional context
Per the ONNX spec, `QLinearConv` takes in an `x` input with dimensions NxCxHxW and a `w` input with dimensions MxCxkHxkW (group=1 in this case). `w_scale` and `w_zero_point` can either be scalars or 1-D tensors of length M. `B` is an optional input but must be a 1-D tensor of length M if provided.
Looking at the code, the `w_scale` parameter in the scalar case gets converted to a 1-D tensor using `x.shape[1]`, which presumably would be C, which is 3. I would expect something like `w.shape[0]`. The same should be used for `w_zero_point` a few lines above. If `w_scale` is a 1-D tensor with length 64 (M), then the failure occurs when the bias is added to the output. At this point `y` is 1x64x112x112 and `B` has length 64. The line of code presumably relies on broadcasting rules that would work when `B` is a scalar, but I can't imagine it would ever do the right thing when `B` is not a scalar.
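Both shape problems can be illustrated with plain NumPy, independent of onnx-tf. This is only a sketch under stated assumptions: the helper names `expand_per_channel` and `add_bias_nchw` are hypothetical and are not taken from the onnx-tf source, and the shapes are the ones from this report. It shows a scalar parameter being expanded to length M via `w.shape[0]`, and a length-M bias being reshaped so that it broadcasts along the channel axis of an NCHW output.

```python
import numpy as np

N, C = 1, 3            # x input is NxCxHxW
M, kH, kW = 64, 7, 7   # w input is MxCxkHxkW


def expand_per_channel(param, w_shape):
    """Hypothetical helper: expand a scalar or length-M parameter to shape (M,)."""
    m = w_shape[0]  # number of output channels: w.shape[0], not x.shape[1]
    param = np.asarray(param, dtype=np.float32)
    if param.ndim == 0:
        return np.full((m,), param, dtype=np.float32)
    if param.shape != (m,):
        raise ValueError(f"expected a scalar or shape ({m},), got {param.shape}")
    return param


def add_bias_nchw(y, bias):
    """Hypothetical helper: add a length-M bias to an NxMxHxW output."""
    # Reshape to (1, M, 1, 1) so the bias lines up with the channel axis.
    return y + np.reshape(bias, (1, -1, 1, 1))


w_shape = (M, C, kH, kW)
w_scale = expand_per_channel(0.5, w_shape)  # scalar -> shape (64,)

y = np.zeros((N, M, 112, 112), dtype=np.float32)
B = np.arange(M, dtype=np.float32)

# A plain `y + B` aligns B with the last axis (112), so it fails for M = 64.
try:
    y + B
    naive_ok = True
except ValueError:
    naive_ok = False

out = add_bias_nchw(y, B)  # shape stays (1, 64, 112, 112)
```

With a scalar `B` the plain addition would broadcast without error, which is consistent with the guess above about why the existing line was written that way.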