Unfortunately this does not convert to the EdgeTPU as the op now appears as PadV2 (unsupported). There must be a better way.
I manually hacked in a ZeroPad2d and implemented the op_pad functionality using Keras zero padding (as was already done for MaxPool), and managed to get an accurate model with every op converting successfully to Edge TPU. I'm not sure how to do this the correct way in onnx-to-keras; a sketch of the idea follows below.
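For reference, the core of the hack is just routing the pad through Keras' ZeroPadding2D, which (as far as I can tell) lowers to a plain Pad op that the EdgeTPU compiler accepts, rather than PadV2. A minimal sketch; the handler name and signature are hypothetical, not the actual onnx-to-keras hook:

```python
import tensorflow as tf

# Hypothetical handler, NHWC layout; the real onnx-to-keras op_pad
# receives ONNX-style pad lists rather than four scalars.
def pad_via_zero_padding(x, top, bottom, left, right):
    # ZeroPadding2D pads H and W with zeros and converts to a plain
    # Pad op, avoiding the unsupported PadV2.
    return tf.keras.layers.ZeroPadding2D(((top, bottom), (left, right)))(x)
```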
What I did that actually works on the EdgeTPU:
Manually edited the TorchScript to remove the padding from every max_pool2d and insert a constant_pad_nd before it:
```python
# Before:
input20 = torch.max_pool2d(input10, [3, 3], [2, 2], 1)

# After:
input10pad = torch.constant_pad_nd(input10, [1, 1, 1, 1], 0.)
input20 = torch.max_pool2d(input10pad, [3, 3], [2, 2])
```
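For what it's worth, a quick equivalence check (a minimal sketch, assuming the pooled input is non-negative, e.g. straight out of a ReLU):

```python
import torch

x = torch.relu(torch.randn(1, 8, 16, 16))  # non-negative, e.g. post-ReLU

ref = torch.max_pool2d(x, [3, 3], [2, 2], 1)        # implicit padding
x_pad = torch.constant_pad_nd(x, [1, 1, 1, 1], 0.)  # explicit zero pad
out = torch.max_pool2d(x_pad, [3, 3], [2, 2])       # no padding

print(torch.equal(ref, out))  # True for non-negative inputs
```

For inputs that can go negative the two differ, which is where the pad-value discussion further down comes from.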
Now when you convert to ONNX, you'll get a lot of weird stuff that looks like this:
(Screenshots: the ONNX graph before and after running onnxsim.)
and onnx2keras will ask for op_constantofshape. If you run python -m onnxsim, that whole branch will just disappear.
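The same simplification can also be done from Python if that's easier to wire into a conversion script (file names here are illustrative):

```python
import onnx
from onnxsim import simplify  # pip install onnx-simplifier

model = onnx.load("model.onnx")
model_simp, ok = simplify(model)
assert ok, "simplified model failed the ONNX checker"
onnx.save(model_simp, "model_simplified.onnx")  # ConstantOfShape branch folded away
```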
Next up, we get an exception while working out what op we have:
```
x = getattr(self, 'op_' + op_type.lower())(*inputs, **attrs)
TypeError: op_pad() got multiple values for argument 'mode'
```
To get around this, I fixed op_pad as in PR #34.
Now that #34 has been merged, this PR is valid.
The issue still occurs and this PR fixes it, but it doesn't look good. If someone can explain why it works, that would be nice.
What exactly doesn't look good? The extra nodes in the onnx model?
The number -127.5 doesn't look good. I can't explain why it works.
Edit: Specifically, 0 doesn't work even though it's a zero pad. -1 works, but I found the best accuracy with -127.5. I assume torch, ONNX and TensorFlow differ in how they implement padding, but I haven't taken an in-depth look at the code. You can see this result with the test I've added.
Any update on this? I'm using this magic number (-127.5) in production and am happy with the end result.
I think this will work for now. But I think I found the reason why -127.5 seems to work better than 0: PyTorch's built-in max_pool2d padding is effectively negative-infinity padding, so a padded cell can never win the max. With an explicit constant pad, the inaccuracy occurs whenever the padding value is higher than the real values in a given kernel window: the maxpool then selects the padding value rather than the values from the input tensor.
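A tiny example makes the failure mode concrete (a minimal sketch; -127.5 stands in for any constant below the input's value range):

```python
import torch

# All-negative 2x2 input: the true max in every window is negative.
x = torch.tensor([[[[-3., -5.],
                    [-4., -6.]]]])

ref = torch.max_pool2d(x, [3, 3], [2, 2], 1)  # implicit -inf padding

# Explicit zero pad: the pad value (0) beats every real value,
# so the pooled output is wrongly 0.
pad0 = torch.max_pool2d(torch.constant_pad_nd(x, [1, 1, 1, 1], 0.), [3, 3], [2, 2])

# A pad below the input range can never win, so it matches the reference.
padneg = torch.max_pool2d(torch.constant_pad_nd(x, [1, 1, 1, 1], -127.5), [3, 3], [2, 2])

print(ref.item(), pad0.item(), padneg.item())  # -3.0  0.0  -3.0
```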
Thanks. That makes sense.
Fix accuracy issues
Fixes #31