AxisCommunications / onnx-to-keras

Convert onnx models exported from pytorch to tensorflow keras models with focus on performace and highleve compatibility.
MIT License
25 stars 13 forks source link

Maxpool padding #32

Closed xsacha closed 2 years ago

xsacha commented 3 years ago

Fix accuracy issues

Fixes #31

xsacha commented 3 years ago

Unfortunately this does not convert to the EdgeTPU as the op now appears as PadV2 (unsupported). There must be a better way.

I manually hacked in a ZeroPad2d and implemented the op_pad functionality using the Keras Zero Pad (as it was in MaxPool) and managed to get an accurate model that converted all OPs successfully to Edge TPU. I'm not how to do this in the correct way in onnx-to-keras.

What I did that actually works on the EdgeTPU:

x = getattr(self, 'op_' + op_type.lower())(*inputs, **attrs) TypeError: op_pad() got multiple values for argument 'mode'

To get around this, I fixed op_pad as in PR #34

xsacha commented 2 years ago

Due to #34 being merged, this PR is now valid.

The issue still occurs and this PR fixes it but it doesn't look good. If someone can explain why it would be nice.

95hali74 commented 2 years ago

What exactly doesn't look good? The extra nodes in the onnx model?

xsacha commented 2 years ago

The number -127.5 doesn't look good. I can't explain why it works.

Edit: Specifically, 0 doesn't work even though it's a ZeroPad. Negative 1 works but I found the best accuracy with -127.5. I assume torch, onnx and Tensorflow differ in how they implement padding but haven't taken an in-depth look at the code. You can see this result with the test I have added.

xsacha commented 2 years ago

Any update on this? I'm using this magic number (-127.5) in production and am happy with the end result.

95hali74 commented 2 years ago

I think this will work for now. But I think I found the reason why -127.5 seems to work better than 0. The inaccuracy occurs when the padding value is higher than the other values in the input tensor for a given kernel window. The maxpool op will then select the padding value rather than the other values in the input tensor.

xsacha commented 2 years ago

Thanks. That makes sense.