AxisCommunications / onnx-to-keras

Convert onnx models exported from pytorch to tensorflow keras models with focus on performance and high-level compatibility.
MIT License

Accuracy issue with MaxPool padding #31

Closed xsacha closed 2 years ago

xsacha commented 3 years ago
```python
def test_padding(self):
    x = np.random.rand(1, 3, 112, 112).astype(np.float32)
    model = torch.nn.Sequential(
        torch.nn.Conv2d(3, 64, (3, 3), 1, 1),
        torch.nn.MaxPool2d(3, 2, 1),
    )
    convert_and_compare_output(model, x)
```

I isolated accuracy issues in a model to MaxPool being used with padding. The example above is a minimal reproduction that fails a tolerance of 1e-3.

The problem gets worse with larger models that contain multiple such layers: my large models with over 100 layers fail the accuracy comparison completely because of this padding issue.
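The likely culprit is the pad value rather than the pooling itself. A minimal NumPy sketch (my own illustration, not code from the converter) shows how zero-padding before a max pool corrupts the result whenever the real activations are negative, since the injected zeros win the max:

```python
import numpy as np

def maxpool2d(x, k=3, s=2, pad_value=0.0):
    """Naive 2-D max pool with 1-pixel padding, for illustration only."""
    p = np.full((x.shape[0] + 2, x.shape[1] + 2), pad_value, dtype=x.dtype)
    p[1:-1, 1:-1] = x
    h = (p.shape[0] - k) // s + 1
    w = (p.shape[1] - k) // s + 1
    out = np.empty((h, w), dtype=x.dtype)
    for i in range(h):
        for j in range(w):
            out[i, j] = p[i * s:i * s + k, j * s:j * s + k].max()
    return out

x = -np.ones((5, 5), dtype=np.float32)  # all activations negative
zero_padded = maxpool2d(x, pad_value=0.0)
inf_padded = maxpool2d(x, pad_value=-np.inf)
print(zero_padded[0, 0])  # 0.0 -- the pad value leaks into the max
print(inf_padded[0, 0])   # -1.0 -- padding never wins
```

Only the border windows are affected, which would explain why the mismatch rate below is small (1.19%) but the absolute error is large.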

```
Arrays are not almost equal to 3 decimals

Mismatched elements: 2302 / 193600 (1.19%)
Max absolute difference: 0.82550406
Max relative difference: 0.2062616
 x: array([[[[ 0.301,  0.54 ,  0.498, ...,  0.476,  0.463,  0.485],
         [ 0.822,  0.822,  0.512, ...,  0.476,  0.634,  0.634],
         [ 0.725,  0.524,  0.512, ...,  0.469,  0.495,  0.495],...
 y: array([[[[ 0.301,  0.54 ,  0.498, ...,  0.476,  0.463,  0.485],
         [ 0.822,  0.822,  0.512, ...,  0.476,  0.634,  0.634],
         [ 0.725,  0.524,  0.512, ...,  0.469,  0.495,  0.495],...

----------------------------------------------------------------------
Ran 1 test in 2.327s

FAILED (failures=1)
```

After testing this somewhat, I discovered that tf.pad with a constant value of -127.5 works instead:

```python
paddings = ((0, 0), (pads[0], pads[2]), (pads[1], pads[3]), (0, 0))
x = tf.pad(x, paddings, constant_values=-127.5)
```

Don't ask me why. The docs say it zero-pads!

I feel like it might be avoiding some internal optimisation. Perhaps in the ONNX conversion?
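For what it's worth, -127.5 only works as long as no activation drops below it. The principled pad constant for a max pool is -inf (or the dtype's lowest value), which matches ONNX's convention that padded elements do not participate in MaxPool. A small sketch of the difference (my own illustration, reusing a naive pool):

```python
import numpy as np

def pool_corner(x, pad_value):
    """Max over the top-left 3x3 window after 1-pixel padding.

    Any finite pad constant can still leak into the max if activations
    drop below it; -inf cannot.
    """
    p = np.full((x.shape[0] + 2, x.shape[1] + 2), pad_value, dtype=np.float32)
    p[1:-1, 1:-1] = x
    return p[0:3, 0:3].max()  # window includes padding cells

x = np.full((5, 5), -200.0, dtype=np.float32)  # extreme negative activations
print(pool_corner(x, -127.5))   # -127.5 -- the pad constant leaks
print(pool_corner(x, -np.inf))  # -200.0 -- correct
```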

Before change: Max error on my Keras model (image of George Bush): 1.9706213
After change: Max error on my Keras model (image of George Bush): 8.895993e-06

Basically, anyone using this MaxPool op with padding is likely to get incorrect Keras model output.