I isolated accuracy issues in a model to MaxPool being used with padding.
The example above is a minimal repro that fails a tolerance of 1e-3.
The problem gets worse with larger models that contain multiple such ops: my large models with over 100 layers fail accuracy checks completely due to this padding issue.
I suspect it might be defeating some internal optimisation, perhaps in the ONNX conversion?
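A minimal sketch of why padding can corrupt a max pool, assuming (as the error magnitudes suggest) the inputs are normalized into a range like [-127.5, 127.5]: if a converted model implements padded MaxPool as "pad with zeros, then pool", any pooling window that overlaps the border sees 0, which beats every genuinely negative activation. The values below are made up purely for illustration.

```python
import numpy as np

# Hypothetical activations at the image border, all negative
# (e.g. pixel values normalized by subtracting 127.5).
window = np.array([-3.2, -8.1])

# Zero padding (what a Pad(0) + MaxPool(valid) decomposition does):
zero_padded = np.concatenate([window, [0.0]])
print(zero_padded.max())   # 0.0 -- the pad value wins, wrong result

# Padding with the floor of the input range behaves like -inf padding:
min_padded = np.concatenate([window, [-127.5]])
print(min_padded.max())    # -3.2 -- the true maximum survives
```

This would explain both the ~2.0 max error (a zero replacing a negative maximum) and why it compounds across 100+ layers.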
Before change:
Max error on my Keras model (image of George Bush): 1.9706213
After change:
Max error on my Keras model (image of George Bush): 8.895993e-06
Basically, anyone using this MaxPool op with padding is likely to get incorrect Keras model output.
After some testing, I discovered that explicitly padding with tf.pad using a constant value of -127.5 works instead.
Don't ask me why; the docs say it zero-pads!
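The workaround pattern can be sketched without TensorFlow: pad explicitly with the floor of the input range, then run a plain valid-padding max pool. The `maxpool2d_valid` helper and the -127.5 constant are my assumptions here (the constant presumably being the minimum possible normalized pixel value), not anything from the library.

```python
import numpy as np

def maxpool2d_valid(x, k=2, s=2):
    """Naive valid 2x2 max pool over an HxW array (illustration only)."""
    h = (x.shape[0] - k) // s + 1
    w = (x.shape[1] - k) // s + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = x[i*s:i*s+k, j*s:j*s+k].max()
    return out

# Toy all-negative feature map, odd-sized so pooling needs padding.
x = np.array([[-1.0, -2.0, -3.0],
              [-4.0, -5.0, -6.0],
              [-7.0, -8.0, -9.0]])

# Zero padding (the buggy behaviour): border outputs become 0.
zp = np.pad(x, ((0, 1), (0, 1)), constant_values=0.0)
print(maxpool2d_valid(zp))   # [[-1.  0.] [ 0.  0.]]

# Padding with -127.5 (the tf.pad workaround): true maxima survive.
mp = np.pad(x, ((0, 1), (0, 1)), constant_values=-127.5)
print(maxpool2d_valid(mp))   # [[-1. -3.] [-7. -9.]]
```

In TensorFlow terms this corresponds to `tf.pad(x, paddings, constant_values=-127.5)` followed by a `padding='valid'` pool, instead of relying on the op's own padding.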