keras-team / keras

Deep Learning for humans
http://keras.io/

A large output inconsistency between keras (.h5), tensorflow (.pb), and onnx on the same model with same weights #16348

Closed maybeLee closed 2 years ago

maybeLee commented 2 years ago

System information.

Describe the problem. In short, I observe an output inconsistency between the Keras model's predict function, TensorFlow's frozen protobuf graph, and ONNX Runtime's prediction on the same model with the same weights.

The model that triggers this issue is very simple (only two layers):

_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 49, 50)]          0         

 lstm (LSTM)                 (None, 100)               60400     

 softmax (Softmax)           (None, 100)               0         

=================================================================
Total params: 60,400
Trainable params: 60,400
Non-trainable params: 0
_________________________________________________________________

The condition that triggers the inconsistency is setting the axis of the Softmax layer to -2 (for the LSTM's 2-D output of shape (None, 100), axis -2 is the batch dimension):

import numpy as np
from tensorflow import keras

# Two-layer model: an LSTM followed by a Softmax over axis=-2
layer_stack = [
    keras.layers.LSTM(100),
    keras.layers.Softmax(axis=-2),
]
x = keras.layers.Input((49, 50))
layer_input = x
for layer in layer_stack:
    y = layer(layer_input)
    layer_input = y
keras_model = keras.models.Model(x, y)

input = np.random.rand(100, 49, 50)  # the randomly generated input (see below)
keras_pred = keras_model.predict(input)
keras_model.save("generated_model.h5")
keras_model.summary()

After generating such a model and feeding it randomly generated input, I find a large inconsistency between keras_model.predict, TensorFlow's session.run (yes, I convert the model to a TensorFlow protobuf first), and ONNX Runtime's prediction.

The prediction differences are as follows:

Difference between TensorFlow (protobuf) and Keras: 300.0
Difference between TensorFlow (protobuf) and ONNXRuntime: 2.8684735e-07
Difference between Keras and ONNXRuntime: 300.0

Since the protobuf and ONNX outputs agree with each other but both differ from Keras, I suspect that this is an implementation bug inside Keras.

Below are the details of how I compare the results of the different implementations (I paste all the implementation code below, so this issue may be long):

For the input, I always use a tensor randomly generated by NumPy:

input = np.random.rand(100, 49, 50)

After generating this input, I send the same tensor to all the other implementations.
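A minimal sketch of such a comparison pipeline (not the exact script used here) might look like the following. It uses the TF1-style freezing APIs mentioned later in this thread, converts the frozen graph to ONNX with tf2onnx, and assumes np.sum over the absolute element-wise differences as the metric and a float32 cast for ONNX Runtime:

# NOTE: a sketch, not the original script; tensor names are read from the
# loaded model rather than hard-coded.
import numpy as np
import onnxruntime as ort
import tensorflow as tf
import tf2onnx
from tensorflow import keras
from tensorflow.python.framework.graph_util import convert_variables_to_constants

tf.compat.v1.disable_eager_execution()  # run the Keras model in graph mode

input = np.random.rand(100, 49, 50).astype(np.float32)  # cast for ONNX Runtime

# 1. Keras prediction
keras_model = keras.models.load_model("generated_model.h5")
keras_pred = keras_model.predict(input)
input_name = keras_model.input.name    # e.g. "input_1:0"
output_name = keras_model.output.name  # e.g. "softmax/Softmax:0"

# 2. Freeze the variables into constants and run the frozen graph
sess = tf.compat.v1.keras.backend.get_session()
frozen_graph_def = convert_variables_to_constants(
    sess, sess.graph.as_graph_def(), [keras_model.output.op.name])
graph = tf.Graph()
with graph.as_default():
    tf.compat.v1.import_graph_def(frozen_graph_def, name="")
with tf.compat.v1.Session(graph=graph) as pb_sess:
    tf_pred = pb_sess.run(output_name, {input_name: input})

# 3. Convert the frozen graph to ONNX and run it with ONNX Runtime
onnx_model, _ = tf2onnx.convert.from_graph_def(
    frozen_graph_def, input_names=[input_name], output_names=[output_name])
ort_sess = ort.InferenceSession(onnx_model.SerializeToString())
onnx_pred = ort_sess.run(None, {ort_sess.get_inputs()[0].name: input})[0]

print("TF (protobuf) vs Keras:       ", np.sum(np.abs(tf_pred - keras_pred)))
print("TF (protobuf) vs ONNXRuntime: ", np.sum(np.abs(tf_pred - onnx_pred)))
print("Keras vs ONNXRuntime:         ", np.sum(np.abs(keras_pred - onnx_pred)))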

Describe the current behavior. Keras's .h5 model produces output different from TensorFlow's protobuf and ONNX outputs.

Describe the expected behavior. These three outputs should be the same.

Contributing.

sushreebarsa commented 2 years ago

@gadagashwini I was able to replicate this issue on Colab using TF v2.8.0, please refer to this gist. Thanks!

k-w-w commented 2 years ago

I looked at the code and could not see any obvious errors in the conversions. The problem appears to trigger only when there is a Softmax layer with axis=-2 and the model is converted to a frozen graph.

A simple save and load yields results that match the previous Keras predictions:

model = keras.models.load_model("generated_model.h5")
h5_pred = model.predict(input)
np.sum(keras_pred - h5_pred)  # 0

So the problem probably lies in convert_variables_to_constants (from tensorflow.python.framework.graph_util import convert_variables_to_constants). I'd check with the TensorFlow or tf2onnx teams on this issue, since things work correctly on the Keras side of things.

A note: the TensorFlow APIs used in the colab are private or v1, i.e. no longer maintained (import_graph_def, convert_variables_to_constants). There is a convert_variables_to_constants_v2, which operates on a tf.function, that you could look into.
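For reference, a minimal sketch of that v2 path (untested against this model) could look like this, assuming TF 2.x eager mode and the model file from this issue:

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2)

model = keras.models.load_model("generated_model.h5")

# Wrap the model in a tf.function and trace a concrete function for its input
concrete_func = tf.function(lambda t: model(t)).get_concrete_function(
    tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype))

# Freeze: the captured variables become constants in the function's graph
frozen_func = convert_variables_to_constants_v2(concrete_func)

input = np.random.rand(100, 49, 50).astype(np.float32)
frozen_pred = frozen_func(tf.constant(input))[0].numpy()
keras_pred = model.predict(input)
print(np.sum(np.abs(frozen_pred - keras_pred)))  # compare frozen vs. Keras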
