Closed: maybeLee closed this issue 3 years ago
@maybeLee,
As per my understanding, this is expected because each backend uses different compilers and different weight initialization methods, and they might be implemented in different languages as well. However, getting reproducible results using one backend (for example, TensorFlow) is possible, as shown in this comment. Thanks!
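For reference, fixing the seeds for a single backend usually means seeding every random source the stack touches. A minimal sketch (the `set_global_seeds` helper name is mine, not from Keras; the TensorFlow calls are shown as comments since they only apply when TF is the backend):

```python
import os
import random

import numpy as np

def set_global_seeds(seed):
    """Fix the Python hash seed and the Python/NumPy RNGs."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    # With the TensorFlow backend you would also fix TF's own seed:
    # import tensorflow as tf
    # tf.random.set_seed(seed)   # tf.set_random_seed(seed) on TF 1.x

set_global_seeds(10)
a = np.random.rand(3)
set_global_seeds(10)
b = np.random.rand(3)
print(np.allclose(a, b))  # the two draws are identical
```

This only makes runs reproducible *within* one backend; it does not make different backends agree with each other.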
@rmothukuru,
Thanks for your reply, to reduce the influence of weight initialization and randomness, I built a one-layer (LSTM) model and save its weights as model.h5
for different backends to use.
import os

import numpy as np
import keras
from keras import layers

np.random.seed(10)
if not os.path.exists("model.h5"):
    print("creating new model")
    model = keras.models.Sequential()
    model.add(layers.LSTM(5, input_shape=(10, 1)))
    model.save("model.h5")  # persist the weights so every backend loads the same model
else:
    print("model exists, loading old model")
    model = keras.models.load_model("model.h5")
x = np.random.rand(1, 10, 1)
pred = model.predict(x)
print(pred)
The results show that a large inconsistency still exists:
tensorflow: [[-0.05560498 0.03696749 0.0953454 -0.04178171 0.00761575]]
theano: [[-0.08818588 0.07905697 0.1577689 -0.11601128 0.00531999]]
cntk: [[-0.18581256 0.15982811 0.3080599 -0.21439359 0.01037333]]
mxnet: [[0. 0. 0. 0. 0.]]
From my perspective, this inconsistency issue is quite important: when developers run pretrained models on different backends, they expect the outputs to be similar.
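One way to quantify how far apart the backends are is to compare each prediction against a reference and check the maximum absolute difference. A small sketch using the numbers from the runs above (TensorFlow chosen as the reference arbitrarily):

```python
import numpy as np

# Per-backend predictions copied from the runs above.
preds = {
    "tensorflow": np.array([-0.05560498, 0.03696749, 0.0953454, -0.04178171, 0.00761575]),
    "theano":     np.array([-0.08818588, 0.07905697, 0.1577689, -0.11601128, 0.00531999]),
    "cntk":       np.array([-0.18581256, 0.15982811, 0.3080599, -0.21439359, 0.01037333]),
    "mxnet":      np.array([0.0, 0.0, 0.0, 0.0, 0.0]),
}
ref = preds["tensorflow"]
for name, p in preds.items():
    # float32 rounding alone would give differences on the order of 1e-6;
    # anything much larger points at a genuinely different implementation
    max_abs_diff = np.max(np.abs(p - ref))
    print(f"{name}: max |diff| vs tensorflow = {max_abs_diff:.6f}")
```

The differences here are several orders of magnitude above float32 rounding noise, so numerical precision alone cannot explain them.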
Hi @maybeLee, support for non-TF backends was dropped multiple Keras versions ago, so we suggest paying attention only to the TF backend.
As for why they may have differed:
Hi @tomerk, thanks for your explanation. As I tested earlier, when I save the LSTM layer and its weights as a model, the inconsistency still exists when different backends run inference on that model. Besides, I am not sure that simply performing model inference should introduce any source of variance. Another possible explanation for the inconsistency is that the backends implement the LSTM layer differently. I understand this may be difficult to debug, but it is essential if there are bugs in an LSTM implementation.
Anyway, I will do more experiments to see if I can help you identify the root cause of this inconsistency.
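One experiment that can localize the divergence is to run the LSTM math by hand in NumPy using the saved weights and see which backend the result matches. A sketch under these assumptions: Keras stores an LSTM layer's weights as `kernel (input_dim, 4*units)`, `recurrent_kernel (units, 4*units)`, and `bias (4*units,)` with gates concatenated in the order i, f, c, o; this sketch uses a plain sigmoid, while older multi-backend Keras defaults to `hard_sigmoid` for the recurrent activation, so small deviations would be expected there. The random weights below merely stand in for `model.get_weights()`:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward(x, kernel, recurrent_kernel, bias):
    """Run a single LSTM layer over x of shape (timesteps, input_dim).

    Assumes the Keras weight layout: gates concatenated as i, f, c, o.
    Returns the final hidden state h (what an LSTM layer with
    return_sequences=False would output).
    """
    units = recurrent_kernel.shape[0]
    h = np.zeros(units)
    c = np.zeros(units)
    for x_t in x:
        z = x_t @ kernel + h @ recurrent_kernel + bias
        i = sigmoid(z[:units])                # input gate
        f = sigmoid(z[units:2 * units])       # forget gate
        g = np.tanh(z[2 * units:3 * units])   # candidate cell state
        o = sigmoid(z[3 * units:])            # output gate
        c = f * c + i * g
        h = o * np.tanh(c)
    return h

# Random weights standing in for kernel, recurrent_kernel, bias = model.get_weights()
rng = np.random.RandomState(10)
units, input_dim, timesteps = 5, 1, 10
kernel = rng.randn(input_dim, 4 * units)
recurrent_kernel = rng.randn(units, 4 * units)
bias = np.zeros(4 * units)
h = lstm_forward(rng.rand(timesteps, input_dim), kernel, recurrent_kernel, bias)
print(h.shape)  # (5,)
```

If the hand-computed output matches one backend closely and the others poorly, the mismatching backends are the ones deviating from the reference equations.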
@maybeLee, if you would like to debug the TensorFlow implementation specifically to see if anything misbehaves, feel free, and please do file a separate bug if you spot that it is misbehaving. (Please use the newest versions of both TF and Keras to try this instead of TF 2.0.)
As for the other backends: as I mentioned above, they are no longer under development. Even if you do find an issue with their existing implementations, there is no repository to submit the changes to, and no new releases will be made.
@rmothukuru please feel free to close this issue as it is not actionable for us at this time
@maybeLee, Closing the issue as per @tomerk's comments. Please file a new bug as per his suggestion. Thanks!
System information.
Describe the problem. Given a randomly generated input array, the keras.layers.LSTM API outputs inconsistent results when using different backends. For TensorFlow, CNTK, and Theano, the LSTM layer outputs non-zero but mutually inconsistent values even though the random seed is fixed. For the mxnet backend, the LSTM layer outputs all zeros.
Source code / logs. Please reproduce this issue by running the following script:
Command to reproduce (suppose the above script is saved as try.py):
Output:
The relevant backend versions are:
The problem exists with both Keras 2.2.4 and Keras 2.3.