keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

LSTM layer output inconsistent on different backends #14992

Closed. maybeLee closed this issue 3 years ago.

maybeLee commented 3 years ago


Describe the problem. Given a randomly generated input array, the keras.layers.LSTM API outputs inconsistent results when run on different backends. For TensorFlow, CNTK, and Theano, the LSTM layer outputs non-zero but mutually inconsistent values even though the random seed is fixed. For the mxnet backend, the LSTM layer outputs all zeros.

Source code / logs. Please reproduce this issue by running the following script:

import os
import argparse
import sys
import warnings
parse = argparse.ArgumentParser()
parse.add_argument("--bk", type=str, default="tensorflow", help="the name of the backend")
flags, _ = parse.parse_known_args(sys.argv[1:])
os.environ["KERAS_BACKEND"] = flags.bk  # must be set before importing keras
bk = flags.bk
gpu_list = ["0"]
if bk == 'tensorflow':
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = '2'
    import tensorflow as tf
    tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)  # ignore warnings
    batch_size = 32

if bk == 'theano':
    if len(gpu_list) == 2:
        os.environ['THEANO_FLAGS'] = f"device=cuda{gpu_list[0]},contexts=dev{gpu_list[0]}->cuda{gpu_list[0]};dev{gpu_list[1]}->cuda{gpu_list[1]}," \
                              f"force_device=True,floatX=float32,lib.cnmem=1"
    else:
        os.environ['THEANO_FLAGS'] = f"device=cuda{gpu_list[0]},contexts=dev{gpu_list[0]}->cuda{gpu_list[0]}," \
                                     f"force_device=True,floatX=float32,lib.cnmem=1"
    import theano as th

if bk == "cntk":
    from cntk.device import try_set_default_device, gpu
    try_set_default_device(gpu(int(gpu_list[0])))

import keras
from keras import initializers, layers
import numpy as np
warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=UserWarning)
model = keras.models.Sequential()
model.add(layers.LSTM(5))
np.random.seed(10)
x = np.random.rand(1, 10, 2)  # shape: (batch, timesteps, features)
pred = model.predict(x)
print(pred)

Commands to reproduce (assuming the above script is saved as try.py):

python try.py --bk tensorflow
python try.py --bk theano
python try.py --bk cntk
python try.py --bk mxnet

Output:

tensorflow: [[-0.05560498  0.03696749  0.0953454  -0.04178171  0.00761575]]
theano: [[-0.04984551  0.01884622 -0.0549684  -0.16306187 -0.06385145]]
cntk: [[-0.05105171  0.10062361 -0.04274062 -0.10008237  0.20301312]]
mxnet: [[0. 0. 0. 0. 0.]]

The relevant backend versions are:

tensorflow-gpu==2.0.0
theano==1.0.5
cntk-gpu==2.7
mxnet==1.8.0

The problem exists with both keras 2.2.4 and keras 2.3.

rmothukuru commented 3 years ago

@maybeLee, As per my understanding, this is expected: each backend uses a different compiler and different weight initialization methods, and the backends may be implemented in different languages as well.

However, getting reproducible results with a single backend (for example, TensorFlow) is possible, as shown in this comment. Thanks!
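
For reference, the usual seed-fixing recipe on a TensorFlow-backed run looks roughly like the sketch below; the linked comment may differ in detail, and on TF 1.x one would call tf.compat.v1.set_random_seed instead.

# Sketch of a typical seed-fixing recipe for the TensorFlow backend (TF 2.x).
# Fix every known source of randomness before building the model.
import os
os.environ["PYTHONHASHSEED"] = "0"

import random
import numpy as np
import tensorflow as tf

random.seed(10)
np.random.seed(10)
tf.random.set_seed(10)  # on TF 1.x: tf.compat.v1.set_random_seed(10)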

maybeLee commented 3 years ago

@rmothukuru, Thanks for your reply. To eliminate the influence of weight initialization and randomness, I built a one-layer LSTM model and saved its weights as model.h5 so that every backend loads identical weights.

import os
import numpy as np
import keras
from keras import layers

np.random.seed(10)
if not os.path.exists("model.h5"):
    print("creating new model")
    model = keras.models.Sequential()
    model.add(layers.LSTM(5))
else:
    print("model exists, loading old model")
    model = keras.models.load_model("model.h5")
x = np.random.rand(1, 10, 1)
pred = model.predict(x)
model.save("model.h5")  # persist the built model so later runs on other backends reuse these weights
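
A quick sanity check (a sketch, reusing model from the script above) that every backend really loads identical weights rather than silently re-initializing:

# Sketch: fingerprint each weight array loaded from model.h5.
# The printed shapes and sums should be identical across backends.
for i, w in enumerate(model.get_weights()):
    print(i, w.shape, float(np.abs(w).sum()))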

The results show that a large inconsistency still exists:

tensorflow: [[-0.05560498  0.03696749  0.0953454  -0.04178171  0.00761575]]
theano: [[-0.08818588  0.07905697  0.1577689  -0.11601128  0.00531999]]
cntk: [[-0.18581256  0.15982811  0.3080599  -0.21439359  0.01037333]]
mxnet: [[0. 0. 0. 0. 0.]]

From my perspective, this inconsistency issue is quite important: when developers run pretrained models on different backends, they expect the outputs to be similar. One way to quantify the divergence is sketched below.
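
To make the divergence concrete, each backend's prediction could additionally be dumped with np.save("pred_%s.npy" % bk, pred) at the end of the script (a hypothetical addition, not in the script above), and the files compared afterwards:

# Sketch: compare the per-backend predictions against the TensorFlow output.
import numpy as np

preds = {bk: np.load("pred_%s.npy" % bk)
         for bk in ["tensorflow", "theano", "cntk", "mxnet"]}
ref = preds["tensorflow"]
for bk, p in preds.items():
    print(bk,
          "max abs diff:", float(np.max(np.abs(p - ref))),
          "allclose:", np.allclose(p, ref, atol=1e-5))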

tomerk commented 3 years ago

Hi @maybeLee, support for non-TF backends was dropped multiple Keras versions ago, so we would suggest paying attention only to the TF backend.

As for why they may have differed:

maybeLee commented 3 years ago

Hi @tomerk, thanks for your explanation. As I tested earlier, the inconsistency still exists even when I save the LSTM layer and its weights as a model and then run inference on that same model under different backends. Besides, I am not sure that simply performing inference should introduce any source of variance. Another possible explanation is that the backends implement the LSTM layer differently. I understand this may be difficult to debug; however, it is essential to investigate if there are bugs in an LSTM implementation. One way to check is to recompute the LSTM output by hand from the saved weights, as sketched below.
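
Here is a sketch of such a reference check: it recomputes the LSTM forward pass in plain NumPy from the saved weights, assuming multi-backend Keras 2.x defaults (gate order [i, f, c, o], recurrent_activation=hard_sigmoid, activation=tanh, zero initial state). It reuses model and x from the script above; whichever backend disagrees with this reference is the suspect.

import numpy as np

def hard_sigmoid(z):
    # Keras 2.x hard_sigmoid: clip(0.2 * z + 0.5, 0, 1)
    return np.clip(0.2 * z + 0.5, 0.0, 1.0)

def lstm_reference(x_seq, kernel, recurrent_kernel, bias):
    # x_seq: (timesteps, features); weights as returned by model.get_weights()
    units = recurrent_kernel.shape[0]
    h = np.zeros(units)
    c = np.zeros(units)
    for x_t in x_seq:
        z = x_t @ kernel + h @ recurrent_kernel + bias
        i, f, g, o = np.split(z, 4)  # Keras gate order: i, f, c, o
        i, f, o = hard_sigmoid(i), hard_sigmoid(f), hard_sigmoid(o)
        c = f * c + i * np.tanh(g)
        h = o * np.tanh(c)
    return h  # final hidden state == layer output when return_sequences=False

kernel, recurrent_kernel, bias = model.get_weights()
print(lstm_reference(x[0], kernel, recurrent_kernel, bias))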

Anyway, I will do more experiments to see if I can help you identify the root cause of this inconsistency.

tomerk commented 3 years ago

@maybeLee, If you would like to debug the TensorFlow implementation specifically to see if anything misbehaves, feel free to do so, and please file a separate bug if you find that it is misbehaving. (Please use the newest versions of both TF and Keras for this, instead of TF 2.0.)

As for the other backends: as I mentioned above, they are no longer under development. Even if you do find an issue with their existing implementations, there is no repository to submit the changes to, and no new releases will be made.

@rmothukuru, please feel free to close this issue, as it is not actionable for us at this time.

rmothukuru commented 3 years ago

@maybeLee, Closing the issue as per @tomerk's comments. Please file a new bug as per his suggestion. Thanks!