Hi! I found that the result of `model.predict()` does not match the output of the model's last layer. When I call `model.predict()` directly, the output is NaN. But when I fetch the outputs of the intermediate layers, they are all normal, including the last layer. Since `model.predict()` and the output of the last layer should be equivalent, I am wondering what causes this difference.
To Reproduce
The code to reproduce is shown below:
```python
import os
import numpy as np
from PIL import Image

bk = 'mxnet'
os.environ['KERAS_BACKEND'] = bk
import keras
from keras import backend as K

print("Using backend: {}".format(K.backend()))

def custom_objects():
    def no_activation(x):
        return x

    def leakyrelu(x):
        import keras.backend as K
        return K.relu(x, alpha=0.01)

    objects = {}
    objects['no_activation'] = no_activation
    objects['leakyrelu'] = leakyrelu
    return objects

def image_resize(x, shape):
    x_return = []
    for x_test in x:
        tmp = np.copy(x_test)
        img = Image.fromarray(tmp.astype('uint8')).convert('RGB')
        img = img.resize(shape, Image.ANTIALIAS)
        x_return.append(np.array(img))
    return np.array(x_return)

base_model = keras.models.load_model("mymodel.h5", custom_objects=custom_objects())
base_model.summary()
adam = keras.optimizers.Adagrad(lr=0.01, epsilon=None, decay=0.0)
base_model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])

imagename = '183_bucket.png'
img = Image.open(imagename)
img = img.resize((224, 224), Image.ANTIALIAS)
x_test = np.array(img)
x_test = x_test / 255.0
select_data = np.expand_dims(x_test, axis=0)

prediction = base_model.predict(select_data)
print(prediction)

# obtain the outputs of the intermediate layers (skip the InputLayer)
get_layers_output = K.function([base_model.layers[0].input, K.learning_phase()],
                               [l.output for l in base_model.layers[1:]])
layers_output = get_layers_output([select_data, 0])

layers = base_model.layers[1:]
for idx, layer in enumerate(layers):
    print(f"# layer {idx} {layer.name}:")
    print(f"min: {np.min(layers_output[idx])}, max: {np.max(layers_output[idx])}")
```
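Alongside the per-layer min/max dump above, it can help to flag the first layer whose output contains a non-finite value. A minimal NumPy sketch (the helper name and the toy arrays are mine, not from the original script):

```python
import numpy as np

def first_nonfinite_layer(layers_output):
    """Return the index of the first layer output containing NaN/Inf, or None."""
    for idx, out in enumerate(layers_output):
        if not np.all(np.isfinite(out)):
            return idx
    return None

# Toy stand-ins for per-layer outputs: the third one contains a NaN.
outputs = [np.ones((1, 4)), np.zeros((1, 4)),
           np.array([[0.5, np.nan, 1.0, 2.0]])]
print(first_nonfinite_layer(outputs))  # -> 2
```

Running this over the real `layers_output` list would show whether the NaN really first appears only after the last listed layer.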
Output of `model.predict()`:

```
[[nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
```
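For context on why every entry is NaN rather than just one: a single NaN produced anywhere upstream poisons the whole output vector, because each element of a dense layer's output sums over the entire input row. A tiny NumPy illustration:

```python
import numpy as np

# One NaN in the input row poisons every entry of the matrix product,
# since each output element is a sum over the whole row.
w = np.ones((3, 3))
x = np.array([1.0, np.nan, 2.0])
print(x @ w)  # all three entries are NaN
```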
Output of the last three layers in the model:
I don't know why the output of `model.predict()` is NaN while the output of the last layer is not.
The model and the picture used in the code are uploaded here: data.zip
Thanks in advance!
What have you tried to solve it?
I upgraded to the latest version of MXNet (1.6.0), but the problem still exists.
Environment
You can use the following command to configure the environment:
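For the environment report, a short script like the following prints the relevant versions (the keras/mxnet lines are left as comments, since they require the MXNet backend to be installed):

```python
import sys
import numpy as np

print("python:", sys.version.split()[0])
print("numpy:", np.__version__)
# With the backend installed, the same pattern applies:
# import keras, mxnet
# print("keras:", keras.__version__, "mxnet:", mxnet.__version__)
```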