Hi! I found that the result of `model.predict()` does not match the output of the model's last layer. When I call `model.predict()` directly, the output is NaN. But when I fetch the outputs of the intermediate layers, they are all normal, including the last layer. Since `model.predict()` and the output of the last layer should be equivalent, I am wondering what causes this difference.
To Reproduce
The code to reproduce is shown below:
```python
import os
import numpy as np
from PIL import Image

bk = 'mxnet'
os.environ['KERAS_BACKEND'] = bk
import keras
from keras import backend as K

print("Using backend: {}".format(K.backend()))

def custom_objects():
    def no_activation(x):
        return x

    def leakyrelu(x):
        import keras.backend as K
        return K.relu(x, alpha=0.01)

    objects = {}
    objects['no_activation'] = no_activation
    objects['leakyrelu'] = leakyrelu
    return objects

def image_resize(x, shape):
    x_return = []
    for x_test in x:
        tmp = np.copy(x_test)
        img = Image.fromarray(tmp.astype('uint8')).convert('RGB')
        img = img.resize(shape, Image.ANTIALIAS)
        x_return.append(np.array(img))
    return np.array(x_return)

base_model = keras.models.load_model("mymodel.h5", custom_objects=custom_objects())
base_model.summary()
adam = keras.optimizers.Adagrad(lr=0.01, epsilon=None, decay=0.0)
base_model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])

imagename = '183_bucket.png'
img = Image.open(imagename)
img = img.resize((224, 224), Image.ANTIALIAS)
x_test = np.array(img)
x_test = x_test / 255.0
select_data = np.expand_dims(x_test, axis=0)

prediction = base_model.predict(select_data)
print(prediction)

# obtain the outputs of the intermediate layers (skip the InputLayer)
get_layers_output = K.function([base_model.layers[0].input, K.learning_phase()],
                               [l.output for l in base_model.layers[1:]])
layers_output = get_layers_output([select_data, 0])

layers = base_model.layers[1:]
for idx, layer in enumerate(layers):
    print(f"# layer {idx} {layer.name}:")
    print(f"min: {np.min(layers_output[idx])}, max: {np.max(layers_output[idx])}")
```
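Alongside the per-layer min/max dump above, it can help to flag the first layer whose output contains a non-finite value. A minimal NumPy sketch (the helper name and the toy arrays are mine, not from the original script):

```python
import numpy as np

def first_nonfinite_layer(layers_output):
    """Return the index of the first layer output containing NaN/Inf, or None."""
    for idx, out in enumerate(layers_output):
        if not np.all(np.isfinite(out)):
            return idx
    return None

# Toy stand-ins for per-layer outputs: the third one contains a NaN.
outputs = [np.ones((1, 4)), np.zeros((1, 4)),
           np.array([[0.5, np.nan, 1.0, 2.0]])]
print(first_nonfinite_layer(outputs))  # -> 2
```

Running this over the real `layers_output` list would show whether the NaN really first appears only after the last listed layer.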
Output of `model.predict()`:

```
[[nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
 nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan nan
```
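For context on why every entry is NaN rather than just one: a single NaN produced anywhere upstream poisons the whole output vector, because each element of a dense layer's output sums over the entire input row. A tiny NumPy illustration:

```python
import numpy as np

# One NaN in the input row poisons every entry of the matrix product,
# since each output element is a sum over the whole row.
w = np.ones((3, 3))
x = np.array([1.0, np.nan, 2.0])
print(x @ w)  # all three entries are NaN
```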
Output of the last three layers in the model:
I don't know why the output of `model.predict()` is NaN while the output of the last layer is not.
The model and the picture used in the code are uploaded here: data.zip
Thanks in advance!
What have you tried to solve it?
I upgraded to the latest version of MXNet (1.6.0), but the problem still exists.
Environment
You can use the following command to configure the environment:
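For the environment report, a short script like the following prints the relevant versions (the keras/mxnet lines are left as comments, since they require the MXNet backend to be installed):

```python
import sys
import numpy as np

print("python:", sys.version.split()[0])
print("numpy:", np.__version__)
# With the backend installed, the same pattern applies:
# import keras, mxnet
# print("keras:", keras.__version__, "mxnet:", mxnet.__version__)
```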