@rrki sorry, I forgot about this. The difference is the optimizer state and gradients: model.save saves both, while model.load_weights ignores the optimizer state in the file, so it is not restored.
If you have PyTables installed, you can use the CLI tools ptdump and pttree to inspect the contents of the files.
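For a quick look from Python instead, a minimal sketch with h5py (assuming h5py is installed; the file names are placeholders):
import h5py

# A file written by model.save() holds both the weights and the optimizer state:
with h5py.File('full_model.h5', 'r') as f:
    print(list(f.keys()))   # e.g. ['model_weights', 'optimizer_weights']

# A file written by model.save_weights() only holds per-layer weight groups,
# so there is no optimizer state for load_weights() to restore:
with h5py.File('weights_only.h5', 'r') as f:
    print(list(f.keys()))   # e.g. ['conv2d_1', 'dense_1']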
In case anyone is still running into this problem: I was dealing with this for a while because I did not realize that Python 3.3 and up uses non-deterministic hashing between runs (https://stackoverflow.com/questions/27954892/deterministic-hashing-in-python-3). I was doing my own preprocessing through nltk and then using the native Python hash method to convert words to integers before passing them to my Embedding layer, and that turned out to be the source of the non-determinism.
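For anyone hitting the same thing, a minimal sketch of a run-independent replacement for the built-in hash() (the bucket count is an arbitrary illustrative choice):
import hashlib

def stable_word_hash(word, num_buckets=20000):
    # md5 of the UTF-8 bytes gives the same integer in every Python session,
    # unlike the built-in hash(), which is salted per process since Python 3.3.
    digest = hashlib.md5(word.encode('utf-8')).hexdigest()
    return int(digest, 16) % num_buckets

print(stable_word_hash('example'))  # identical across runs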
I'm still facing this issue. Loading the weights makes no difference compared to not loading them - the weights are the same either way. I just don't know what to do, since I already tried load_model and load_weights. I'm using tf and keras, by the way...
I'm facing exactly the same issue when saving and loading either the model itself or just its weights; both give completely different results after loading.
Python: v3.5.3 Tensorflow: v1.3.0 Keras: v2.0.8
Hi all,
I've just finished fighting the battle with this problem, and more generally with not getting consistent results from evaluate_generator (if I execute it multiple times in a row, the results vary). In my case the problem was the following: batch_size was not a divisor of number_of_samples! It took me ages to figure this one out - I was computing the steps as steps = math.ceil(val_samples/batch_size). Because batch_size did not divide number_of_samples, I assume the generator took different samples to fill in the last step. Some small errors also came from using the workers argument - on a GPU it makes no sense to use it. Once I used an actual divisor of val_samples it worked like a charm and was reproducible - before and after loading!
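In other words, pick the numbers so that steps * batch_size == number_of_samples; a minimal sketch (the counts are illustrative):
val_samples = 5000
batch_size = 50                     # must divide val_samples evenly
assert val_samples % batch_size == 0, 'choose a batch_size that divides the sample count'
steps = val_samples // batch_size   # every sample is seen exactly once, no refilled last batch
# results = model.evaluate_generator(val_generator, steps=steps)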
Unfortunately nothing that I tried helped. Still face the same issue. Even on a really simple example like this:
from keras.models import Sequential
from keras.layers import Conv2D, Activation, Flatten, Dense
from keras.optimizers import RMSprop

model = Sequential()
model.add(Conv2D(16, (4, 4), padding='same', input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dense(2))
model.add(Activation('softmax'))
optimizer = RMSprop()
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
model.fit(x_train,y_train,epochs=10)
print(model.evaluate(x_val, y_val))
model.save('test.h5', overwrite=True)
[1.0409752264022827, 0.67000000047683717]
Now loading the model back again
from keras.models import load_model
model = load_model('test.h5')
print(model.evaluate(x_val, y_val))
[0.72732420063018799, 0.26000000000000001]
That's just for demonstration.
Just to be sure, I also tried with my training dataset [i.e. evaluating the model before and after saving on the same dataset that I used for training]:
{'acc': 0.73999999999999999, 'loss': 0.57565217232704158}
{'acc': 0.88403865378207269, 'loss': 0.59617107459932062}
@sanosay Thank you for providing a full example, but I cannot reproduce. Can you provide some data?
from keras.models import load_model, Sequential
from keras.layers import Conv2D, Activation, Flatten, Dense
from keras.optimizers import RMSprop
import numpy as np
x_train = np.random.randn(100, 10, 10, 2)
y_train = np.zeros((100, 2))
y_train[np.arange(len(x_train)), np.argmax(np.median(x_train, axis=(1, 2)), axis=1)] = 1.
x_val = np.random.randn(30, 10, 10, 2)
y_val = np.zeros((30, 2))
y_val[np.arange(len(x_val)), np.argmax(np.median(x_val, axis=(1, 2)), axis=1)] = 1.
model = Sequential()
model.add(Conv2D(16, (4, 4), padding='same', input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dense(2))
model.add(Activation('softmax'))
optimizer = RMSprop()
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
model.fit(x_train,y_train,epochs=10)
print(model.evaluate(x_val, y_val))
model.save('test.h5', overwrite=True)
model = load_model('test.h5')
print(model.evaluate(x_val, y_val))
Gives:
[1.4226549863815308, 0.63333332538604736]
[1.4226549863815308, 0.63333332538604736]
Python: v3.5.4
Tensorflow: v1.3.0
Keras: v2.0.8
I'll test on the GPU.
@Dapid Unfortunately, due to the nature of the dataset [medical], I have no license to upload it anywhere. Dataset size: 6002 training images and 500 validation images, each 64x64x3.
I also noticed something really strange [just now]: training a different model [same dataset] for different numbers of epochs, I get:
Before
{'metrics':{'acc': 0.96867710763078974, 'loss': 0.10006937423370672}}
After
{'metrics': {'acc': 0.11596134621792736, 'loss': 0.73292944400320847}}
Before:
{'metrics': {'acc': 0.98367210929690108, 'loss': 0.045077768838411421}}
After:
{'metrics': {'acc': 0.11596134621792736, 'loss': 1.1414862417133995}}
The accuracy after reloading remains exactly the same across runs, while the loss changes.
I tried the same model [and different models] on different machines and have the same issue. I also tried with both tensorflow and tensorflow-gpu, just in case.
What do you get with my synthetic data? Is it consistent? I can play with the sizes and number of images to see if I can get it to misbehave.
@Dapid That is indeed strange. I tried with your generated dataset [seed set to 0] and I can't reproduce it. I also tried adjusting it to 1001 instead of 100 and 302 instead of 30 [to see if it's somehow affected by batch size etc.]. No issue with the results. I then tried with a different dataset (cifar10) and I get inconsistent results.
Ok, CIFAR is good, I can see if it works funny for me.
Which model are you using on CIFAR?
from keras.models import load_model, Sequential
from keras.layers import Conv2D, Activation, Flatten, Dense
from keras.optimizers import RMSprop
import numpy as np
from keras.datasets import cifar10
(x_train, y_train), (x_val, y_val) = cifar10.load_data()
model = Sequential()
model.add(Conv2D(16, (4, 4), padding='same', input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dense(10))
model.add(Activation('softmax'))
optimizer = RMSprop()
model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
model.fit(x_train,y_train,epochs=10)
print(model.evaluate(x_val, y_val))
model.save('test.h5', overwrite=True)
model = load_model('test.h5')
print(model.evaluate(x_val, y_val))
[14.506285684204101, 0.10000000000000001]
[14.506285684204101, 0.10000000000000001]
I experience the same phenomenon. After saving and loading the weights, the loss value increases significantly.
Moreover, I don't get consistent behaviour when loading the weights. For the same model weight file, I get dramatically different results every time I load it in a different Keras session. Can it be something linked to numerical precision?
I'm using Keras 2.0.8. Unfortunately I don't know how to provide a minimal working example.
I'm also having these performance issues when saving and loading a model.
I'm facing the same issue. Any idea how to fix this?
I'm facing the same problem. I have an LSTM layer in the model. I verified that the weights are loading properly by comparing them before and after with bcompare, but the output results are different after loading the weights. Before introducing the LSTM layer my results were reproducible.
I'm having the same issue with the value of the loss function after reloading a model, as described by @darteaga. My model was saved via ModelCheckpoint with both save_best_only and save_weights_only set to False. It was then loaded with the keras.models.load_model function, which then gave a significantly higher loss value during the first training epochs.
I'm using python3 and keras 2.0.8. Any suggestion on how to fix this would be highly appreciated.
I have the same issue. See attachment. The model was reloaded at epoch 51 as well as a few times around epochs 3-5. I do not have an LSTM layer.
My network is based on Xception for transfer learning. According to the following thread, the issue might be due to tensorflow with python3: https://github.com/tensorflow/tensorflow/issues/6683
@pickou: "I train and store the model in python2, and restore it using python3, it got terrible result. But when I restore the model using python2, the result is good. Train and store in python3, I got awful result."
I tried python 2.7.12 with tensorflow 1.2, 1.3 and 1.4 (master), Keras 2.1.1, Ubuntu 16.04 LTS and I still have the same issue with unexpected high loss value after reloading the model.
Chiming in. I'm running into the same problem, but from what I've seen it's GPU-only. I trained a model on both CPU and GPU, saved them off, then tested them in a cross comparison:
CPU v CPU: saved model and loaded model agree
CPU v GPU: saved model and loaded model disagree
GPU v CPU: saved model and loaded model disagree
GPU v GPU: saved model and loaded model disagree
I'm going to attempt the suggestions and see if they resolve the issues.
I should be more specific. I train and save the model in one session, then open another session, load the saved model and test on the same data. I can't share the data unfortunately, but the model is a relatively simple sequential FFNN. No recurrence or memory neurons.
I did some more testing to see what was happening. I think this is an issue with jupyter notebook and keras/tensorflow interacting in an unexpected manner.
To test, I created and trained identical models in both an external file and in a jupyter notebook. I saved off their predictions, and the models themselves using the self.save() method inherent to Keras models. Then, in another file/notebook, I loaded the test data used in the training file/notebook, loaded the model from its respective training partner, loaded the saved predictions, and then used the loaded model to create an array of predictions based on the test data.
I took the difference of the respective predictions, and found that for the files created in an IDE and run via command line, the results are identical. The difference is zero (to machine tolerance) between the predictions created in the training file and the predictions created by loading the saved model.
For the Jupyter Notebook version though, this isn't true. There is significant difference between the training file predictions and the loaded model predictions.
An interesting note, though: when you load, in a jupyter notebook, the model trained and saved via the command line along with the predictions created via the command line, and take the difference there, you find it is zero.
Using the model.model.save() method also gives incorrect results in a jupyter notebook. Using model.save_weights() produces exactly the same values as model.save() in a jupyter notebook.
Using the manual_variable_initialization() method suggested by @kswersky may be a workaround, but it seems like a bit of a clever hack for something that should work out of the box. I haven't gotten it to work using just Keras layers, though.
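For reference, the suggestion is roughly along these lines (an untested sketch for the TF 1.x backend; build_model() and the weights file are placeholders):
import tensorflow as tf
import keras.backend as K

K.manual_variable_initialization(True)   # stop Keras from auto-initializing variables

model = build_model()                    # placeholder for your own model-building code
K.get_session().run(tf.global_variables_initializer())  # initialize everything once, explicitly
model.load_weights('weights.h5')         # then restore the saved values on top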
@vickorian,
Thanks for your efforts.
If I understood correctly, you connect the error with mixing saving/loading between different programs (command line and Jupyter, for example).
If that is the case, then my example is a counterexample.
My setup was that I trained several models (~100), recorded some statistics and then saved them. At a later point I wanted to record some more statistics on them, so I loaded them up. The strange thing is that not every re-loaded model had its scores mixed up, just some of them. In any case, this whole process was done through an Anaconda Prompt on Windows.
No, the problem is just with the notebooks. If you save in a notebook, you'll get bad results.
Hello guys,
I was facing the same issue until I set my ModelCheckpoint to save only the weights (save_weights_only=True), e.g.:
checkpoint = ModelCheckpoint(file_path, monitor='val_acc', verbose=1, save_best_only=True, mode='max', save_weights_only=True)
After this, I tested my best model using a python script through the terminal, and I got a good prediction.
I haven't run the model checkpointing at all. I have just been training models by creating files and running python from the command line instead of using notebooks.
Having the same issue. I trained a Keras LSTM model and saved the weights. Then I start a new standalone process, reconstruct the model and load the weights to check the evaluation/prediction results, and it gets a different result every time it runs unless I fix the numpy random seed.
@chunsheng-chen will you try writing a stand-alone file and running it from the command line to see if you get different results? Also, when do you fix the random seed? Is it immediately after importing numpy?
@vickorian,
My methodology was exactly that: I fixed the numpy random seed after the imports, as the first statement of the main() function. E.g.:
import numpy as np

def main():
    np.random.seed(0)
    # Rest of code

if __name__ == '__main__':
    main()
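For completeness, fixing numpy's seed alone may not pin everything down; a hedged sketch of seeding Python, numpy and TensorFlow 1.x together (GPU kernels can still be non-deterministic):
import random
import numpy as np
import tensorflow as tf

def seed_everything(seed=0):
    random.seed(seed)          # Python's own RNG
    np.random.seed(seed)       # numpy RNG
    tf.set_random_seed(seed)   # graph-level seed for the TF 1.x backend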
@fmv1992
Interesting. Your models were still getting errors even when run from the command line. My test setup is Ubuntu 16.04. I have a Samba server set up, so I can do further testing of my process on Windows without much trouble.
The errors disappear when you set the random seed?
@vickorian,
They did not go away. I have set both the numpy and Python random seeds.
Unfortunately I'm developing at work, so I feel uncomfortable posting the entire code here. But the idea is that I have several functions of this type:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, GaussianDropout
from keras.optimizers import SGD

def myfunc(shape):
    n_lines, n_columns = shape
    model = Sequential()
    model.add(Dense(
        np.random.randint(1000, 2000),
        input_dim=n_columns,
        activation='sigmoid',
        kernel_initializer='glorot_uniform',
        bias_initializer='Zeros',
        use_bias=True))
    model.add(GaussianDropout(0.2))
    model.add(Dense(1, activation='sigmoid'))
    optimizer = SGD(lr=0.1, momentum=0.6, decay=1e-4, nesterov=True)
    model.compile(loss='binary_crossentropy', optimizer=optimizer,
                  metrics=[auc2])  # auc2 is a custom ROC AUC metric defined elsewhere
    return model
which get trained in a for loop then saved with:
model.save(model_full_name)
After saving, all of them get loaded and then evaluated - so there is no direct evaluation after training; every model goes through the serialization/de-serialization stage.
The puzzling results are in the attached image: many of the models have a ROC AUC of ~0.5, which is out of line with the overall trend of the plot. This image is what brought me here.
I have tried several workarounds cited here and elsewhere, but none of them worked.
Also, as you can see, the loading error does not happen in 100% of cases (i.e. some models have reasonable ROC AUCs).
@fmv1992
It would be interesting to see the ROC AUC scores when tested before serialization. Is that possible?
I'm hitting the same problem. Keras: 2.0.4, Jupyter notebook, Tensorflow-gpu: 1.3.0
Model:
from keras.models import Sequential
from keras.layers import Embedding, LSTM

def create_UniLSTMwithAttention(X_vocab_len, X_max_len, y_vocab_len, y_max_len,
                                hidden_size, num_layers, return_probabilities=False):
    # g_word_embedding_matrix and AttentionDecoder (a custom layer) are defined elsewhere
    model = Sequential()
    model.add(Embedding(X_vocab_len, 300, input_length=X_max_len,
                        weights=[g_word_embedding_matrix],
                        trainable=False, mask_zero=True))
    for _ in range(num_layers):
        model.add(LSTM(hidden_size, return_sequences=True))
    model.add(AttentionDecoder(hidden_size, y_vocab_len, return_probabilities=return_probabilities))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
Actions like:
@vickorian,
I'm trying to revive the code. Will post the results here as soon as I have them.
However I think that the problem is pretty much "established" and I fail to see how those results would help.
And unfortunately this issue is a real show stopper. If you are using Keras for a homework assignment and training a 5-minute model, this is not an issue. However, in an enterprise setting this is a huge deal, as the models necessarily have to be serialized to be transferred to a "production environment" or similar. The entire project's reputation is jeopardized by this behavior.
And please, please don't get me wrong. Keras seems to be a great project. Indeed it is very unfortunate that this is happening. I want to be as helpful as possible.
One thing really bothers me though: why not just use the simple pickle module?
Best,
It seems that Keras will not overwrite an existing model file. I tried using early stopping and saving the model only after stopping. Make sure your model directory is empty before training. I had a similar issue and this solved my problem. Hope it helps.
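A minimal sketch of that cleanup (directory and pattern are placeholders):
import glob
import os

# Remove stale checkpoints so an old file can't be mistaken for (or block) the new one.
for stale in glob.glob('models/*.h5'):
    os.remove(stale)

# ... train, then save as usual:
# model.save('models/best.h5', overwrite=True)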
@ryanzh13 Yes, I call save_weights after training stops and I make sure there is no other file in the directory. The training takes 8 hours and I can only pray the instance won't be corrupted.
@darcula1993 Have you tried save instead of save_weights? Another thing is to try the code on a smaller dataset and less epochs. Also, you can print out the model.summary() to see if the model parameters are the same before saving and after loading.
@ryanzh13 Since there is a custom layer, I didn't. I will give it a try when I finish the training and submit my assignment. I'm already past the deadline.
I'm also having a problem saving my model or my weights. They "save", but when loaded back in the values are clearly garbage. For saving weights, I tried reloading them into a replica model architecture and then doing a batch of predictions, which results in junk values. I tried saving a full model with model.save(), and reloading it and running a prediction also gives junk values. The only time my prediction results look valid is within the same run (session) and when not saving the whole model. If I try to save the whole model, then even the same-run predictions are junk. So something is happening when saving and/or loading the model. My model consists of regular Conv2D, max pooling, and dense layers. I'm guessing something is happening either to the weights or to the training config data. I'm still new to all this, so I'm not sure, but a model consists of three things, right: weights, training config, and structure? I should also specify that I'm using model.predict_generator() for my predictions.
Ubuntu 16.04 LTS Tensorflow v1.4.1 Keras v2.1.2 Python 2.7 (anaconda)
@rsmith49 Thanks for your solution. I faced the same problem when training a text classifier using an LSTM. I was stuck for a long time and found that the problem was the word dict: I didn't enforce that the dict assigns the same id to a given word across different sessions. The solution is either to dump the dict using pickle or to sort the words before assigning ids to them.
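A minimal sketch of the first option (the dict and file names are placeholders):
import pickle

# training session: persist the word -> id mapping built during preprocessing
with open('word_to_id.pkl', 'wb') as f:
    pickle.dump(word_to_id, f)

# prediction session: reuse exactly the same mapping instead of rebuilding it
with open('word_to_id.pkl', 'rb') as f:
    word_to_id = pickle.load(f)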
I also had the problem of constant output after loading a model on Keras 2.0.6.
I upgraded to Keras 2.1.2 and used the preprocess_input function, and now it works. Below is the output of the model with and without using the preprocess_input function. Using preprocess_input on Keras 2.0.6 didn't work for me; I had to upgrade to Keras 2.1.2.
Output:
Without Preprocess:
[[ 0. 1. 0.]]
[[ 0. 1. 0.]]
With Preprocess:
[[ 5.07961657e-24 1.00000000e+00 3.09985791e-28]]
[[ 0.00508418 0.00213011 0.99278569]]
from keras.preprocessing.image import load_img, img_to_array
from keras.applications.imagenet_utils import preprocess_input
import matplotlib.pyplot as plt
from keras.models import Model, load_model
import numpy as np
def readImg(filename):
    img = load_img(filename, target_size=(299, 299))
    imgArray = img_to_array(img)
    imgArrayReshaped = np.expand_dims(imgArray, axis=0)
    imgProcessed = preprocess_input(imgArrayReshaped, mode='tf')
    return img, imgProcessed

def readImgWithout(filename):
    img = load_img(filename, target_size=(299, 299))
    imgArray = img_to_array(img)
    imgProcessed = np.expand_dims(imgArray, axis=0)
    return img, imgProcessed
sidesModel = load_model('C:/Models/Xception.hdf5', compile=False)
img1, arr1 = readImgWithout('c:/Test/image1.jpeg')
img2, arr2 = readImgWithout('c:/Test/image2.jpeg')
prob1 = sidesModel.predict(arr1)
prob2 = sidesModel.predict(arr2)
print('Without Preprocess:')
print(prob1)
print(prob2)
img1, arr1 = readImg('c:/Test/image1.jpeg')
img2, arr2 = readImg('c:/Test/image2.jpeg')
prob1 = sidesModel.predict(arr1)
prob2 = sidesModel.predict(arr2)
print('With Preprocess:')
print(prob1)
print(prob2)
I was able to resolve this issue by adapting my preprocessing pipeline.
I tried saving the model, clearing the session, then loading the model, and then calling the prediction function on the training and validation sets from when I trained the model. This gave me the same accuracy.
If I imported exactly the same data, but preprocessed it again (in my case using a Tokenizer for a text classification problem) the accuracy dropped drastically. After some research, I assume this was because the Tokenizer assigns different ids to different tokens unless they are trained on exactly the same dataset. I was able to achieve my training accuracy (~.95) on newly imported data in a new session, provided I used the same Tokenizer to preprocess the text.
This may not be the underlying problem for all above cases, but I suggest checking your preprocessing pipeline carefully and observing if the issue remains.
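For anyone who wants to make the "same Tokenizer" part concrete, a hedged sketch of pickling the fitted Tokenizer (num_words, the file name and the text variables are illustrative):
import pickle
from keras.preprocessing.text import Tokenizer

tokenizer = Tokenizer(num_words=20000)
tokenizer.fit_on_texts(train_texts)        # train_texts: the training corpus

with open('tokenizer.pkl', 'wb') as f:
    pickle.dump(tokenizer, f)

# later, in a new session:
with open('tokenizer.pkl', 'rb') as f:
    tokenizer = pickle.load(f)
sequences = tokenizer.texts_to_sequences(new_texts)   # same word -> index mapping as in training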
I'm facing the same issue. If I recreate the model (ResNext SE from https://github.com/titu1994/keras-squeeze-excite-network) from scratch and use load_weights and then use model.predict everything works as expected. If I use load_model first and then use load_weights on top of that (I have different sets of weights) the model predicts garbage. I checked that in both cases the weights are the same (through model.get_weights). I use Keras 2.1.3 and Tensorflow 1.4.0
This works:
K.clear_session()
model=SEResNext(**model_params)
model.compile(Adam(1e-4), 'binary_crossentropy', metrics=[tf.losses.log_loss])
model.load_weights('1694.hdf5')
pred=model.predict(train_set)
print(log_loss(y_true=train_y,y_pred=pred))
This doesn't:
model=keras.models.load_model(model_name,custom_objects={'log_loss': tf.losses.log_loss})
model.load_weights('1694.hdf5')
pred=model.predict(train_set)
print(log_loss(y_true=train_y,y_pred=pred))
After 2 hours of struggling, I found that this inconsistency can be related to K.batch_set_value() not working when multiple Python kernels (with tf imported) are running on the same machine; it is resolved if all but one are closed.
@ludwigthebull I am pickling the tokenizer and loading it to tokenize the text I want to predict, but I'm still getting random predictions. With your pipeline, are you running the model on the same session? If not, can you let us know how are you saving and loading the model?
@dterg I am not running the model in the same session. I am loading and saving the model as an .h5 file using the standard model.save('model_name') and load_model('model_name') Keras functions. I should add that I am not pickling the Tokenizer but instead rebuilding the tokenizer each time I load data by having a large text file as the common reference for the tokenizer. This is inefficient, but I haven't gotten around to writing a function that pickles the Tokenizer for me (I don't think Keras has this option.) In your case, I would try to see if you can get the same predictions by loading the model in a new session, but instead of pickling the tokenizer, just recreating it in the new session by using a common text file for both the training and the prediction phase. It may be that your issue has to do with the way you are pickling the tokenizer. Hope that helps !
I'm struggling again with the same problem on a different model. Both models consist of a series of linear convolutional filters (the same filter reused many times) followed by non-linear convolutional filters. While training, after each epoch I save the weights using the checkpoint callback.
Early in the training process I can load the weights saved to disk and get reliable and consistent results, but beyond a certain amount of training the saved filter coefficients lead to almost random values of the loss function (as if there had been no training at all). Moreover, the loss value is different every time I load the weights in a different Keras session.
I don't know what to do. Most workarounds suggested here rely on having LSTM layers (I don't have any) or on data preprocessing (I don't do any). I hit the problem both when saving the full model and when saving only the weights.
I have built a few similar models and the problem does not appear in all of them consistently, so I don't know how to provide a minimal working example. I would be willing to do any testing to help solve the bug.
It is currently a major issue for me because I rely on Keras for my research and after this bug I have found myself unable to continue working.
I am using Keras 2.1.1 under Python 3.5.2. I have found the problem both with the Tensorflow (1.2.0) and Theano (0.9.0) backends.
Please try loading the model first and then the weights, using:
Save code
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)
model.save_weights("model.h5")
print("Saved model to disk")
Load code
json_file = open('model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)
loaded_model.load_weights("model.h5")
print("Loaded model from disk")
reference: https://machinelearningmastery.com/save-load-keras-deep-learning-models/
The problem is: if you are using the Tokenizer (from Keras), Keras assigns a unique index to each word, but if you load the model and fit the tokenizer again it assigns different indices to the words. The solution is to save the original word_index and load it back into the tokenizer.
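A minimal sketch of that fix (file name is a placeholder): save word_index once after fitting, and put it back before vectorizing new text.
import json

# after fitting the tokenizer on the training corpus:
with open('word_index.json', 'w') as f:
    json.dump(tokenizer.word_index, f)

# in a later session, before calling texts_to_sequences:
with open('word_index.json', 'r') as f:
    tokenizer.word_index = json.load(f)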
I had the same problem. It turns out the problem wasn't with my LSTM, but with my pre-trained word vectors. I preprocessed my corpus using FastText, and since it is a non-deterministic model, each run of Skip-Gram gives a different set of word vectors. Since we are dealing with LSTMs, I'm pretty sure a lot of folks out there are doing some kind of word2vec. Make sure that your word vectors are the same each time. Hope this helps!
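A hedged sketch of pinning the vectors down, assuming gensim's FastText implementation (paths and the corpus variable are placeholders):
from gensim.models import FastText

# train the embeddings once and write them to disk
ft = FastText(sentences=corpus, sg=1)   # sg=1 -> Skip-Gram
ft.save('fasttext.model')

# every later session reloads exactly the same vectors instead of retraining
ft = FastText.load('fasttext.model')
vector = ft.wv['example']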
I am trying to save a simple LSTM model for text classification. The input of the model is padded vectorized sentences.
For saving I'm using the following snippet:
But whenever I try to load the same model I get random accuracy, as if the model were untrained. Same result even when trying to load the weights separately. If I set the weights as above, it works as long as model and model2 are run in the same session (same notebook session). If I serialize lstmweights and try to load them from a different place, I again get results like an untrained model. It seems saving only the weights is not enough. So why is model.save not working? Any pointers?