yzhao062 / pyod

A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
http://pyod.readthedocs.io
BSD 2-Clause "Simplified" License

How can I save a pyod model? #88

Open singyaowu opened 5 years ago

singyaowu commented 5 years ago

I've just trained an auto-encoder model, and I wonder how I can save it so that I don't need to train it again the next time I need it. I didn't see any function for saving a model in auto_encoder.py, so I'm not sure whether one exists. Have you implemented such a function?

yzhao062 commented 5 years ago

Agreed that model-saving functionality should be added. Marked as a to-do task. I am not sure whether pickle will work (hopefully yes), and I will run some tests.

osancus commented 5 years ago

When I try to save an AutoEncoder model using pickle, the following error occurs. Any idea how I can fix it?

TypeError: can't pickle _thread.RLock objects

# Code
import pickle

clf = fit_model(X_train)  # fit_model is the poster's own helper (not shown)
with open('./autoencoder.h5', 'wb') as f:
    pickle.dump(clf, f)

yzhao062 commented 5 years ago

@epicsol-inc Sorry for the late response. The AutoEncoder (AE) in pyod is written with Keras, and saving the model can be tricky.

To my understanding, Keras models may not be picklable (https://github.com/keras-team/keras/issues/10528).

If saving the model is a must, you may have to copy the code out of auto_encoder.py directly. Sorry for the inconvenience.
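
To illustrate why the dump fails: pickle walks everything reachable from the object, so a single unpicklable member, such as a thread lock, is enough to make the whole call raise. The sketch below reproduces the error from this thread using only the standard library. `SimpleNamespace` is a hypothetical stand-in for a fitted detector holding a Keras model; it is not pyod's actual class, and the attribute names are assumptions.

```python
import pickle
import threading
from types import SimpleNamespace

# Hypothetical stand-in for a fitted detector: plain attributes pickle fine,
# but the nested "model_" holds a thread lock, as a Keras model may internally.
detector = SimpleNamespace(
    threshold_=0.42,
    model_=SimpleNamespace(lock=threading.RLock()),
)


def try_pickle(obj):
    """Return None on success, or the error message if pickling fails."""
    try:
        pickle.dumps(obj)
        return None
    except TypeError as exc:
        return str(exc)


error = try_pickle(detector)
# The message mentions _thread.RLock; its exact wording varies by Python version.
print(error)
```

The same object without the lock-holding member pickles without complaint, which is why the classical (non-Keras) detectors discussed later in the thread are not affected.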

sbysiak commented 5 years ago

@epicsol-inc I managed to save it using dill (https://pypi.org/project/dill/), whose syntax is very similar to pickle's:

import dill

with open(out_fname, 'wb') as f:
    dill.dump(model, f, dill.HIGHEST_PROTOCOL)

You can check whether it works in your case.

yzhao062 commented 5 years ago

@sbysiak Thanks for the note, much appreciated. I will check it out and consider adding this to the documentation :)

lgo7 commented 5 years ago

Any news regarding saving PyOD models? I need to save an IForest model. Can I use pickle?

yzhao062 commented 5 years ago

> Any news regarding saving PyOD models? I need to save an IForest model. Can I use pickle?

Sorry, I have not tested it yet, but it should work. If pickle does not work, I would suggest using dill (https://pypi.org/project/dill/), as mentioned above.

This is now at the top of my priority list.

lgo7 commented 5 years ago

I used pickle.dump and it worked!

AlexDelPab commented 4 years ago

I've also used pickle.dump() for the classifiers knn, oc-svm, iforest and fabod, and saving and loading them works with:

save: pickle.dump(clf, open(folder + clf_name + '.h5', 'wb'))
load: pickle.loads(open(folder + 'k Nearest Neighbors (kNN).h5', 'rb').read())
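
The one-liners above leave the file handles open until they are garbage-collected. A minimal sketch of the same round trip with context managers follows. The helper names `save_detector` and `load_detector` are hypothetical, and a plain dict stands in for the fitted classifier so the example runs without pyod installed.

```python
import os
import pickle
import tempfile


def save_detector(detector, path):
    """Pickle a fitted detector to `path`; the file is closed even on error."""
    with open(path, "wb") as f:
        pickle.dump(detector, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_detector(path):
    """Load an object written by save_detector. Only load files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in object; with pyod you would pass a fitted detector such as kNN instead.
fitted = {"name": "k Nearest Neighbors (kNN)", "threshold_": 1.5}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "knn.pkl")
    save_detector(fitted, path)
    restored = load_detector(path)

print(restored == fitted)
```

The file extension has no effect on the format: a pickle written to a `.h5` file is still a pickle, so a neutral extension such as `.pkl` is less likely to be mistaken for HDF5.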

bhowmiks commented 4 years ago

Pickle and dill can save the model successfully, but loading from these formats can be time-consuming. For the autoencoder model, I saved the weights as HDF5 and the classifier object as a pickle, for faster loads and less disk space.

import pickle
from pyod.models.auto_encoder import AutoEncoder

autoenModel = AutoEncoder()
autoenModel.fit(X=x_train)

# serialize the Keras model architecture to JSON
model_json = autoenModel.model_.to_json()
with open(model_path + ".json", "w") as json_file:
    json_file.write(model_json)
# serialize the weights to HDF5
autoenModel.model_.save_weights(model_path + "model.h5")

Then set the autoencoder's Keras model to None, which makes the pickled object smaller:

autoenModel.model_ = None
with open(newpath + "//" + model_name + "_model" + '.pickle', 'wb') as handle:
    pickle.dump(autoenModel, handle, protocol=pickle.HIGHEST_PROTOCOL)

Model load

Load the autoencoder instance:

with open(path + "//" + model_n + "_model" + ".pickle", 'rb') as handle:
    loaded_model = pickle.load(handle)

# load json and create model
# assumed import (the original post omits it); tensorflow.keras.models also provides it
from keras.models import model_from_json

with open(path + "//" + model_n + '.json', 'r') as json_file:
    loaded_model_json = json_file.read()
# strip the "ragged" entry from the saved config before rebuilding the model
loaded_model_json = loaded_model_json.replace("\"ragged\": false,", " ")
loaded_model_ = model_from_json(loaded_model_json)
# load weights into new model
loaded_model_.load_weights(path + "//" + model_n + "model.h5")
print("Loaded model from disk")

loaded_model.model_ = loaded_model_  # set the loaded Keras model on the autoencoder instance

This loads almost 5x faster, and the saved model is about 10x smaller.
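
One caveat with the recipe above is that setting `model_` to None also leaves the in-memory object without its network after saving. A minimal sketch of a detach-and-restore variant follows. The helper `pickle_without` is hypothetical, and `SimpleNamespace` with a large list stands in for a fitted AutoEncoder and its Keras model, so only the pattern is shown, not pyod's or Keras's behaviour.

```python
import pickle
from types import SimpleNamespace


def pickle_without(obj, attr_name):
    """Pickle `obj` with `attr_name` temporarily set to None, then restore it."""
    saved = getattr(obj, attr_name)
    setattr(obj, attr_name, None)
    try:
        return pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    finally:
        setattr(obj, attr_name, saved)  # the in-memory object stays usable


# Hypothetical stand-in for a fitted AutoEncoder; "model_" mimics the heavy network.
detector = SimpleNamespace(threshold_=0.42, model_=list(range(10_000)))

full_size = len(pickle.dumps(detector, protocol=pickle.HIGHEST_PROTOCOL))
light_blob = pickle_without(detector, "model_")
restored = pickle.loads(light_blob)

print(len(light_blob) < full_size)  # the stripped pickle is smaller
print(restored.model_ is None)      # the network must be reattached after loading
print(len(detector.model_))         # the original object still has its model
```

With a real AutoEncoder, the detached part would be written separately (architecture as JSON, weights as HDF5, as in the comment above) and reattached to `model_` after unpickling.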

ezzeldinadel commented 3 years ago

loadedmodel = model_from_json(loaded_model_json)

What is model_from_json? Is it this: https://www.tensorflow.org/api_docs/python/tf/keras/models/model_from_json ?

SaqlainHussainShah commented 3 years ago

I have tried the .pkl and .h5 extensions with dill, pickle and joblib, but the issue persists:

Unable to save model can't pickle _thread.RLock objects
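
Changing the file extension does not change how the object is serialized, and joblib relies on pickle internally, so the same member is likely to fail in each case. A small diagnostic sketch for locating that member follows. `find_unpicklable_attrs` is a hypothetical helper, and the `SimpleNamespace` object with its attribute names is a stand-in, not a real pyod detector.

```python
import pickle
import threading
from types import SimpleNamespace


def find_unpicklable_attrs(obj):
    """Return the names of instance attributes that pickle cannot serialize."""
    bad = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception:  # TypeError, pickle.PicklingError, AttributeError, ...
            bad.append(name)
    return bad


# Hypothetical stand-in for a fitted detector (not a real pyod object).
detector = SimpleNamespace(
    contamination=0.1,
    decision_scores_=[0.2, 0.9, 0.4],
    model_=threading.RLock(),
)

culprits = find_unpicklable_attrs(detector)
print(culprits)
```

Once the failing attribute is known, it can be saved through its own mechanism and set to None before pickling the rest of the object.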

lfvillavicencio commented 2 years ago

Hi! Where do you import the model_from_json function from? Thanks.