adriangb / scikeras

Scikit-Learn API wrapper for Keras.
https://www.adriangb.com/scikeras/
MIT License
242 stars, 50 forks

Error when loading scikeras in scikit-learn pipeline on another device #249

Open vittot opened 3 years ago

vittot commented 3 years ago

I have a Keras model wrapped with scikeras inside a scikit-learn pipeline. I dumped it with pickle and would like to load it on another device, but loading it with pickle fails with the following error:

Traceback (most recent call last):
  File "/home/torri/WebCrowd/run.py", line 1, in <module>
    from flaskdemo import app
  File "/home/torri/WebCrowd/flaskdemo/__init__.py", line 82, in <module>
    app.dataset[d] = Dataset(os.path.join(static_path, d, d + "_descr.json"), static_path)
  File "/home/torri/WebCrowd/flaskdemo/dataset.py", line 42, in __init__
    self._load_models(descr['model_paths'], X_train)
  File "/home/torri/WebCrowd/flaskdemo/dataset.py", line 58, in _load_models
    self.models[k] = pickle.load(open(v, 'rb'))
  File "/home/torri/anaconda3/envs/WebCrowdEnv/lib/python3.9/site-packages/scikeras/_saving_utils.py", line 66, in unpack_keras_model
    model: keras.Model = load_model(temp_dir)
  File "/home/torri/anaconda3/envs/WebCrowdEnv/lib/python3.9/site-packages/keras/saving/save.py", line 205, in load_model
    return saved_model_load.load(filepath, compile, options)
  File "/home/torri/anaconda3/envs/WebCrowdEnv/lib/python3.9/site-packages/keras/saving/saved_model/load.py", line 140, in load
    loaded = tf.__internal__.saved_model.load_partial(path, nodes_to_load, options=options)
  File "/home/torri/anaconda3/envs/WebCrowdEnv/lib/python3.9/site-packages/tensorflow/python/saved_model/load.py", line 769, in load_partial
    return load_internal(export_dir, tags, options, filters=filters)
  File "/home/torri/anaconda3/envs/WebCrowdEnv/lib/python3.9/site-packages/tensorflow/python/saved_model/load.py", line 905, in load_internal
    raise FileNotFoundError(
FileNotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ram:///tmp/tmpkq48pxrn/variables/variables
 If trying to load on a different device from the computational device, consider using setting the `experimental_io_device` option on tf.saved_model.LoadOptions to the io_device such as '/job:localhost'.

What should I do? I can't find any option related to experimental_io_device in scikeras.
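For reference, when a SavedModel directory is loaded directly with Keras (outside of pickle), the option named in the error message can be passed roughly as follows. This is a minimal sketch assuming TensorFlow 2.x, where `tf.keras.models.load_model` accepts an `options` argument. The helper name `load_with_local_io` is hypothetical, and this does not change how SciKeras unpickles a model.

```python
def load_with_local_io(saved_model_dir):
    """Load a Keras SavedModel, routing file I/O through the local host."""
    # Imported lazily so that defining this helper does not require TensorFlow.
    import tensorflow as tf

    # experimental_io_device is the option named in the error message.
    options = tf.saved_model.LoadOptions(experimental_io_device="/job:localhost")
    return tf.keras.models.load_model(saved_model_dir, options=options)
```

Whether this resolves the error depends on why the variables files are missing, so it is best treated as a diagnostic step rather than a fix.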

adriangb commented 3 years ago

This is interesting. SciKeras just uses TensorFlow's SavedModel under the hood, so anything that works with TensorFlow should work with SciKeras, but I can't rule out that it's a bug in SciKeras.
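The traceback above shows the model being rebuilt from a temporary directory by `unpack_keras_model` during unpickling. The general idea of carrying a directory-based model inside a pickle can be sketched with the standard library alone. The names `DirBackedModel`, `pack_dir` and `unpack_dir` are hypothetical stand-ins, and this is an illustration of the technique, not SciKeras's actual implementation.

```python
import pickle
import tempfile
from pathlib import Path


def pack_dir(directory):
    """Read every file under a directory into a {relative_path: bytes} dict."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): p.read_bytes()
        for p in root.rglob("*")
        if p.is_file()
    }


def unpack_dir(files):
    """Write the packed files into a fresh temp directory and rebuild the object."""
    target = Path(tempfile.mkdtemp())
    for name, data in files.items():
        dest = target / name
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_bytes(data)
    return DirBackedModel(target)


class DirBackedModel:
    """Stand-in for a model whose state lives in a directory on disk."""

    def __init__(self, directory):
        self.directory = Path(directory)

    def read_weights(self):
        return (self.directory / "variables.txt").read_text()

    def __reduce__(self):
        # The pickle stores the file contents, not the original directory path,
        # so the loading side must be able to recreate the directory itself.
        return unpack_dir, (pack_dir(self.directory),)
```

With this pattern, `pickle.loads(pickle.dumps(model))` yields an object backed by a new temporary directory. If the unpacking step fails to write or find its files on the loading machine, the error surfaces at load time, as in the report above.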

Would you be able to post some sort of example that I can test against? Also, have you tried reloading on the same device?

Thanks!

vittot commented 3 years ago

If I reload it on the same device where it was dumped, it works fine. Note that with Keras's own keras.wrappers.scikit_learn.KerasClassifier wrapper I could not dump it at all, because the build_fn function could not be serialized.

For the moment I have reverted to keras.wrappers.scikit_learn.KerasClassifier, and I save and load it in two steps:

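# Saving: store the Keras model separately, then pickle the rest of the pipeline
predictors[m].named_steps['keras_clf'].model.save(filename + '.h5')
predictors[m].named_steps['keras_clf'].model = None
with open(filename, "wb") as f:
    pickle.dump(predictors[m], f)

# Loading: unpickle the pipeline, then reattach the Keras model if one was saved
with open(v[0], 'rb') as f:
    self.models[k] = pickle.load(f)
if len(v) > 1:
    self.models[k].steps[1][1].model = keras.models.load_model(v[1])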

But I don't like it very much, because it forces me to handle this model differently from the other scikit-learn models I am comparing it with.

It's hard to give you a working example outside the application, but this is the code I used to build and fit it:

import tensorflow as tf
from sklearn.pipeline import Pipeline
from scikeras.wrappers import KerasClassifier  # assumed import; not shown in the original post


def get_nn_classifier():
    # Binary classifier: 18 input features, two hidden layers of 50 ReLU units
    x = tf.keras.Input(shape=[18])
    h1 = tf.keras.layers.Dense(units=50, activation=tf.keras.activations.relu)(x)
    h2 = tf.keras.layers.Dense(units=50, activation=tf.keras.activations.relu)(h1)
    out = tf.keras.layers.Dense(units=1, activation=tf.keras.activations.sigmoid)(h2)
    model = tf.keras.Model(inputs=x, outputs=out)

    loss = tf.keras.losses.BinaryCrossentropy()
    lr = 1e-3
    optimizer = tf.keras.optimizers.Adam(learning_rate=lr)
    metrics = ['accuracy']
    model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
    return model


# preprocessor, X_train and y_train are defined elsewhere in the application
nn_clf = KerasClassifier(build_fn=get_nn_classifier, epochs=16, batch_size=32)
nn_pp = Pipeline(steps=[('features_engineering', preprocessor),
                        ('keras_clf', nn_clf)])

nn_pp.fit(X_train, y_train)

jpgard commented 2 years ago

Also encountered this issue.

The solution for me was to serialize using dill instead of pickle or joblib, as suggested here.

adriangb commented 2 years ago

What operating system are you folks using? And what versions of TensorFlow and SciKeras?
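A small script can collect this information on both the dumping and the loading machine so the two can be compared. This is a minimal sketch using only the standard library. The function name `environment_report` and the default package list are assumptions for illustration.

```python
import platform
from importlib import metadata


def environment_report(packages=("tensorflow", "scikeras", "scikit-learn")):
    """Return the OS, Python version and installed versions of the given packages."""
    report = {"os": platform.platform(), "python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Distribution names can differ from import names, so a miss here
            # does not prove the library is absent.
            report[name] = "not installed"
    return report
```

Printing `environment_report()` on each machine gives a quick side-by-side check of the environments.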

jpgard commented 2 years ago

I am using Ubuntu 18.04.1 with TensorFlow 2.10 and SciKeras 0.8.0. However, the models were created on other machines by other people (using the same conda environment), so I can't be sure which OS was used to create them.

adriangb commented 2 years ago

Is there any chance they're using Windows? I assume that if you dump and load the model on your Ubuntu machine, things work fine, right?