adriangb / scikeras

Scikit-Learn API wrapper for Keras.
https://www.adriangb.com/scikeras/
MIT License

Initializing a pretrained model without using training data #284

Closed sfsy1 closed 1 year ago

sfsy1 commented 1 year ago

I'm trying to load a pretrained model to be used in deployment, and I'm doing the following as specified in the docs: https://www.adriangb.com/scikeras/refs/heads/master/notebooks/Basic_Usage.html#4.2-Saving-using-Keras%E2%80%99-saving-methods

# Load the model back into memory
new_reg_model = keras.models.load_model("/tmp/my_model")
# Now we need to instantiate a new SciKeras object
# since we only saved the Keras model
reg_new = KerasRegressor(new_reg_model)
# use initialize to avoid re-fitting
reg_new.initialize(X_regr, y_regr)  # <----- this line
pred_new = reg_new.predict(X_regr)
np.testing.assert_allclose(pred_old, pred_new)

However, it requires that I use the training data X_regr and y_regr in order to initialize the regressor. Is there any way to load the model without using training data?

adriangb commented 1 year ago

You need to give it data of the same shape. This is used to tell the Scikit-Learn wrapper how many inputs and outputs the model has. For classifiers, if any of the targets need to be one-hot encoded, we also need a sample containing every class, but for regressors we quite literally just record the dimensions: https://github.com/adriangb/scikeras/blob/0144439a10cfb8b82bdd57730bc9c91904217af8/scikeras/utils/transformers.py#L336
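
In practice, for a regressor this suggests that placeholder data should be enough. The following is a minimal sketch, not an official recipe: `make_placeholder_data` and `load_regressor` are hypothetical helper names, and the feature count, output count, and `float32` dtype are assumptions you would replace with whatever the model was trained on. It also assumes the wrapper may compare input shape and dtype at predict time, so the placeholders should mirror the real data as closely as possible.

```python
import numpy as np


def make_placeholder_data(n_features, n_outputs=1, dtype="float32"):
    """Build one all-zero sample shaped like the training data (hypothetical helper)."""
    X = np.zeros((1, n_features), dtype=dtype)
    # Keep y 1-D for a single output and 2-D otherwise, mirroring the original targets.
    y = np.zeros((1,) if n_outputs == 1 else (1, n_outputs), dtype=dtype)
    return X, y


def load_regressor(path, n_features, n_outputs=1):
    """Wrap a saved Keras model in KerasRegressor without the training set."""
    # Imported lazily so the helper above can be used without TensorFlow installed.
    from tensorflow import keras
    from scikeras.wrappers import KerasRegressor

    model = keras.models.load_model(path)
    reg = KerasRegressor(model)
    X, y = make_placeholder_data(n_features, n_outputs)
    # initialize() sets up the wrapper from the data's shape without re-fitting.
    reg.initialize(X, y)
    return reg
```

For example, if the saved model was trained on 10 features and a single target, `load_regressor("/tmp/my_model", n_features=10)` would return a wrapper ready for `predict`. This does not carry over to classifiers, where the placeholder targets would need to contain every class.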

sfsy1 commented 1 year ago

I see, thanks!