adriangb / scikeras

Scikit-Learn API wrapper for Keras.
https://www.adriangb.com/scikeras/
MIT License

where to enter `validation_data=(test_x, test_y)` when using KerasRegressor() wrapper #310

Open xxl4tomxu98 opened 10 months ago

xxl4tomxu98 commented 10 months ago

With a native Keras Sequential model, I can call `model.fit(..., validation_data=(test_x, test_y), ...)`. After wrapping the model in SciKeras's `KerasRegressor()`, where should I pass the validation data?

xxl4tomxu98 commented 10 months ago

I don't want to use `fit__validation_split` because that would mix training and validation data. I have two completely separate datasets, one for training and one for validation, and I want to use the validation dataset when fitting the SciKeras model.
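For context, the Keras documentation describes `validation_split` as holding out the last fraction of the arrays passed to `fit`, before any shuffling, so the validation set always comes out of the training data. Below is a minimal pure-Python sketch of that behavior. `holdout_tail` is a hypothetical helper written for illustration, not a Keras or SciKeras function.

```python
def holdout_tail(samples, validation_split):
    """Split `samples` the way Keras documents `validation_split`:
    the last fraction is held out for validation, before shuffling.
    (Hypothetical helper for illustration only.)"""
    split_at = int(len(samples) * (1.0 - validation_split))
    return samples[:split_at], samples[split_at:]


# Eight stand-in training samples; 25% of them are held out.
train_x = list(range(8))
fit_part, val_part = holdout_tail(train_x, 0.25)

print(fit_part)  # samples actually used for training
print(val_part)  # samples taken from the same array for validation
```

Both parts come from `train_x`, which is why a split fraction cannot stand in for a separately prepared validation set.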

JonasHeymans commented 5 months ago

Late to the party, but I had the same issue and asked GPT-4 for a response. Here it is; I hope it helps you as it has helped me.

When using the KerasRegressor wrapper from SciKeras, which integrates Keras models into the Scikit-Learn framework, you specify validation data differently than in native Keras. Scikit-Learn's `fit(X, y)` convention has no dedicated slot for validation data, so SciKeras instead forwards extra fit parameters to the underlying Keras model.

To use a separate validation dataset with the KerasRegressor wrapper without resorting to a validation split (which, as you've mentioned, takes the validation set out of the training data), route the validation data to Keras with the `fit__` prefix. SciKeras forwards parameters prefixed with `fit__` to the underlying Keras `Model.fit` call, so you can pass `fit__validation_data=(X_test, y_test)` when constructing the regressor. Extra keyword arguments given directly to `fit`, such as `regressor.fit(X_train, y_train, validation_data=(X_test, y_test))`, are also forwarded to Keras, but the SciKeras documentation discourages that form in favor of the prefixed parameter.

```python
from scikeras.wrappers import KerasRegressor

# Define your Keras model-building function
def build_model():
    model = ...  # Your model definition here
    return model

# Instantiate KerasRegressor with your model; fit__validation_data is routed to Keras's fit
regressor = KerasRegressor(
    model=build_model, epochs=100, batch_size=10,
    fit__validation_data=(X_test, y_test),
)

# Fit the model with your training data
regressor.fit(X_train, y_train)
```

In this example, X_train and y_train are your training data and targets, while X_test and y_test are your validation data and targets. Because `fit__validation_data` is routed to the underlying Keras model's fit call, Keras evaluates the model on this data at the end of each epoch, just as it would in a native Keras setup.

This approach keeps your training and validation datasets separate, so the validation metrics reflect the model's performance on data it was not trained on.
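To make the `fit__` prefix (as in `fit__validation_split`) more concrete, here is a toy sketch of prefix-based parameter routing. It is not SciKeras's actual implementation: `route_fit_params` is a hypothetical helper, the strings stand in for real arrays, and it assumes `fit__validation_data` is routed the same way as other `fit__` parameters.

```python
def route_fit_params(params, prefix="fit__"):
    """Collect the entries whose key starts with `prefix` and strip the prefix.
    (Hypothetical helper; a toy model of routed parameters.)"""
    return {
        key[len(prefix):]: value
        for key, value in params.items()
        if key.startswith(prefix)
    }


# Parameters as they might be given to the wrapper's constructor.
constructor_params = {
    "epochs": 100,
    "batch_size": 10,
    "fit__validation_data": ("X_test", "y_test"),  # placeholders for real arrays
}

# Only the prefixed entry is routed, with the prefix removed.
fit_kwargs = route_fit_params(constructor_params)
print(fit_kwargs)
```

The routed dictionary is what would end up as keyword arguments to the Keras fit call, which is why the validation data never has to pass through Scikit-Learn's `fit(X, y)` signature.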