picnicml / doddle-model

:cake: doddle-model: machine learning in Scala.
https://picnicml.github.io
Apache License 2.0

Add the unfit method to the estimator API #103

Open inejc opened 5 years ago

inejc commented 5 years ago

When the best-performing model is returned from grid or random search and evaluated on the test set, a user might want to retrain it on the whole dataset with the same hyperparameters. Currently, one has to inspect which hyperparameters were selected and construct a new estimator by hand. This is problematic in some cases, especially in a Pipeline, where the types of the original transformers and the predictor are lost. For that reason, we want to expose an .unfit() method that creates an unfitted estimator with the same hyperparameters.

Example usage:

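// hold out a test set, then select hyperparameters on the training part only
val split = splitData(x, y)
val selectedModel = gridSearch(split.xTr, split.yTr)
// estimate generalization performance on the held-out part
val score = f1Score(split.yTe, selectedModel.predict(split.xTe))
// drop the learned state, keep the hyperparameters, retrain on all data
val finalModel = selectedModel.unfit().fit(x, y)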
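
To make the proposal more concrete, here is a minimal sketch of what unfit() could look like on a toy estimator. The Estimator trait and the ShrunkenMean class are hypothetical and are not doddle-model's actual API. The sketch also assumes that calling fit on an already fitted estimator is rejected, which is what makes unfit() necessary in the first place.

```scala
// Hypothetical, simplified estimator API; doddle-model's real traits differ.
trait Estimator[A <: Estimator[A]] {
  def isFitted: Boolean
  def fit(x: Seq[Double], y: Seq[Double]): A
  // Returns a copy with the same hyperparameters and no learned state.
  def unfit(): A
}

// Toy predictor (illustration only): always predicts the training mean of y,
// scaled by (1 - shrinkage). `shrinkage` is the hyperparameter, `mean` is the
// learned state.
final case class ShrunkenMean(shrinkage: Double, mean: Option[Double] = None)
    extends Estimator[ShrunkenMean] {

  def isFitted: Boolean = mean.isDefined

  def fit(x: Seq[Double], y: Seq[Double]): ShrunkenMean = {
    // Assumption of this sketch: refitting a fitted estimator is an error.
    require(!isFitted, "call unfit() before fitting again")
    require(y.nonEmpty, "y must not be empty")
    copy(mean = Some((1.0 - shrinkage) * y.sum / y.size))
  }

  // Keep the hyperparameters, discard whatever was learned.
  def unfit(): ShrunkenMean = copy(mean = None)

  def predict(x: Seq[Double]): Seq[Double] = {
    require(isFitted, "fit the estimator before predicting")
    x.map(_ => mean.get)
  }
}
```

Because unfit() returns the estimator's own type, the caller never has to name or inspect the hyperparameters. A Pipeline could presumably implement the same method by calling unfit() on each of its steps, though that part is not shown here.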