lacava closed this issue 7 years ago
I think it would be cleaner to use a Pipeline for this:
from sklearn.linear_model import LassoLarsCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
# Pipeline documents `steps` as a list of (name, estimator) tuples
steps = [("scaler", StandardScaler()), ("estimator", LassoLarsCV())]
model = Pipeline(steps)
The API is still the same: model.fit(x_train, y_train) and model.predict(x_test).
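To make that concrete, here is a minimal sketch of the fit/predict round trip. The data is synthetic and made up purely for illustration; the point is that the fitted scaler lives inside the pipeline and is reapplied automatically at prediction time.

```python
import numpy as np
from sklearn.linear_model import LassoLarsCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data (an assumption for this example only).
rng = np.random.RandomState(0)
x = rng.normal(loc=5.0, scale=3.0, size=(200, 4))
y = 2.0 * x[:, 0] - 1.0 * x[:, 2] + rng.normal(scale=0.1, size=200)
x_train, x_test = x[:150], x[150:]
y_train, y_test = y[:150], y[150:]

model = Pipeline([("scaler", StandardScaler()), ("estimator", LassoLarsCV())])

# fit() fits the scaler on x_train, then fits the estimator on the scaled data.
model.fit(x_train, y_train)

# predict() reuses the stored scaler before calling the estimator.
y_pred = model.predict(x_test)

# The fitted scaler stays accessible by its step name.
scaler = model.named_steps["scaler"]
```

Because the scaler is fitted only inside fit(), its statistics come from the training data alone, which is what you want when predicting on unseen samples.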
You could even write a Transformer which takes a set of expressions/functions and transforms x to the features.
steps = [("features", MyTransformer(exprs)), ("scaler", StandardScaler()), ("estimator", LassoLarsCV())]
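A rough sketch of what such a transformer could look like. MyTransformer and exprs are just the placeholder names from above, not an existing class; here I assume each expression is a plain Python callable that maps the input array to one feature column. The example expressions and data are invented for illustration.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LassoLarsCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class MyTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical transformer: applies each callable in `exprs` to x."""

    def __init__(self, exprs):
        self.exprs = exprs

    def fit(self, x, y=None):
        # Stateless: nothing is learned from the data.
        return self

    def transform(self, x):
        x = np.asarray(x, dtype=float)
        # Each expression maps the (n_samples, n_inputs) array to one column.
        return np.column_stack([f(x) for f in self.exprs])


# Example expressions (assumptions). Named functions rather than lambdas,
# so the pipeline can still be pickled later.
def x0_squared(x):
    return x[:, 0] ** 2


def x0_times_x1(x):
    return x[:, 0] * x[:, 1]


def sin_x1(x):
    return np.sin(x[:, 1])


exprs = [x0_squared, x0_times_x1, sin_x1]

# Synthetic data whose target depends on two of the expressions.
rng = np.random.RandomState(0)
x = rng.uniform(-2.0, 2.0, size=(200, 2))
y = 3.0 * x[:, 0] ** 2 - 2.0 * x[:, 0] * x[:, 1] + rng.normal(scale=0.05, size=200)

model = Pipeline([
    ("features", MyTransformer(exprs)),
    ("scaler", StandardScaler()),
    ("estimator", LassoLarsCV()),
])
model.fit(x, y)
features = model.named_steps["features"].transform(x)
```

With this in place, predict() no longer needs to transform the inputs by hand; the pipeline runs the feature step, the scaler, and the estimator in order.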
Using the model down the line becomes much simpler, e.g. saving it and using it for estimation in a different context, as everything you need is contained in the pipeline object.
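For example, the whole fitted pipeline can be saved and restored as one object. This is a sketch using joblib (which scikit-learn already depends on); the file name and the synthetic data are assumptions for the example.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LassoLarsCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, for illustration only.
rng = np.random.RandomState(0)
x_train = rng.normal(size=(100, 3))
y_train = x_train[:, 0] - 2.0 * x_train[:, 1] + rng.normal(scale=0.1, size=100)
x_test = rng.normal(size=(20, 3))

model = Pipeline([("scaler", StandardScaler()), ("estimator", LassoLarsCV())])
model.fit(x_train, y_train)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.joblib")  # hypothetical file name
    joblib.dump(model, path)        # scaler and estimator are saved together
    restored = joblib.load(path)    # e.g. in a different process or script

# The restored pipeline scales and predicts without any extra setup.
restored_pred = restored.predict(x_test)
```

Note that these files are pickles under the hood, so they should only be loaded from trusted sources and with a compatible scikit-learn version.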
That's a good point. We should use the sklearn Pipeline for this and for our transformations. Right now predict() manually transforms and then calls predict on the best estimator; it should all be combined into one sklearn Pipeline.
fixed in commit 9124540
Normalize feature transformations automatically before feeding them into the ML fit method. Store the transformer so that it can be used in prediction/transformation as well.