normalize feature transformations

lacava commented 7 years ago

normalize feature transformations automatically before feeding them into the ML fit method. store the transformer so that it can be used in prediction/transformation as well.

lacava commented 7 years ago

- add self.scaler = StandardScaler() to init
- add self._best_scaler to init
- add scaler transformation to transform() method
- update self._best_scaler when a better model is found
- call self._best_scaler when transform fn is used in prediction
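
A rough sketch of how the checklist above might fit together. The class name `FewSketch`, the `_evaluate` helper, and the use of the training score as the selection criterion are all assumptions made for illustration; the actual code in the repository may be organized differently.

```python
import copy

import numpy as np
from sklearn.linear_model import LassoLarsCV
from sklearn.preprocessing import StandardScaler


class FewSketch:
    """Hypothetical stand-in for the estimator; only the scaler bookkeeping is shown."""

    def __init__(self):
        self.ml = LassoLarsCV()
        self.scaler = StandardScaler()  # refit for every candidate feature set
        self._best_scaler = None        # scaler paired with the best model so far
        self._best_estimator = None
        self._best_score = -np.inf

    def _evaluate(self, features, y):
        """Scale candidate features, fit the ML method, keep the pair if it improves."""
        scaled = self.scaler.fit_transform(features)
        self.ml.fit(scaled, y)
        score = self.ml.score(scaled, y)  # training R^2, assumed here as the criterion
        if score > self._best_score:
            self._best_score = score
            # Copy both objects so later refits do not overwrite the best pair.
            self._best_estimator = copy.deepcopy(self.ml)
            self._best_scaler = copy.deepcopy(self.scaler)
        return score

    def predict(self, features):
        """Apply the stored scaler before calling predict on the best estimator."""
        return self._best_estimator.predict(self._best_scaler.transform(features))
```

The point of keeping `_best_scaler` separate from `scaler` is that the working scaler is refit on every candidate, so prediction has to use the copy that was fitted alongside the best model.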

Ohjeah commented 7 years ago

I think it would be cleaner to use a Pipeline for this:

from sklearn.linear_model import LassoLarsCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

steps = [("scaler", StandardScaler()), ("estimator", LassoLarsCV())]
model = Pipeline(steps)

The API is still the same: model.fit(x_train, y_train), model.predict(x_test). You could even write a Transformer that takes a set of expressions/functions and transforms x into the features.

steps = [("features", MyTransformer(exprs)), ("scaler", StandardScaler()), ("estimator", LassoLarsCV())]
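
A minimal sketch of what such a transformer could look like. `MyTransformer` is only the placeholder name used above, and treating `exprs` as a list of callables that each map the input array to one feature column is an assumption, not how the project represents its features.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LassoLarsCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class MyTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: each callable in `exprs` builds one feature column."""

    def __init__(self, exprs):
        self.exprs = exprs

    def fit(self, X, y=None):
        # Nothing to learn; the expressions are fixed.
        return self

    def transform(self, X):
        X = np.asarray(X)
        # Evaluate every expression on X and stack the results column-wise.
        return np.column_stack([f(X) for f in self.exprs])


# Example expressions (assumed): an interaction term and a squared term.
exprs = [lambda X: X[:, 0] * X[:, 1], lambda X: X[:, 0] ** 2]

model = Pipeline([
    ("features", MyTransformer(exprs)),
    ("scaler", StandardScaler()),
    ("estimator", LassoLarsCV()),
])
```

With this in place, model.fit(x_train, y_train) builds the features, scales them, and fits the estimator in one call, and model.predict(x_test) replays the same steps.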

Using the model down the line becomes much simpler, e.g. saving it and using it for estimation in a different context, as everything you need is contained in the pipeline object.
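
As a small illustration of the saving part, here is a sketch that round-trips a fitted pipeline through the standard library's pickle. The synthetic data and variable names are made up for the example.

```python
import pickle

import numpy as np
from sklearn.linear_model import LassoLarsCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic training data, purely for illustration.
rng = np.random.RandomState(0)
x_train = rng.normal(size=(80, 3))
y_train = x_train @ np.array([1.5, -2.0, 0.5]) + 0.01 * rng.normal(size=80)

model = Pipeline([("scaler", StandardScaler()), ("estimator", LassoLarsCV())])
model.fit(x_train, y_train)

# Serialize the whole pipeline; the fitted scaler travels with the estimator.
blob = pickle.dumps(model)
restored = pickle.loads(blob)

x_new = rng.normal(size=(5, 3))
same = np.allclose(model.predict(x_new), restored.predict(x_new))
```

One caveat: a feature step built from lambdas would not pickle with the standard library, so the expressions would need to be module-level functions or some other serializable representation.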

lacava commented 7 years ago

that's a good point, we should use the sklearn Pipeline for this, and for our transformations. right now predict() manually transforms then calls predict on the best estimator. it should all be combined into one sklearn Pipeline.

lacava commented 7 years ago

fixed in commit 9124540

lacava / few

normalize feature transformations #16