Open AlexandreGazagnes opened 8 months ago
I think we can achieve the expected behaviour once the following PR is in: https://github.com/scikit-learn/scikit-learn/pull/27722
You can build a pipeline with this SelectThreshold, using a skewness function, followed by a FunctionTransformer wrapping np.log1p, since we don't need to hold any state.
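Until that PR lands, a rough approximation of the same idea is possible with existing tools. The sketch below does not use SelectThreshold (its API is not assumed here). Instead it passes a callable column selector to ColumnTransformer and wraps np.log1p in a FunctionTransformer. The helper name `select_skewed`, the threshold of 1.0, and the toy data are assumptions made for illustration only.

```python
from functools import partial

import numpy as np
import pandas as pd
from scipy.stats import skew
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import FunctionTransformer


def select_skewed(X, threshold=1.0):
    """Hypothetical helper: return the columns whose skewness exceeds `threshold`."""
    X = pd.DataFrame(X)
    skewness = X.apply(lambda col: skew(col.to_numpy()))
    return skewness[skewness > threshold].index.tolist()


# Toy data: "a" is strongly right-skewed, "b" is roughly symmetric.
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.lognormal(size=500), "b": rng.normal(size=500)})

# The callable is evaluated at fit time, so the selected columns are fixed
# from the training data; log1p itself holds no state.
ct = ColumnTransformer(
    [("log", FunctionTransformer(np.log1p), partial(select_skewed, threshold=1.0))],
    remainder="passthrough",
)
out = ct.fit_transform(df)  # log-transformed columns first, then the rest
```

Note that np.log1p is only defined for values greater than -1, so the selected columns are assumed to be non-negative.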
Describe the workflow you want to enable
Using a pipeline and a grid search, I want to check whether it is better to apply log1p to some columns, depending on a skewness threshold.
The code should go like this for the pipeline:
for the grid_search:
and of course:
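Purely as an illustration of this workflow, and not the implementation referred to in this issue, a minimal sketch could look like the following. The class name `SkewLogTransformer`, its `threshold` parameter, the step names, the Ridge model, and the synthetic data are all hypothetical choices made for the example.

```python
import numpy as np
from scipy.stats import skew
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline


class SkewLogTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical transformer: log1p on columns whose skewness exceeds `threshold`."""

    def __init__(self, threshold=1.0):
        self.threshold = threshold

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        # Remember which columns were skewed on the training data.
        self.skewed_ = skew(X, axis=0) > self.threshold
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float).copy()
        X[:, self.skewed_] = np.log1p(X[:, self.skewed_])
        return X


# Synthetic, non-negative, right-skewed features (assumption for the demo).
rng = np.random.default_rng(0)
X = rng.lognormal(size=(200, 3))
y = np.log1p(X) @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=0.1, size=200)

pipe = Pipeline([("skew_log", SkewLogTransformer()), ("model", Ridge())])

# np.inf disables the transform, so the search also covers "no log1p at all".
param_grid = {"skew_log__threshold": [0.5, 1.0, 2.0, np.inf]}
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

Including np.inf in the grid is a design choice: it lets the grid search itself decide whether transforming skewed columns helps at all, rather than only which threshold to use.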
Describe your proposed solution
I have already implemented such a class and it works.
I think the code quality is not yet sufficient for it to be integrated into sklearn, but such a feature would be a good idea.
Indicative source code can be found here: file
Just as an option, here's the code:
Describe alternatives you've considered, if relevant
Additional context
An example notebook can be found here: notebook