smazzanti / mrmr

mRMR (minimum-Redundancy-Maximum-Relevance) for automatic feature selection at scale.
MIT License

FunctionTransformer with mrmr_regression & gridsearchCV Issue #18

Closed calvs01 closed 2 years ago

calvs01 commented 2 years ago

Hello,

I tried to use mrmr_regression in a pipeline with GridSearchCV, to tune the argument K as a hyperparameter, and ran into the following issue:

I wrote a function that returns a DataFrame containing only the reduced feature set, then wrapped it in FunctionTransformer so it could be used as a pipeline step. After adding an arbitrary sklearn model (sklearn.kernel_ridge.KernelRidge) to the pipeline and running GridSearchCV, I got the following error: 'numpy.ndarray' object has no attribute 'columns'. The same error comes up when calling pipe.fit() directly, without GridSearchCV.

I think this is because your function 'parallel_df', which is called in 'f_regression', uses df.columns, and the grid search might be passing it a 2D array instead of a DataFrame. Do you have any suggestions on how to get around this issue?
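For readers hitting the same message, the sketch below reproduces it without mrmr installed. The function `select_first_k` is a hypothetical stand-in for the selector described above, not the mrmr API: it only mimics the fact that the code reads `df.columns`. It shows that a FunctionTransformer passes its input through unchanged, so the error appears whenever the step receives a NumPy array.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import FunctionTransformer


def select_first_k(df, K=2):
    # Hypothetical stand-in for an mrmr-based selector. Like the function in
    # the report, it relies on df.columns, so it only works on a DataFrame.
    return df[list(df.columns[:K])]


# kw_args is how FunctionTransformer forwards extra arguments such as K.
selector = FunctionTransformer(select_first_k, kw_args={"K": 2})

X_df = pd.DataFrame(np.arange(12.0).reshape(4, 3), columns=["a", "b", "c"])

# DataFrame in: the wrapped function sees .columns and works.
out = selector.fit_transform(X_df)

# ndarray in: the wrapped function fails on .columns.
try:
    selector.fit_transform(X_df.to_numpy())
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(list(out.columns))
print(error_message)
```

If this is the cause, the thing to check is what the preceding pipeline step outputs, since that is what the wrapped function receives.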

Thank you very much


calvs01 commented 2 years ago

Hello,

Never mind, the issue was simply the output of StandardScaler...
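As a hedged illustration of that resolution: by default StandardScaler returns a NumPy array even when given a DataFrame, so any later step that reads `.columns` fails. The sketch below shows two ways to keep a DataFrame. The toy data and variable names are assumptions for illustration; `set_output` is only available in scikit-learn 1.2 and later, so the code falls back to rebuilding the DataFrame by hand on older versions.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy data, assumed for illustration only.
X_df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]})

# Default behaviour: column names are lost because the result is an ndarray.
scaled = StandardScaler().fit_transform(X_df)
print(type(scaled).__name__)

scaler = StandardScaler()
if hasattr(scaler, "set_output"):
    # scikit-learn >= 1.2: ask the transformer to return a DataFrame.
    scaler.set_output(transform="pandas")
    scaled_df = scaler.fit_transform(X_df)
else:
    # Older versions: rebuild the DataFrame from the array manually.
    scaled_df = pd.DataFrame(
        scaler.fit_transform(X_df), columns=X_df.columns, index=X_df.index
    )

print(list(scaled_df.columns))
```

Inside a pipeline, the same idea applies to the scaler step itself, so that the feature-selection step after it receives a DataFrame with the original column names.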