rasbt / mlxtend

A library of extension and helper modules for Python's data analysis and machine learning libraries.
https://rasbt.github.io/mlxtend/

Using pandas dataframes in bias-variance-decomposition #1070

Open prateek-bricklane opened 8 months ago

prateek-bricklane commented 8 months ago

Describe the workflow you want to enable

Using bias-variance decomposition with pandas DataFrames. Since scikit-learn now supports the pandas API, train/test splits are available as pandas DataFrames in some workflows. Raising an error when these are passed as inputs to bias_variance_decomp forces an extra conversion step outside mlxtend, which breaks the flow of a more general workflow that is otherwise done entirely in pandas DataFrames.

Describe your proposed solution

Instead of raising an error with a message, convert pandas DataFrames to NumPy arrays internally.
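A minimal sketch of what such an internal conversion could look like. The helper name `to_numpy_if_pandas` is hypothetical and not part of mlxtend; the idea is only that each of `X_train`, `y_train`, `X_test`, `y_test` would be passed through something like it before the decomposition runs.

```python
import numpy as np
import pandas as pd


def to_numpy_if_pandas(ary):
    """Hypothetical helper: return a NumPy array for pandas or array-like input."""
    # DataFrame and Series both expose .to_numpy()
    if isinstance(ary, (pd.DataFrame, pd.Series)):
        return ary.to_numpy()
    # Lists and existing arrays go through np.asarray
    return np.asarray(ary)


# Toy data standing in for a train split held in pandas objects
X_train = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})
y_train = pd.Series([0, 1, 0])

X_arr, y_arr = (to_numpy_if_pandas(a) for a in (X_train, y_train))
```

Column names and the index are dropped by the conversion, which is probably fine here since the function only needs positional access to rows.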

Describe alternatives you've considered, if relevant

An alternative would be a package-level config parameter, like scikit-learn's, that handles DataFrames more generally across a range of functionality, either by converting them to NumPy arrays or by implementing functionality that is compatible with DataFrames.
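A rough sketch of the config-based alternative, loosely modelled on the `set_config` / `config_context` pattern in scikit-learn. Everything here is an assumption for illustration: mlxtend has no such config, and the option name `dataframe_input` and the helper `check_array_input` are made up.

```python
from contextlib import contextmanager

import numpy as np
import pandas as pd

# Hypothetical package-level setting: "error" or "convert"
_config = {"dataframe_input": "error"}
_VALID = ("error", "convert")


def get_config():
    """Return a copy of the current (hypothetical) configuration."""
    return dict(_config)


def set_config(dataframe_input=None):
    """Update the configuration; None leaves the option unchanged."""
    if dataframe_input is not None:
        if dataframe_input not in _VALID:
            raise ValueError(f"dataframe_input must be one of {_VALID}")
        _config["dataframe_input"] = dataframe_input


@contextmanager
def config_context(**new_config):
    """Temporarily change the configuration inside a with-block."""
    old = get_config()
    set_config(**new_config)
    try:
        yield
    finally:
        _config.update(old)


def check_array_input(ary):
    """Raise on pandas input or convert it, depending on the config."""
    if isinstance(ary, (pd.DataFrame, pd.Series)):
        if _config["dataframe_input"] == "error":
            raise ValueError("pandas input is not supported; pass a NumPy array")
        return ary.to_numpy()
    return np.asarray(ary)


df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
with config_context(dataframe_input="convert"):
    converted = check_array_input(df)
```

Compared with converting unconditionally, this keeps the current error as the default and lets users opt in, at the cost of more global state to maintain.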

Additional context

rasbt commented 8 months ago

Thanks for the suggestion. This would be a nice addition indeed. Unfortunately, I am a bit overcommitted and don't know when/if I would have time to implement this.

prateek-bricklane commented 8 months ago

Happy to give it a try :)

rasbt commented 8 months ago

If you have time and are interested in working on this, I'd appreciate the contribution 😊