pwwang / datar

A Grammar of Data Manipulation in python
https://pwwang.github.io/datar/
MIT License
271 stars, 17 forks

[ENH] Adding transformation functions oriented to modeling. #200

Open coforfe opened 9 months ago

coforfe commented 9 months ago

Feature Type

Problem Description

datar already includes many useful functions that help a lot with data cleansing and transformation.

The idea is to add new functions that take this transformation process further by taking into account how the variables are used in modeling.

Feature Description

The idea would be to replicate transformation functions like the ones included in the recipes package in R.

datar already provides a way to chain functions together. The idea would be to add these new functions so that they can also be chained as a set of transformation steps. This would give datar an important advantage in the modeling process, since Python has no library of this kind whose steps can be chained. Chaining the steps greatly aids the maintainability and reproducibility of the code.
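To make the idea concrete, here is a minimal sketch in plain Python of what chained steps could look like. It is not datar's API: the `Recipe` class and its methods are hypothetical, and only the names `step_center`, `step_scale`, `prep` and `bake` are borrowed from R's recipes package. Data is a plain dict of lists to keep the sketch self-contained.

```python
from statistics import mean, stdev


class Recipe:
    """Hypothetical container for an ordered chain of transformation steps."""

    def __init__(self):
        self.steps = []   # (column, learn_fn, apply_fn) in the order they were added
        self.params = []  # one learned value per step, filled in by prep()

    def step_center(self, column):
        # Learn the column mean, then subtract it.
        self.steps.append((column, mean, lambda xs, m: [x - m for x in xs]))
        return self  # returning self is what makes the steps chainable

    def step_scale(self, column):
        # Learn the sample standard deviation, then divide by it.
        self.steps.append((column, stdev, lambda xs, s: [x / s for x in xs]))
        return self

    def prep(self, data):
        """Learn every step's parameter from the training data, in order."""
        self.params = []
        current = {name: list(values) for name, values in data.items()}
        for column, learn, apply in self.steps:
            param = learn(current[column])
            self.params.append(param)
            # Later steps must see the output of earlier ones.
            current[column] = apply(current[column], param)
        return self

    def bake(self, data):
        """Apply the already-learned steps to any data set."""
        out = {name: list(values) for name, values in data.items()}
        for (column, _, apply), param in zip(self.steps, self.params):
            out[column] = apply(out[column], param)
        return out


train = {"x": [1.0, 2.0, 3.0]}
recipe = Recipe().step_center("x").step_scale("x").prep(train)
baked_train = recipe.bake(train)
baked_new = recipe.bake({"x": [4.0]})  # reuses the mean and sd learned from train
```

The point of separating `prep` from `bake` is that the parameters are learned once on the training data and then reused unchanged on new data, which is what makes the chain reproducible.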

The kind of step_*() functions I am referring to are documented here:

Thanks in advance, Carlos.

Additional Context

No response

pwwang commented 9 months ago

Looks like this requires a lot of ad-hoc coding. Are there any Python packages doing this? If so, we can wrap them; otherwise, I may not be available to implement them in the near future.

But of course, PRs are welcome.

coforfe commented 9 months ago

Thanks for your quick answer!

Yes, this package is the one that has most of the transformations already built in, as functions that can be wrapped.
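As a rough sketch of what "wrapping" could mean, assuming the wrapped object exposes `fit()` and `transform()` methods: the adapter below makes such an object usable on the right-hand side of `>>`, the operator datar uses for piping. Both `MinMaxScaler` (a stand-in written here, not the package mentioned above) and the `step` adapter are hypothetical, and data is a plain dict of lists rather than a data frame.

```python
class MinMaxScaler:
    """Stand-in for a third-party transformer with fit()/transform()."""

    def fit(self, values):
        self.lo, self.hi = min(values), max(values)
        return self

    def transform(self, values):
        span = self.hi - self.lo
        return [(v - self.lo) / span for v in values]


class step:
    """Hypothetical adapter: makes a fit/transform object usable with `>>`."""

    def __init__(self, transformer, column):
        self.transformer = transformer
        self.column = column

    def __rrshift__(self, data):
        # Called for `data >> step(...)` because dict defines no `>>` itself.
        out = dict(data)  # shallow copy so the input is left untouched
        values = data[self.column]
        out[self.column] = self.transformer.fit(values).transform(values)
        return out


data = {"x": [0.0, 5.0, 10.0], "y": [1, 2, 3]}
result = data >> step(MinMaxScaler(), "x")
```

This version fits and transforms on the same data in one go; a real wrapper would probably need to keep the fitted transformer around so it can be reapplied to new data.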

Thanks again, Carlos.