mlr-org / mlr3pipelines

Dataflow Programming for Machine Learning in R
https://mlr3pipelines.mlr-org.com/
GNU Lesser General Public License v3.0
140 stars 25 forks

Themis package functions #790

Closed mb706 closed 1 month ago

mb706 commented 3 months ago

What other methods exist in the themis package that we could use?

advieser commented 2 months ago

themis implements different methods for handling unbalanced data. This is a summary based on the manual:

Furthermore, the package implements a recipe for one external function:

Lastly, the package implements a recipe step (step_upsample) that has no separate standalone implementation. It replicates rows of a data set so that all levels of a specific factor occur equally often, and it is only intended for training. This is already implemented in mlr3pipelines as PipeOpClassBalancing.
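To make the upsampling idea concrete, here is a minimal base R sketch. It is not how themis or PipeOpClassBalancing implement it; `upsample_rows` is a hypothetical helper written only for illustration, and it assumes every class should be raised to the size of the largest class.

```r
# Hypothetical helper (not part of themis or mlr3pipelines):
# replicate rows so that every level of `target` occurs as often
# as the most frequent level.
upsample_rows <- function(data, target) {
  counts <- table(data[[target]])
  counts <- counts[counts > 0]          # ignore unused factor levels
  n_max <- max(counts)
  idx <- unlist(lapply(names(counts), function(lvl) {
    rows <- which(data[[target]] == lvl)
    extra <- n_max - length(rows)
    # keep all original rows, then draw the shortfall with replacement
    c(rows, rows[sample.int(length(rows), extra, replace = TRUE)])
  }))
  data[idx, , drop = FALSE]
}

set.seed(1)
toy <- data.frame(x = 1:6, y = factor(c("a", "a", "a", "a", "b", "b")))
balanced <- upsample_rows(toy, "y")
table(balanced$y)
```

As with the recipe step, something like this would only be applied to training data, since replicated rows in a test set would distort performance estimates.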

Of course, implementing these in mlr3pipelines wouldn't be necessary if pipelines were generally interoperable with tidymodels (https://github.com/mlr-org/mlr3pipelines/issues/490). That issue refers to interoperability in the sense that a pipeline could be used as a step in a tidymodels recipe.

advieser commented 2 months ago

Of these, the following are also implemented in smotefamily:

and additionally:

mb706 commented 1 month ago

We have the themis content itself already, so I will close this for now.