This came up last week during the tidymodels workshop, and Max suggested that I open an issue.
Sometimes a dataset contains a mix of qualitative character variables and dummy-encoded variables. If we need to homogenize the data, a step_ function that reverses the dummy encoding may be useful: something that uses tidyselect for variable selection and takes the name of the feature being described.
For example, going from this:
| species | arboreal | terrestrial |
|---------|----------|-------------|
| sp a    | 0        | 1           |
| sp b    | 1        | 0           |
| sp c    | 1        | 0           |
to this:
| species | locomotion  |
|---------|-------------|
| sp a    | terrestrial |
| sp b    | arboreal    |
| sp c    | arboreal    |
There are many ways to implement this. I have a silly write-up here, but a base approach would be better.
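To make the request concrete, here is a minimal base R sketch of the transformation, assuming the selected columns are mutually exclusive 0/1 indicators (exactly one 1 per row). The name `undummy()` and its arguments `cols` and `name` are hypothetical and not an existing recipes function. An actual step_ would presumably wrap logic like this in its own prep() and bake() methods and use tidyselect instead of a character vector.

```r
# Hypothetical helper (not part of recipes): collapse mutually exclusive
# 0/1 dummy columns into a single factor column.
undummy <- function(data, cols, name) {
  dummies <- as.matrix(data[cols])

  # Assumption: every row has exactly one 1, otherwise the result is ambiguous
  stopifnot(all(dummies %in% c(0, 1)), all(rowSums(dummies) == 1))

  # max.col() gives the column index of the largest value in each row,
  # which is the position of the 1 under the assumption above
  data[[name]] <- factor(cols[max.col(dummies, ties.method = "first")], levels = cols)

  # Drop the original dummy columns
  data[setdiff(names(data), cols)]
}

dat <- data.frame(
  species     = c("sp a", "sp b", "sp c"),
  arboreal    = c(0, 1, 1),
  terrestrial = c(1, 0, 0)
)

out <- undummy(dat, cols = c("arboreal", "terrestrial"), name = "locomotion")
print(out)
```

Rows that are all zeros (a dropped reference level) or that contain more than one 1 fail the check on purpose; how a real step should treat those cases is an open design question.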
If this already exists and I missed it because of my unfamiliarity with ML terms, please disregard.