Open ttrodrigz opened 1 year ago
Are you sure you are not able to pass what you need to `quantile()` in `step_discretize()` via the `options` argument?
Out of curiosity @EmilHvitfeldt, is there an `options` argument that would do this? I could not find one and would love to get integer output!
Feature
This idea came to mind after posting here on the Posit Community page, which was turned into PR #1075.
I believe it would be useful to have a `step_quantile()` recipe step. I realize that `step_discretize()` already exists, but it would be nice to have some additional control via the optional arguments available in `stats::quantile()`, such as the `type` argument.
This function would work very similarly to `step_percentile()`, the main difference being that after the breakpoints are created with `quantile()`, the data would then be passed along to `cut()` rather than creating the percentiles with `approx()`, and the result would be an integer value.
I have a working example below which even incorporates @EmilHvitfeldt's implementation of the `outside` argument.
A few notes on default settings chosen for the function:
- In `cut()`, `include.lowest` is set to `FALSE` by default. I believe this is sub-optimal for this recipe step because this setting would result in the minimum value in the data receiving a value of `NA`. This is why in the code below I set `include.lowest = TRUE` by default.
- `labels = FALSE` is set to make sure an integer is returned; this overrides any user input.
- `na.rm = TRUE` is the default for `quantile_options`.
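To make the defaults above concrete, here is a minimal base R sketch of the quantile-then-cut idea. It is not the reprex from this issue and not part of recipes: the helper name `quantile_bin()` and its argument layout are assumptions made purely for illustration, and it leaves out the `outside` handling mentioned earlier.

```r
# Hypothetical helper (illustration only, not a recipes function):
# bin a numeric vector into integer codes using quantile breakpoints.
quantile_bin <- function(x,
                         probs = seq(0, 1, 0.25),
                         quantile_options = list(na.rm = TRUE),
                         include.lowest = TRUE) {
  # Breakpoints come from stats::quantile(); extra options such as `type`
  # are forwarded through `quantile_options`.
  breaks <- do.call(
    stats::quantile,
    c(list(x = x, probs = probs, names = FALSE), quantile_options)
  )
  # labels = FALSE makes cut() return integer bin codes instead of a factor.
  cut(x, breaks = breaks, labels = FALSE, include.lowest = include.lowest)
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# With include.lowest = TRUE the minimum value lands in the first bin.
quantile_bin(x)

# With cut()'s own default (include.lowest = FALSE) the minimum becomes NA.
quantile_bin(x, include.lowest = FALSE)

# Any other quantile() option, e.g. `type`, can be passed the same way.
quantile_bin(x, quantile_options = list(na.rm = TRUE, type = 1))
```

The second call shows the behaviour described in the first bullet: because the lowest breakpoint equals the minimum of the data and intervals are open on the left, the minimum falls outside every bin unless `include.lowest = TRUE`.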
Thanks for taking the time to read this!
Reprex
Created on 2023-01-06 by the reprex package (v2.0.1)