harrelfe / Hmisc

Harrell Miscellaneous

implemented formatfun functionality in `cut2` #98

Closed moodymudskipper closed 5 years ago

moodymudskipper commented 5 years ago

Following my email conversation with Prof. Harrell:

Z <- 1000*stats::rnorm(10000)

# initial behavior unchanged
table(cut2(Z, g=4))
# 
# [-3754.07,-677.77) [ -677.77,  -1.72) [   -1.72, 684.06) [  684.06,3805.43] 
#               2500               2500               2500               2500 

# pass extra arguments to the formatting function through ...
table(cut2(Z, g=4, trim = TRUE))
# 
# [-3754.07,-677.77)    [-677.77,-1.72)     [-1.72,684.06)   [684.06,3805.43] 
#               2500               2500               2500               2500 

# change formatting function
table(cut2(Z, g=4, formatfun = formatC))
# 
# [-3.75e+03,-678)     [-678,-1.72)     [-1.72, 684)  [ 684,3.81e+03] 
#             2500             2500             2500             2500 

# formatfun AND ...
table(cut2(Z, g=4, formatfun = signif, digits=1))
# 
# [-4000,-700)    [-700,-2)     [-2,700)   [700,4000] 
#         2500         2500         2500         2500 

# custom formatting function
table(cut2(Z, g=4, formatfun = function(x) paste0("$",round(x))))
# 
# [$-3754,$-678)    [$-678,$-2)     [$-2,$684)   [$684,$3805] 
#           2500           2500           2500           2500

I included support for the formula notation if rlang is installed. It fails explicitly if users try to use the formula notation and rlang is not installed. I can easily remove it.

table(cut2(Z, g=4, formatfun = ~paste0("$",signif(.))))
# 
# [$-3754,$-678)    [$-678,$-2)     [$-2,$684)   [$684,$3805] 
#           2500           2500           2500           2500
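The PR text does not show how the formula is converted, so the following is only a minimal sketch of one way it could work, assuming `rlang::as_function` is used for the conversion. The helper name `as_formatfun` and the error message are hypothetical and not part of Hmisc.

```r
# Hypothetical helper (not part of Hmisc): turn `formatfun` into a function.
# Assumption: formulas are converted with rlang::as_function().
as_formatfun <- function(f) {
  if (inherits(f, "formula")) {
    # Fail explicitly when rlang is missing, as described in the PR text
    if (!requireNamespace("rlang", quietly = TRUE)) {
      stop("formula notation for `formatfun` requires the rlang package")
    }
    f <- rlang::as_function(f)
  }
  # Accepts a function or the name of a function
  match.fun(f)
}

# Plain functions pass through unchanged
as_formatfun(round)(1.6)

# Formula notation, only attempted when rlang is available
if (requireNamespace("rlang", quietly = TRUE)) {
  as_formatfun(~ paste0("$", round(.)))(683.6)
}
```

Keeping the rlang check inside the formula branch means users who never use the formula notation do not need rlang at all.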

My initial reason for this was to be able to use my function format_metric:

devtools::install_github("moodymudskipper/cutr")
table(cut2(Z, g=4, formatfun = cutr::format_metric))
# [-3.75 k,-678)   [-678,-1.72)    [-1.72,684)   [684,3.81 k]
#           2500           2500           2500           2500

My package uses the argument name format_fun in its functions, but I named it formatfun here to be more consistent with the other argument names of cut2.

It was quite straightforward to do. The main challenge is that the "digits" argument is already taken and is applied through options throughout the function. To make it available to every formatting function that can accept it, I used:

  format.args <-
    if (any(c("...", "digits") %in% names(formals(args(formatfun))))) {
      c(digits = digits, list(...))
    } else {
      list(...)
    }

(I used names(formals(args(formatfun))) rather than formalArgs(formatfun) in order to handle primitives.)
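To illustrate the point about primitives, here is a small sketch using `signif`, which is a primitive function. The wrapper name `formatfun_arg_names` is hypothetical and only used for this illustration.

```r
# Hypothetical wrapper (not part of Hmisc) around the expression used in the PR
formatfun_arg_names <- function(f) names(formals(args(f)))

# signif is a primitive, so formals() finds no argument list on it
is.primitive(signif)
formals(signif)               # NULL
methods::formalArgs(signif)   # NULL as well

# args() returns a closure with the same argument list, which formals() can read
formatfun_arg_names(signif)   # "x" "digits"

# Closures work the same way
"digits" %in% formatfun_arg_names(formatC)
"..." %in% formatfun_arg_names(format)
```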

I then replaced the relevant calls to format, e.g. flow <- format(low) becomes flow <- do.call(formatfun, c(list(low), format.args)).
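The dispatch described above can be tried outside of `cut2` with a minimal, self-contained sketch. The function `format_with` and its default of `digits = 3` are assumptions made for this illustration and are not the actual `cut2` code.

```r
# Hypothetical stand-alone version of the dispatch (not the actual cut2 code)
format_with <- function(x, formatfun = format, digits = 3, ...) {
  # Forward `digits` only if formatfun can accept it (directly or through ...)
  format.args <-
    if (any(c("...", "digits") %in% names(formals(args(formatfun))))) {
      c(digits = digits, list(...))
    } else {
      list(...)
    }
  # Equivalent to formatfun(x, <format.args>)
  do.call(formatfun, c(list(x), format.args))
}

# signif accepts digits, so it is forwarded: signif(1234.5678, digits = 2)
format_with(1234.5678, signif, digits = 2)

# This custom function takes only x, so digits is silently dropped
format_with(1234.5678, function(x) paste0("$", round(x)))
```

Building the argument list first and calling through `do.call` keeps a single call site for every formatting function, whether or not it understands `digits`.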