BUG: rbindlist in scaling should use `use_names = T`

krpaulson commented 3 years ago

Describe the bug I think there is an rbindlist somewhere in scale which needs to use use_names=T.

I am trying to scale by sex with a data.table in which some groups do not have sex-specific rows, and I came across this error. See the toy example below.

To Reproduce

dt <- data.table(
  group = c(1,2,2,2),
  sex = c('all','all','male','female'),
  val = c(2,3,1,1)
)

mapping <- data.table(
  child = c("male", "female"),
  parent = "all"
)

dt2 <- hierarchyUtils::scale(
  dt,
  id_cols = c("sex", "group"),
  value_cols = "val",
  col_stem = "sex",
  col_type = "categorical",
  mapping = mapping,
  missing_dt_severity = "none"
)

Expected behavior Expected output would have group 1 untouched and group 2 scaled by sex.

Screenshots

Column 2 ['group'] of item 2 appears in position 1 in item 1. Set use.names=TRUE to match by column name, or use.names=FALSE to ignore column names. use.names='check' (default from v1.12.2) emits this message and proceeds as if use.names=FALSE for  backwards compatibility. See news item 5 in v1.12.2 for options to control this message.

> dt2
    group sex val
1:    all   2 3.0
2: female   2 1.5
3:   male   2 1.5
4:      1 all 2.0

Desktop (please complete the following information):

docker v2020.0.12

Additional context Am I using the missing_dt_severity = "none" option correctly? Also, should we add a test for this scenario?

krpaulson commented 3 years ago

@chacalle I am not sure if this will still be relevant with the significant pending changes you have to hierarchyUtils, but please take a look or add a test for this scenario.

chacalle commented 3 years ago

Yeah this doesn't seem to be an issue after the changes

ihmeuw-demographics / hierarchyUtils

BUG: rbindlist in scaling should use `use_names = T` #50