Closed markfairbanks closed 2 years ago
FYI checks are passing now that #393 is merged (so we're definitely good for review).
I don't follow the numeric substitution logic. I don't think I'd expect step_setnames(lazdt, c(4, NA), c('s', 'j'))
to rename the second column toj
. Is it related to the same issue that we'd want that behavior?
Do we even need to change step_setnames
? It seems that pivot_wider()
already attempts to replicate the data.table varnames created by dcast
by creating new_vars
, it's just wrong in this edge case. If we add
new_vars <- vctrs::vec_assign(new_vars, is.na(new_vars), "NA")
right after this:
Does that solve the issue?
I don't follow the numeric substitution logic. I don't think I'd expect step_setnames(lazdt, c(4, NA), c('s', 'j')) to rename the second column toj. Is it related to the same issue that we'd want that behavior?
Hmm now that you mention it it is probably better to just ignore the numeric case in replace_setnames_na()
. Not sure why I thought it was necessary to build in 😅
@eutwt ready for review again.
Do we even need to change step_setnames?
I think it's a good idea to fix this in step_setnames()
so that it is fixed in all cases in the future. If you still think the fix should be in pivot_wider()
let me know, I'm open to discuss more.
In my mind it would be cleaner if the step_setnames
function took valid old
/new
names and just applied them. It feels a little unintuitive that it would be changing your old
/new
names for you, possibly increasing confusion if we somehow have unintentionally missing values in input vectors, see examples below.
But, this is an extremely minor point, up to opinion, and the function is internal anyway. I think this is fine to merge as-is
It feels a little unintuitive that it would be changing your old/new names for you, possibly increasing confusion if we somehow have unintentionally missing values in input vectors
Wouldn't any workaround we build run into this issue? (like the one mentioned in here)
Basically I see this as being due to the new_vars
vector being mis-constructed, not due to some problem in how the new names are applied. Concretely, the example below still errors in this PR, because the NA->"NA" is done in step_setnames()
which isn't used.
lazy_dt(tibble(x = c('a', NA), y = 2)) %>%
pivot_wider(names_from = 'x', values_from = 'y')
Ah yep, that makes sense.
Closes #394
Pretty sure this is failing checks due to things updated in #393. So we can wait to merge until that is ready to go.