Rdatatable / data.table

R's data.table package extends data.frame:
http://r-datatable.com
Mozilla Public License 2.0

Allow type override in := when i is just order() #2925

Open MichaelChirico opened 6 years ago

MichaelChirico commented 6 years ago
DT = data.table(a = 1:10)
DT[order(-a), a := paste0(a)]

This errors saying we should be explicit about type overrides, but:

DT[ , a := paste0(a)]

This code works fine. (The two calls are of course identical in effect here, but the same problem arises when the column update is order-dependent.) Presumably this is because, when i is non-empty, := suspects there could be a type collision from assigning a new type to only a subset of rows.

But an order() query in i is incapable of subsetting: it only permutes the rows, so every row is still assigned. I would therefore expect the behavior to be identical.
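To make the "incapable of subsetting" point concrete, here is a minimal base-R sketch (no data.table needed). It assumes a column with no missing values and the default `na.last = TRUE`; the names `a` and `idx` are only illustrative.

```r
# order() returns a permutation of all row indices, so using it in i
# selects every row exactly once, just in a different sequence.
a <- 1:10
idx <- order(-a)

# Same number of indices as rows ...
length(idx) == length(a)

# ... and, once sorted, exactly 1..n: no row dropped, none repeated.
identical(sort(idx), seq_along(a))
```

Both checks should come out `TRUE`, which is why a type override under `i = order(...)` looks as safe as one with an empty i.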

MichaelChirico commented 3 years ago

I got bit by this again.

Here is what seems like perfectly reasonable code to me:

DT = data.table(a = rep(letters[1:3], 3), b = 1:9)

Now, try to convert a to factor, with levels generated in reverse-b order:

DT[order(-b), a := factor(a, levels = unique(a))]

Actually, nothing is done. Because a is a character column and we're trying to assign a factor to it, [ "smartly" coerces factor->character, resulting in no change at all. (PS this is not at all clear from the verbose output, which could be improved. I was totally at sea until I tried an example where a starts out as integer; there the := fails because it tries to assign integer->character, which is an error.)

As a workaround I'm basically forced to create a placeholder variable like a_factor or something like that.
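A minimal sketch of that placeholder workaround, assuming the data.table package is installed. The column name `a_factor` follows the comment above. The second step relies on the behavior shown earlier in the thread, where := with an empty i is allowed to change a column's type.

```r
library(data.table)

DT = data.table(a = rep(letters[1:3], 3), b = 1:9)

# Step 1: write the factor into a new placeholder column. The column does
# not exist yet, so there is no existing type to collide with.
DT[order(-b), a_factor := factor(a, levels = unique(a))]

# Step 2: with an empty i, replace a wholesale, then drop the placeholder.
DT[, a := a_factor][, a_factor := NULL]

levels(DT$a)
```

The row values of a are unchanged; only the type and the level order differ, at the cost of a temporary extra column.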

MichaelChirico commented 2 months ago

IINM, one workaround for now is to use sort_by():

DT[, a := factor(a, levels = rev(unique(sort_by(a, b))))] # rev for -b ordering
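A small base-R check of the `rev(unique(...))` trick, offered as a sketch, not a definitive statement. It assumes `sort_by(a, b)` on a plain vector behaves like `a[order(b)]`, and uses that form so it also runs on R versions without `sort_by()`. The second data set (`a2`, `b2`) is hypothetical and not from the thread.

```r
# The thread's example data, as plain vectors.
a <- rep(letters[1:3], 3)
b <- 1:9

lv_desc <- unique(a[order(-b)])      # levels in reverse-b order
lv_rev  <- rev(unique(a[order(b)]))  # assumed equivalent of rev(unique(sort_by(a, b)))
identical(lv_desc, lv_rev)           # the two agree for this data

# Hypothetical data where a value reappears at both ends of the b ordering.
a2 <- c("x", "y", "x")
b2 <- 1:3

lv_desc2 <- unique(a2[order(-b2)])
lv_rev2  <- rev(unique(a2[order(b2)]))
identical(lv_desc2, lv_rev2)         # the two differ here
```

Reversing after `unique()` is not always the same as taking `unique()` of the reversed order. When the distinction matters, computing the levels directly from the descending order with an empty i, as in `DT[, a := factor(a, levels = unique(a[order(-b)]))]`, may be the safer form of this workaround.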