Rdatatable / data.table

R's data.table package extends data.frame:
http://r-datatable.com
Mozilla Public License 2.0

When 'by' is missing, and j contains .SD, it needn't be copied, I think. #1735

Open arunsrinivasan opened 8 years ago

arunsrinivasan commented 8 years ago
require(data.table)
dt = data.table(x=1:5, y=6:10, z=11:15)
dt[, names(dt) := lapply(.SD, function(x) x/mean(x))]

Here, .SD is identical to dt, but it is deep copied. After that, all the columns from the RHS are deep copied (duplicated) once again. I think we can avoid this by shallow copying .SD in such scenarios.

I can't really think of a scenario where this would pose a problem.

MichaelChirico commented 7 months ago

I believe this is already being done, namely here:

https://github.com/Rdatatable/data.table/blob/a2213177283f0f15823e1ff823c1fdf63746da3d/R/data.table.R#L1307

But I'm not sure how to confirm it, or how to write a regression test guaranteeing it. Any ideas?
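
One possible way to check, offered as a rough sketch rather than a definitive test: compare the memory addresses of the columns of dt with the addresses of the columns of .SD as seen inside j, using data.table's exported address() function. If they match, .SD shares its columns with dt (a shallow copy); if they differ, the columns were duplicated. The variable names below are made up for illustration. This probes a plain j expression without by, not the `:=` call from the original report, which may take a different code path.

```r
library(data.table)

dt = data.table(x = 1:5, y = 6:10, z = 11:15)

# Addresses of the column vectors as stored in dt.
addr_dt = vapply(dt, address, character(1))

# Addresses of the column vectors of .SD, observed inside j with no 'by'.
addr_sd = dt[, vapply(.SD, address, character(1))]

# Control: copy() is documented to deep copy, so these addresses must differ.
addr_copy = vapply(copy(dt), address, character(1))

# TRUE would suggest .SD shares columns with dt (no deep copy in this case).
sd_shares_columns = identical(unname(addr_dt), unname(addr_sd))
print(sd_shares_columns)

# The control should share no column address with the original.
control_differs = !any(addr_copy %in% addr_dt)
print(control_differs)
```

If this approach holds up, a regression test could assert that sd_shares_columns is TRUE, with the copy() control guarding against address() reporting something other than the column vectors themselves.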