Rdatatable / data.table

R's data.table package extends data.frame:
http://r-datatable.com
Mozilla Public License 2.0

Make `copy()` a shallow copy #5965

Open markfairbanks opened 8 months ago

markfairbanks commented 8 months ago

Per this comment https://github.com/Rdatatable/data.table/issues/5129#issuecomment-926181331, making `copy()` a shallow copy was on the to-do list at some point. As far as I could see, there is no open issue tracking it.

> Just to reply on inplace in point 3 and the comments about copying being expensive. I've been working on making copy() be a shallow copy. Then it won't be expensive. There was talk of exporting shallow() but I always wanted to make copy() be shallow and just have copy(). That way, existing usage of copy() becomes faster too. After a copy(), the first := on a column would need to copy that column. It's a bit tricky for multiple reasons but that's the direction I'm working on at the moment.
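
For context, here is a minimal sketch of how `copy()` behaves today, using the exported `address()` helper to check whether two tables share a column vector. The table and the variable names (`alias`, `deep`) are made up for illustration. Under the proposal quoted above, the `deep` table would presumably keep sharing its columns with `dt` until the first `:=` on a given column, but that is an assumption about an unimplemented design, not something this code demonstrates.

```r
library(data.table)

dt <- data.table(a = c(1, 2, 3), b = c(4, 5, 6))

alias <- dt        # plain assignment: no copy, both names refer to the same table
deep  <- copy(dt)  # current copy(): every column vector is duplicated up front

# Compare the memory addresses of column `a` in each table.
same_as_alias <- address(alias$a) == address(dt$a)  # shared vector
same_as_deep  <- address(deep$a)  == address(dt$a)  # separate vector

# Modifying the deep copy by reference leaves the original untouched.
deep[, a := a * 10]

print(same_as_alias)
print(same_as_deep)
print(dt$a)
print(deep$a)
```

The cost being discussed is the up-front duplication of every column in `copy(dt)`, even when only one column is modified afterwards.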

TysonStanley commented 8 months ago

I think this would be ideal, but I don't know what breaking changes it might imply.

D3SL commented 8 months ago

Would this break using copy() inside a function so you can perform operations by reference without affecting the global environment, or is shallow copying smart enough to spot that?
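
For concreteness, the pattern in question might look like the sketch below. `add_total` and the column names are hypothetical, not part of data.table. With the current `copy()`, the caller's table is not modified.

```r
library(data.table)

# Hypothetical helper: works by reference on a private copy of its input.
add_total <- function(dt) {
  local_dt <- copy(dt)          # isolate the caller's table from := below
  local_dt[, total := a + b]    # adds a column to local_dt only
  local_dt
}

dt  <- data.table(a = 1:3, b = 4:6)
res <- add_total(dt)

print(names(dt))   # the caller's table keeps its original columns
print(names(res))  # the returned table has the extra column
```

Any shallow `copy()` would need to preserve exactly this guarantee: `dt` must look the same before and after the call.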

markfairbanks commented 8 months ago

The idea is that it wouldn't break existing uses of `copy()`. Behind the scenes, it would copy a column only when that column is actually modified.

jangorecki commented 8 months ago

This is possible only if we move to reference counting, which means depending on R 4.0(?)