Probably the most common operation in R is subsetting, i.e. `[` and `$`. While trying out `ddR` and reading `ops.R`, I noticed that the subsetting operators all call `collect`.
How would one return a distributed object from subsetting? For example, suppose I remove the 10% of rows of a `dframe` that contain `NA`. The resulting `dframe` will still be large, so it's best to keep it distributed.
One idea is to make `DObject`s closed under subsetting, so that subsetting a distributed object returns another distributed object, and leave it to the user to call `collect()` explicitly.
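
To make the idea concrete, here is a minimal base-R sketch of what "closed under subsetting" could look like. It does not use ddR at all: a plain list of data frames stands in for the partitions of a `dframe`, and `make_parts`, `filter_parts` and `collect_parts` are hypothetical names I made up for illustration, not ddR functions.

```r
# Hypothetical stand-in for a distributed data frame: a list of partitions.
make_parts <- function(df, nparts) {
  # Assign consecutive rows to partitions 1..nparts.
  idx <- sort(rep(seq_len(nparts), length.out = nrow(df)))
  unname(split(df, idx))
}

# Subsetting that stays "distributed": filter each partition on its own
# and return the list of partitions without combining them.
filter_parts <- function(parts, keep) {
  lapply(parts, function(p) p[keep(p), , drop = FALSE])
}

# Explicit, user-initiated materialization (the role collect() would play).
collect_parts <- function(parts) {
  do.call(rbind, parts)
}

df <- data.frame(x = c(1, NA, 3, 4, NA, 6), y = letters[1:6])
parts <- make_parts(df, nparts = 3)

# Drop rows containing NA, partition by partition.
clean <- filter_parts(parts, complete.cases)

length(clean)               # 3: still three partitions
nrow(collect_parts(clean))  # 4: rows left after dropping the NA rows
```

The point is only that the filter runs per partition and the result keeps the same partitioned shape; nothing is gathered until `collect_parts()` is called.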