RevolutionAnalytics / dplyr-spark

spark backend for dplyr

improve behavior across multiple sessions #17

Open piccolbo opened 9 years ago

piccolbo commented 9 years ago

What happens right now: all tbls become stale, the src becomes stale, and one needs to refresh everything by hand. Compare with how a plain data frame behaves across sessions:

> z = data.frame(col1 = 1:3, col2 = letters[1:3])
> quit()
Save workspace image? [y/n/c]: n
$ R
> # z is gone
> z = data.frame(col1 = 1:3, col2 = letters[1:3])
> quit()
Save workspace image? [y/n/c]: y
$ R
> print(z)
> # z is still there
> rm(z) # and now z is gone
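
For contrast, here is roughly the same round trip with the Spark backend today. This is a sketch: src_SparkSQL() is this package's source constructor, but the arguments are omitted and the table name "flights" is a placeholder.

> src = src_SparkSQL()
> flights = tbl(src, "flights")
> quit()
Save workspace image? [y/n/c]: y
$ R
> # src and flights come back with the workspace, but the connection
> # inside them is dead: they exist, yet any operation on them fails
> src = src_SparkSQL()           # refresh the src by hand
> flights = tbl(src, "flights")  # refresh each tbl by hand

Wrapping that rebuild in a small helper is easy enough, but the point of this issue is that the user shouldn't have to do it at all.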

There's a third possibility, sketched below: save z to a file and load it later. Now, in an ideal world, we could go through the same use cases with a tbl_SparkSQL and never have z in some butchered state where it exists but is invalid. This may be a pipe dream, because I don't think save is a generic that we could write a method for, but I thought I'd write it down.
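
For completeness, the file route in base R (a minimal sketch using saveRDS/readRDS). Presumably it only helps the plain data frame case: serializing a tbl_SparkSQL would save the R-side object but not the live connection inside it, so the reloaded tbl would be stale in the same way.

> saveRDS(z, "z.rds")
> quit()
Save workspace image? [y/n/c]: n
$ R
> z = readRDS("z.rds")
> print(z)
> # z is back and valid, even though the workspace was not saved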