pachadotdev / analogsea

Digital Ocean R client
https://pacha.dev/analogsea/
Apache License 2.0

Default to docker #47

Closed hadley closed 9 years ago

hadley commented 9 years ago

This workflow looks pretty compelling: http://www.magesblog.com/2014/09/running-rstudio-via-docker-in-cloud.html

sckott commented 9 years ago

Yeah, looks great. Do you mind if we combine this with the other docker issue, #18? See also @cboettig's comment https://github.com/sckott/analogsea/issues/17#issuecomment-53317455 - Or do you think it's a separate item?

hadley commented 9 years ago

I made the title a little stronger to separate it from those other issues. I think the advantages of docker over ssh'ing text files are clear, and hence using docker in analogsea should be the path of least resistance (you can always opt-out and use droplet_ssh()).

I think that implies there should be a parallel set of functions like:

(where docklet is shorthand for a droplet with docker installed)

Then all the install functions would go away.

cboettig commented 9 years ago

Yup, agree that Docker would be the sensible default.

@hadley Sorry if it wasn't clear from the issues that Scott linked, but that's very close to what I already added in https://github.com/sckott/analogsea/blob/master/R/rstudio.R. I was just matching the syntax of Scott's existing function, but I see the advantage of breaking it into the steps you show there.

Quick editorial note: the rstudio image would be eddelbuettel/debian-rstudio, though I have also written an eddelbuettel/ubuntu-rstudio depending on what flavor of linux one prefers (of course the differences are small). As part of the rocker project we also now have a few images that extend the rstudio image further, particularly since install.packages won't always work on the linux machines if system dependencies are needed. I took the liberty of calling one of these "hadleyverse" to convey its intent to package your ecosystem of packages -- @hadley your feedback would be most welcome (and let me know if you'd prefer it called something else -- shoulda checked with you first).

I wonder if it would be better to have a separate R package to interface with docker, and then build upon those functions in analogsea?

Thoughts?

One more minor note: I've had good luck running RStudio on the tiny 512 MB DigitalOcean instances after increasing swap space, so it may not be necessary to use the 1 GB droplet as a default.

hadley commented 9 years ago

@cboettig ah ok, that makes sense. I want a very thin wrapper for working with docker on digital ocean - it's so simple (~40 lines) that I don't think it needs to be in a separate package. I'll push something up shortly - then the rstudio() function can use those functions to simplify things a little.
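A thin wrapper of the kind described here could be sketched roughly as follows. This is only an illustration, not analogsea's code: `docker_cmd()`, `docker_pull_cmd()`, `docker_run_cmd()` and `docklet_exec()` are hypothetical names, and the `send` argument stands in for whatever actually ships a command to the droplet (for example something built on droplet_ssh()). The helpers just assemble ordinary `docker pull` and `docker run` command lines.

```r
# Hypothetical sketch: these helpers only build command strings. They are
# assumptions for illustration, not analogsea's actual implementation.

# Build a `docker <subcommand> <args...>` command line from character pieces.
docker_cmd <- function(subcommand, ...) {
  paste(c("docker", subcommand, c(...)), collapse = " ")
}

# Command to fetch an image onto the droplet (image name quoted for the shell).
docker_pull_cmd <- function(img) {
  docker_cmd("pull", shQuote(img, type = "sh"))
}

# Command to start a container; `args` holds flags such as "-p 8787:8787".
docker_run_cmd <- function(img, args = character()) {
  docker_cmd("run", args, shQuote(img, type = "sh"))
}

# Hand a command to `send`, a placeholder for the ssh transport (assumption).
# By default it only prints the command, so nothing is executed remotely.
docklet_exec <- function(cmd, send = function(x) message(x)) {
  send(cmd)
  invisible(cmd)
}

docklet_exec(docker_pull_cmd("cboettig/rstudio"))
docklet_exec(docker_run_cmd("cboettig/rstudio", c("-d", "-p 8787:8787")))
```

Keeping the string building separate from the transport means the commands can be inspected locally before anything is sent to a droplet.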

I agree that it would be useful to have a separate package to work with docker, but I think abstracting over the communication mechanism is going to be tricky. It definitely would be nice to use docker push to cache results, rather than the slower droplet snapshotting mechanism. Ideally you'd be able to call that inside the remote RStudio instance.

It might also be worth thinking a little about how a single droplet could support multiple RStudio projects. Would each project run a separate docker container with a completely separate RStudio? Would you have one container for RStudio, which would then talk to containers for each project?

(PS calling the image hadleyverse is fine ;)

cboettig commented 9 years ago

@hadley cool. Yup, I imagine a thin wrapper as well, just thinking it would be useful in other contexts as well (okay, and then i got carried away thinking of other things I would add to such a docker package, like the Hub API.) Forging ahead in analogsea sounds good; and I might play around with a docker R package at some stage.

Great question on the one-droplet:multiple RStudio instances. I can launch multiple RStudio containers on different ports on the same droplet, but for some reason when I log in to a new instance, it logs me out of the old instance (the instance keeps running, just logs out). Feels like that shouldn't happen -- not sure if that has to do with the browser, RStudio, or Docker, but was hoping someone at RStudio might help me debug that.
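For reference, launching several RStudio containers on different ports of one droplet amounts to issuing one `docker run` per container with a different host port mapped to the container's port 8787. A minimal sketch of generating those commands is below; `rstudio_run_cmds()` is a hypothetical helper, the container names are made up for illustration, and the image name is the one used elsewhere in this thread. It does not address the logout behaviour described above.

```r
# Hypothetical helper (not part of analogsea): build one `docker run` command
# per RStudio container, mapping consecutive host ports to container port 8787.
rstudio_run_cmds <- function(n, img = "cboettig/rstudio", first_port = 8787L) {
  ports <- first_port + seq_len(n) - 1L
  # -d runs detached, -p maps host:container ports, --name labels the container
  sprintf("docker run -d -p %d:8787 --name rstudio-%d %s", ports, ports, img)
}

# Two containers, reachable on host ports 8787 and 8788
writeLines(rstudio_run_cmds(2))
```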

As far as RStudio projects are concerned, it seems to me like a single RStudio instance already does a good job of handling multiple projects(?) I'm not sure what would be gained by separating the projects into different containers. (the only real use case I've been able to exploit in linking containers is connecting a relational database server container to an R container)

hadley commented 9 years ago

@cboettig I think rstudio() would now just be:

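docklet_rstudio <- function(droplet, usr = 'rstudio', pwd = 'rstudio',
                            email = 'rstudio@example.com', img = 'cboettig/rstudio',
                            port = '8787', browse = TRUE, verbose = TRUE) {
  droplet <- as.droplet(droplet)

  # Fetch the image onto the droplet, then start a container from it
  docklet_pull(droplet, img)
  docklet_run(droplet,
    # Map the chosen host port to RStudio Server's port 8787 in the container
    paste0(" -p ", port, ":8787"),
    # Credentials for the RStudio login, passed as environment variables
    paste0(" -e USER=", usr),
    paste0(" -e PASSWORD=", pwd),
    paste0(" -e EMAIL=", email),
    # Leading space keeps the image name separate from the flags
    paste0(" ", img))

  # Address where RStudio Server should be reachable
  url <- sprintf("http://%s:%s/", droplet_ip(droplet), port)
  if (browse) {
    browseURL(url)
  }

  invisible(url)
}
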
cboettig commented 9 years ago

Nice!


hadley commented 9 years ago

@cboettig would it be ok to remove the old rstudio() function?

cboettig commented 9 years ago

certainly! was just a proof of principle, don't think anything is using it. anything you and scott do is fine by me ;-)
