Pre-application enquiry - googleComputeEngineR

MarkEdmondson1234 commented 8 years ago

Hello, I'm in the process of writing an interface to Google Compute Engine, and have been looking at the docker code from https://github.com/sckott/analogsea that led me here.

As analogsea isn't included in ropensci, perhaps that answers my question already, but I have a lot of docker stuff in, or planned for, https://github.com/cloudyr/googleComputeEngineR that I felt would fall under the reproducibility criteria of rOpenSci.

For example, it has VM templates that use cloud-config to pull a docker image and create the VM's initial state. These config files can be set up to pull your own docker image on boot. I want to implement it so those docker images are saved to your private/public Google project with your code + data in its frozen state, which can then be fired up on a bigger machine on GCE or pulled from another application running docker.
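
As a rough illustration of the "pull your own docker image on boot" idea, here is a minimal Python sketch that renders a cloud-config file. It is not the package's actual template: the function name `render_cloud_config`, the container name `app`, and the image path `gcr.io/my-project/my-analysis:frozen` are all hypothetical, and it assumes the VM image already ships with docker installed.

```python
def render_cloud_config(image, container_name="app"):
    """Return cloud-config text that pulls and starts `image` on first boot.

    Hypothetical helper for illustration only; assumes docker is already
    installed on the VM image that consumes this config.
    """
    lines = [
        "#cloud-config",  # header line cloud-init uses to recognise the format
        "runcmd:",        # commands run once, late in the first boot
        f"  - docker pull {image}",
        f"  - docker run -d --name {container_name} {image}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical image path; substitute your own registry and tag.
config = render_cloud_config("gcr.io/my-project/my-analysis:frozen")
print(config)
```

The resulting text would be passed to the VM as its user-data/startup metadata, so the same frozen image can be started on whatever machine size is chosen.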

Longer term I'd also like to implement Kubernetes clusters, which allow reproducibility for clusters of docker images, and Dataproc, which is a Spark cluster API.

My main motivation for applying is to get some serious review of the code and help to prioritise features.

Thanks for reading! Mark

sckott commented 8 years ago

hi @MarkEdmondson1234 - It looks like you're asking if the docker stuff would fit in here? If so: probably. One of our own (Carl Boettiger) co-maintains Rocker https://github.com/rocker-org - so it's stuff we care about.

@wch did start a package consolidating docker stuff in https://github.com/wch/harbor - though it's not on CRAN. Maybe it'd be worth collaborating with him on it. I imagine harbor would want to stick to just docker stuff, so Kubernetes would be a separate package.

MarkEdmondson1234 commented 8 years ago

Thanks @sckott, yes I have had a look through harbor and think I will depend on it if it can get onto CRAN, or at least copy functions from it.

Great though, I'll keep it in mind when finishing up for a first CRAN release that would suit the criteria. One thought though is that it's now part of the cloudyr project - do those clash at all? Perhaps a fork could live in each?

sckott commented 8 years ago

@MarkEdmondson1234 Sorry, I meant any code dealing with docker itself, not google compute engine, is what's of interest here.

MarkEdmondson1234 commented 8 years ago

@sckott Ah ok, no prob. I'll see what may spin out of this then and see if it is applicable, maybe a fork of harbor 👍

sckott commented 8 years ago

Thanks for asking!

ropensci/software-review #77