Closed MarkEdmondson1234 closed 8 years ago
Hi @MarkEdmondson1234 - It looks like you're asking if the docker stuff would fit in here? If so: probably. One of our own (Carl Boettiger) co-maintains Rocker https://github.com/rocker-org - so it's stuff we care about.
@wch did start a package consolidating docker stuff in https://github.com/wch/harbor - though it's not on CRAN. Maybe it'd be worth collaborating with him on it. I imagine harbor would want to stick to just docker stuff, so Kubernetes would be a separate package.
Thanks @sckott, yes I have had a look through harbor and think I will depend on it if it can be put on CRAN, or at least copy functions from it.
Great though, I'll keep it in mind when finishing up for a first CRAN release that would suit the criteria. One thought though is that it's now part of the cloudyr project - do those clash at all? Perhaps a fork could live in each?
@MarkEdmondson1234 Sorry, I meant any code dealing with docker itself, not google compute engine, is what's of interest here.
@sckott Ah ok, no prob. I'll see what may spin out of this then and see if it is applicable, maybe a fork of harbor.
👍
Thanks for asking!
Hello, I'm in the process of writing an interface with Google Cloud Compute, and have been looking at the docker code from https://github.com/sckott/analogsea, which led me here.
As analogsea isn't included in ropensci perhaps that answers my question already, but I have a lot of docker stuff in or planned for https://github.com/cloudyr/googleComputeEngineR that I felt would fall under the reproducibility criteria of rOpenSci. For example, it has VM templates that pull from a docker container using cloud-config to create the initial state. These config files can be set up to pull your own docker image on boot. I want to implement it so those docker containers are saved to your private/public Google project with your code + data in its frozen state, which can be fired up with a bigger machine on GCE or pulled from another application running docker.
Longer term I'd also like to implement Kubernetes clusters, which allow reproducibility for clusters of docker images, and Dataproc, which is a Spark cluster API.
My main motivation for applying is to get some serious review of the code and help to prioritise features.
Thanks for reading! Mark