nuest / ten-simple-rules-dockerfiles

Ten Simple Rules for Writing Dockerfiles for Reproducible Data Science
https://doi.org/10.1371/journal.pcbi.1008316
Creative Commons Attribution 4.0 International

Reuse and reproducibility in other build environments #26

Closed: psychemedia closed this issue 4 years ago

psychemedia commented 4 years ago

https://github.com/nuest/ten-simple-rules-dockerfiles/blob/4a87e3e3ad43feacd98722f1521e500191bb17bb/ten-simple-rules-dockerfiles.Rmd#L70

Dockerfiles provide a formalised, text-based recipe for building a Docker image. One of the features of other build/provisioning systems like Puppet or Ansible (and Vagrant to a lesser extent) is the set of community-contributed packages/modules for performing particular tasks, cf. package maintainers in R or Python. I'm not sure to what extent those communities have guidelines for producing community packages, or how those guidelines are policed.

With Docker, there is Docker Hub, where images are shared so that others can run them, sometimes with the Dockerfile and sometimes without. The ability to share parts of Dockerfiles that install a particular package, or recipes for performing particular tasks, is perhaps not quite so well supported?

vsoch commented 4 years ago

Actually there is an ONBUILD instruction that sort of serves this purpose. It would allow for distributing an image via a registry (as you mention, Docker Hub) that performs a specific task like building. I think it's out of scope for a container technology / registry to explicitly state best practices for a particular language (e.g., Python or R for this group), but it does provide the instructions (ENTRYPOINT, RUN, ONBUILD) that allow the code / software maintainers to do this.
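
To illustrate how ONBUILD could serve this purpose, here is a minimal sketch, not something taken from the paper. The image name `example/r-analysis-onbuild`, the file names `install.R` and `main.R`, and the choice of `rocker/r-ver:4.0.3` as the base are all assumptions made for illustration.

```dockerfile
# Dockerfile of the shared base image, pushed to a registry under the
# hypothetical name example/r-analysis-onbuild
FROM rocker/r-ver:4.0.3

WORKDIR /analysis

# ONBUILD instructions are not executed when this base image is built.
# They are stored as triggers and run right after the FROM line of any
# Dockerfile that builds on this image, using that build's context.
ONBUILD COPY install.R /analysis/install.R
ONBUILD RUN Rscript /analysis/install.R
ONBUILD COPY . /analysis

# Default command for derived images (main.R is a hypothetical file name)
CMD ["Rscript", "main.R"]
```

A project that wants to reuse the recipe would then need only a one-line Dockerfile next to its own `install.R` and `main.R`:

```dockerfile
# The three ONBUILD triggers from the base image fire here
FROM example/r-analysis-onbuild
```

The trade-off is that the derived Dockerfile no longer shows what happens during the build, which may work against transparency for readers who only see the one-line file.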

nuest commented 4 years ago

Puppet and Ansible are out of scope, and so are partial Dockerfiles, IMO.

I think ONBUILD is out of scope too, but I added a sentence on image stacks.
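
For readers of this thread, a rough sketch of what building on an image stack can look like. This is an assumption-laden example, not the sentence added to the paper: the base tag `rocker/r-ver:4.0.3`, the package `here`, and the script name `analysis.R` are picked for illustration only.

```dockerfile
# Start from a community-maintained base image instead of copying
# partial Dockerfile snippets; the tag is pinned for reproducibility
FROM rocker/r-ver:4.0.3

# install2.r is the littler helper script shipped with rocker/r-ver;
# --error makes the build fail if the package cannot be installed
RUN install2.r --error here

WORKDIR /analysis

# analysis.R is a hypothetical script in the build context
COPY analysis.R /analysis/analysis.R

CMD ["Rscript", "analysis.R"]
```

Reuse happens through the `FROM` line here: the community's shared recipe lives in the base image, and the project Dockerfile only adds what is specific to the analysis.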