nuest / ten-simple-rules-dockerfiles

Ten Simple Rules for Writing Dockerfiles for Reproducible Data Science
https://doi.org/10.1371/journal.pcbi.1008316
Creative Commons Attribution 4.0 International

Platforms docker containers are tested on #27

Closed by psychemedia 4 years ago

psychemedia commented 4 years ago

https://github.com/nuest/ten-simple-rules-dockerfiles/blob/4a87e3e3ad43feacd98722f1521e500191bb17bb/ten-simple-rules-dockerfiles.Rmd#L78

Tooling is starting to appear that makes it easier to build things for different platforms. For example, the article "Using multi-arch Docker images to support apps on any architecture" describes buildx, an experimental tool developed by Docker to make it easier to "build, push, pull, and run images seamlessly on different compute architectures".
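A minimal sketch of what a multi-platform build invocation with buildx might look like, written as a small Python helper that only assembles the command line and does not run it. The image tag `example/analysis:latest` and the platform list are assumptions chosen for illustration; the `--platform` and `--tag` flags are those of `docker buildx build`.

```python
import shlex


def buildx_command(image_tag, platforms, context="."):
    """Return the argument list for a multi-platform `docker buildx build`.

    The command is only assembled here, not executed.
    """
    if not platforms:
        raise ValueError("at least one target platform is required")
    return [
        "docker", "buildx", "build",
        # buildx accepts a comma-separated list of target platforms
        "--platform", ",".join(platforms),
        "--tag", image_tag,
        context,
    ]


# Hypothetical image name and target platforms, for illustration only.
cmd = buildx_command("example/analysis:latest", ["linux/amd64", "linux/arm64"])
print(shlex.join(cmd))
```

Keeping the platform list as an explicit argument makes the set of build targets visible in one place, which is separate from the question of which of those targets are actually tested.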

vsoch commented 4 years ago

What I mean is that for testing (CircleCI or TravisCI or GitHub) you usually create a grid to test on. This is an explicit decision to test on Ubuntu 16.04, 18.04, CentOS 7, etc. (Python 2? Python 3?), and the point of the comment above is that the container creator should explicitly state the environments that have been tested. It's great that Docker makes it easier to build on multiple platforms, but that is a different thing from what the developer chooses to actually build and test on.

psychemedia commented 4 years ago

Re: testing on different OS versions: yes, but in the context of me building an environment to do research in, why would I do that? I want to run my code in my environment. Why should I care that it runs on a different version of Linux or Python? Unless maybe I'm building a tool / package / application that other people are going to need to run in their own environment, which may be Python 3.4, 2.5, 3.6, etc.

In which case, should I also be mindful that they might want to run it on an Arm stack? Or something with a particular flavour of GPU?

vsoch commented 4 years ago

If you are an open source developer and maintain software for a community, trust me, you care. Take a look at the testing setups for any major scientific library / container and you'll see a grid that tests, at a minimum, Debian, CentOS, and sometimes even Windows.

psychemedia commented 4 years ago

Yes; agreed. But there are other users who are building their stack for their research. This relates in part to who the intended audience of the paper is. Different audiences are likely to have different interests and different concerns. For the guidelines to be useful, I think they need to be relevant to the intersection of several audiences, but also reflect that particular elements of best practice may be more heavily weighted to the concerns of some audiences than others.

vsoch commented 4 years ago

If the paper is intended for Data Scientists / Researchers, then the title needs to clearly indicate that.

nuest commented 4 years ago

@vsoch Your original comment seems to have disappeared in the latest draft. Which rule do you think would be a good fit for re-adding the idea of documenting the tested platforms?

I think supporting different architectures is out of scope for the article, but we might mention it to make readers aware that they exist. Opinions?

vsoch commented 4 years ago

If the question is to add extra content, at this point I'd say no.

psychemedia commented 4 years ago

What about something like: "Automation strategies exist to build images from common, parameterised specifications across multiple platforms and / or from different component package versions. Such approaches are often used when developing popular software packages for a broad user base operating across a wide range of target platforms and environments."?

vsoch commented 4 years ago

That sounds accurate! I'd just simplify it a little bit:

Automation strategies exist to build and test images for multiple platforms and software versions. Such approaches are often used when developing popular software packages for a broad user base operating across a wide range of target platforms and environments.

nuest commented 4 years ago

I'd say we don't include the suggested paragraph, as it points to possibilities outside of the scope of the article.

@vsoch @psychemedia Feel free to re-open if you find a good spot to put this in.

psychemedia commented 4 years ago

I think something like this could fit around https://github.com/nuest/ten-simple-rules-dockerfiles/blob/470d115ee3cce7e645f5b15077217e7cae23129d/ten-simple-rules-dockerfiles.Rmd#L612 ?