nuest / ten-simple-rules-dockerfiles

Ten Simple Rules for Writing Dockerfiles for Reproducible Data Science
https://doi.org/10.1371/journal.pcbi.1008316
Creative Commons Attribution 4.0 International

comment about rule 3: "Format for clarity" #98

Open sdettmer opened 2 years ago

sdettmer commented 2 years ago

comments about rule 3: "Format for clarity"

I think this is a rule for maintenance, not for reproducibility. In certain cases it could even be better if the files (Dockerfiles) are generated automatically.

vsoch commented 2 years ago

It's important for a future person (who may be yourself) to be able to clearly read and understand the recipe.

sdettmer commented 2 years ago

@vsoch Thank you for your quick reply. Yes, it is, but the Dockerfile might be generated automatically. This is very common in web applications, for example: the resulting code is "minified" and hard to read, but since it is generated automatically, it is not the place to look.

In this case, the recipe is the generator.

I have a lot of Dockerfiles created by scripts, and I think it is important that these scripts are easy to read, especially if there are automated tests proving that the logic they use works.
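To illustrate the generator approach described above, here is a minimal sketch. It is not one of the scripts mentioned in this thread: the function name `render_dockerfile`, the default image name `example/analysis`, and the Debian-style `apt-get` install step are all assumptions made for the example. It shows a readable generator that emits the same Dockerfile text regardless of the order of its package list, and that writes the intended build and run commands as comments at the top of the generated file.

```python
def render_dockerfile(base_image, packages, image_name="example/analysis"):
    """Return Dockerfile text for a base image and a list of apt packages.

    Hypothetical helper for illustration only. It assumes a Debian-based
    base image, because the install step uses apt-get.
    """
    lines = [
        # Header comments record the intended build and run commands.
        f"# Build: docker build -t {image_name} .",
        f"# Run:   docker run --rm {image_name}",
        f"FROM {base_image}",
    ]
    if packages:
        lines.append(
            "RUN apt-get update && apt-get install -y --no-install-recommends \\"
        )
        # Sort the packages so the same set always yields the same text,
        # one package per line to keep the output readable and diffable.
        for pkg in sorted(packages):
            lines.append(f"    {pkg} \\")
        lines.append("    && rm -rf /var/lib/apt/lists/*")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile("debian:12", ["git", "curl"]), end="")
```

A test for such a generator can assert on the generated text directly, for example that two calls with the same packages in a different order return identical output.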

vsoch commented 2 years ago

Yes, but if the Dockerfile is the only artifact found, I hope it is understandable. If the tool doesn’t make it understandable, that just makes life slightly harder for a future reader without context. Is it the end of the world? No. But I also think this rule is directed more toward people who write Dockerfiles by hand than toward automated generators. I tend to write most of mine, for example, and I always add a comment showing (minimally) the intended build and run steps, along with any important details about my choices of what to install in the container.

sdettmer commented 2 years ago

@vsoch Yes, I see. Please understand that I fully appreciate that reproducibility is a burden, expensive, and a thankless job; in many cases reproducibility is not even needed for the results themselves in the first place (only for other requirements, such as being able to validate methods later, forensics, security...). I mean, we see that almost every Dockerfile in open source is not reproducible, and it works well (or well enough). Maybe the rule could state something like "format all source code files for clarity", but again, that is about maintainability. So it is a great rule, but in the wrong document :)

vsoch commented 2 years ago

lol, sure, perhaps you should write a follow-up paper and put all these thoughts down! The paper here is already published, and although not perfect, it's a reasonable guide for someone new to writing Dockerfiles. You can nitpick from here to eternity, and I don't see that as productive (indeed, it has used up quite a bit of my time today and I'm about ready to drop off).

sdettmer commented 2 years ago

@vsoch Yes, same for me; I actually spent time I didn't have. Thank you for your time. I hope others who read this can take it further.