Closed fzeiser closed 4 years ago
We are permissive in that respect. It allows people who know what they are doing (and what the consequences are) to do things outside the norm.
The documentation for using a Dockerfile
with your repo says that "you are on your own" if things break. There are so many things people can do with a Dockerfile
that it is not possible for us to provide support, we only do that when you use repo2docker
's way of generating the docker file. So in conclusion it feels Ok to tell people (who are already going of the beaten track by using a Dockerfile
) that they should use a tag and if they ignore that advice allowing them to do so. (There are lots of subtle things that happen/don't happen if you use :latest
so I'd strongly recommend to not do it even if we let you :D )
Do you think it would be better for forbid people from doing this? Should/would we have to invent an escape hatch for expert-experts? I am not sure what the right tradeoff is here. So far we have taken an approach similar to Python: we are all consenting adults here. So if you want to juggle flaming chainsaws while riding a wild tiger we will advise against it but let you do it.
I'm ok with the general idea, but could I suggest changing the wording in the documentation then?
For a Dockerfile to work on Binder, it must meet the following requirements: [...] It must explicitly specify a tag in the image you source.
To something that it more like: We strongly encourage instead of must. When I read this, I would have expected that my script should throw an error.
The tricky thing is how do we formulate it so that a reader thinks they aren't allowed/able to not set it, yet allow someone like you to figure out that there won't be an error message if you violate this rule?
The problem is there are many, many people who don't want to think about setting the SHA and use something like latest. This will work at the beginning but then when they update their base image, their binder won't be updated. Which confuses people and generates support questions (and usually people are pretty frustrated when they post those). So overall I feel Ok with telling people a little not-quite-true story.
The other solution could be that we actually enforce this rule.
This will work at the beginning but then when they update their base image, their binder won't be updated.
Just to make sure that I understand this corrently. When they update the base image and one updates the SHA in the "child" repo, the binder will be updated, right? There is no strange cache thing going on, I assume.
I prefer the documentation not to tell me things that are wrong. So I'd suggest: If you don't set a tag, you image may fail to create reproducible results. Because from what I understand, that's what happens/the problem, right? If you don't set a tag, the base image might change.
The other option is, as you say, enforcing the rule, with a sensible error message. But then you might run into those people who you described ("...but then when they update their base image, their binder won't be updated..."); for me this seems less confusing than having a wrong documentation, but seemingly other people find it more confusing.
The close was by accident
I have a question that is somewhat related, not sure whether I should open a new issue for it:
The jupyter-docker-stacks docs state that:
You must refer to git-SHA image tags when stability and reproducibility are important in your work. (e.g. FROM jupyter/scipy-notebook:7c45ec67c8e7, docker run -it --rm jupyter/scipy-notebook:7c45ec67c8e7). You should only use latest when a one-off container instance is acceptable (e.g., you want to briefly try a new library in a notebook).
However, when I refer to jupyter/scipy-notebook:7c45ec67c8e7
, this uses minimal-notebook
as a base image (ARG BASE_CONTAINER=jupyter/minimal-notebook
). Will the version pinning of scipy-notebook
pin the versions of all bases as well? I'd assume so, asking to be sure.
So I assume that when I use jupyter/scipy-notebook:7c45ec67c8e7
as a base, that will work -- but not if one downloads the Dockerfile from that github commit and tries to build the image again. Because the Dockerfile does not specify the BASE's SHA.
For support questions it is better to open a thread on the forum. we try and keep the GitHub issues to technical discussions about changing the contents of the repository. The forum is more accessible and better indexed by search engines. so overall better to have these discussions there.
This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:
Bug description
MyBinder documentation says that Docker image require a specific SHA tag. However, I realized that my container runs even without a tag
https://github.com/jupyterhub/binder/blob/ddd801f994c1f1a3828fb991e5f847f895a3c724/doc/tutorials/dockerfile.rst#L92-L117
Expected behaviour
There should be some error message I assume, telling me that I have to provide a SHA tag, potentially linking to the documentation.
Actual behaviour
The repo runs perfectly without, even if I just have this line:
FROM jupyter/minimal-notebook:latest
https://github.com/oslocyclotronlab/ompy/blob/d4f03b64d653d0506a0c88a68df86311a2efbc87/Dockerfile#L7How to reproduce
Binder link: https://mybinder.org/v2/gh/oslocyclotronlab/ompy/master?filepath=ompy%2Fnotebooks%2Fgetting_started.ipynb
from this repo: https://github.com/oslocyclotronlab/ompy/tree/d4f03b64d653d0506a0c88a68df86311a2efbc87