jupyterhub / mybinder.org-user-guide

Turn a Git repo into a collection of interactive notebooks. This is Binder's user documentation repository.
https://mybinder.readthedocs.io
BSD 3-Clause "New" or "Revised" License
151 stars 103 forks source link

MyBinder does not require SHA tag of Docker image #194

Closed fzeiser closed 4 years ago

fzeiser commented 4 years ago

Bug description

MyBinder documentation says that Docker image require a specific SHA tag. However, I realized that my container runs even without a tag

https://github.com/jupyterhub/binder/blob/ddd801f994c1f1a3828fb991e5f847f895a3c724/doc/tutorials/dockerfile.rst#L92-L117

Expected behaviour

There should be some error message I assume, telling me that I have to provide a SHA tag, potentially linking to the documentation.

Actual behaviour

The repo runs perfectly without, even if I just have this line: FROM jupyter/minimal-notebook:latest https://github.com/oslocyclotronlab/ompy/blob/d4f03b64d653d0506a0c88a68df86311a2efbc87/Dockerfile#L7

How to reproduce

Binder link: https://mybinder.org/v2/gh/oslocyclotronlab/ompy/master?filepath=ompy%2Fnotebooks%2Fgetting_started.ipynb

from this repo: https://github.com/oslocyclotronlab/ompy/tree/d4f03b64d653d0506a0c88a68df86311a2efbc87

betatim commented 4 years ago

We are permissive in that respect. It allows people who know what they are doing (and what the consequences are) to do things outside the norm.

The documentation for using a Dockerfile with your repo says that "you are on your own" if things break. There are so many things people can do with a Dockerfile that it is not possible for us to provide support, we only do that when you use repo2docker's way of generating the docker file. So in conclusion it feels Ok to tell people (who are already going of the beaten track by using a Dockerfile) that they should use a tag and if they ignore that advice allowing them to do so. (There are lots of subtle things that happen/don't happen if you use :latest so I'd strongly recommend to not do it even if we let you :D )

Do you think it would be better for forbid people from doing this? Should/would we have to invent an escape hatch for expert-experts? I am not sure what the right tradeoff is here. So far we have taken an approach similar to Python: we are all consenting adults here. So if you want to juggle flaming chainsaws while riding a wild tiger we will advise against it but let you do it.

fzeiser commented 4 years ago

I'm ok with the general idea, but could I suggest changing the wording in the documentation then?

For a Dockerfile to work on Binder, it must meet the following requirements: [...] It must explicitly specify a tag in the image you source.

To something that it more like: We strongly encourage instead of must. When I read this, I would have expected that my script should throw an error.

betatim commented 4 years ago

The tricky thing is how do we formulate it so that a reader thinks they aren't allowed/able to not set it, yet allow someone like you to figure out that there won't be an error message if you violate this rule?

The problem is there are many, many people who don't want to think about setting the SHA and use something like latest. This will work at the beginning but then when they update their base image, their binder won't be updated. Which confuses people and generates support questions (and usually people are pretty frustrated when they post those). So overall I feel Ok with telling people a little not-quite-true story.

The other solution could be that we actually enforce this rule.

fzeiser commented 4 years ago

This will work at the beginning but then when they update their base image, their binder won't be updated.

Just to make sure that I understand this corrently. When they update the base image and one updates the SHA in the "child" repo, the binder will be updated, right? There is no strange cache thing going on, I assume.

I prefer the documentation not to tell me things that are wrong. So I'd suggest: If you don't set a tag, you image may fail to create reproducible results. Because from what I understand, that's what happens/the problem, right? If you don't set a tag, the base image might change.

The other option is, as you say, enforcing the rule, with a sensible error message. But then you might run into those people who you described ("...but then when they update their base image, their binder won't be updated..."); for me this seems less confusing than having a wrong documentation, but seemingly other people find it more confusing.

fzeiser commented 4 years ago

The close was by accident

fzeiser commented 4 years ago

I have a question that is somewhat related, not sure whether I should open a new issue for it:

The jupyter-docker-stacks docs state that:

You must refer to git-SHA image tags when stability and reproducibility are important in your work. (e.g. FROM jupyter/scipy-notebook:7c45ec67c8e7, docker run -it --rm jupyter/scipy-notebook:7c45ec67c8e7). You should only use latest when a one-off container instance is acceptable (e.g., you want to briefly try a new library in a notebook).

However, when I refer to jupyter/scipy-notebook:7c45ec67c8e7, this uses minimal-notebook as a base image (ARG BASE_CONTAINER=jupyter/minimal-notebook). Will the version pinning of scipy-notebook pin the versions of all bases as well? I'd assume so, asking to be sure. So I assume that when I use jupyter/scipy-notebook:7c45ec67c8e7 as a base, that will work -- but not if one downloads the Dockerfile from that github commit and tries to build the image again. Because the Dockerfile does not specify the BASE's SHA.

betatim commented 4 years ago

For support questions it is better to open a thread on the forum. we try and keep the GitHub issues to technical discussions about changing the contents of the repository. The forum is more accessible and better indexed by search engines. so overall better to have these discussions there.

meeseeksmachine commented 4 years ago

This issue has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/how-does-version-pinning-work-for-jupyter-docker-stacks-images/4075/1