jupyterhub / binderhub

Run your code in the cloud, with technology so advanced, it feels like magic!
https://binderhub.readthedocs.io
BSD 3-Clause "New" or "Revised" License
2.54k stars 388 forks source link

Handle possible collision between branch name and commit hash #299

Open Carreau opened 6 years ago

Carreau commented 6 years ago

Let's say I visit:

https://mybinder.org/v2/gh/wildtreetech/explore-open-data/cabe28b

I'll get the image for cabe28b but I can visit

https://mybinder.org/v2/gh/wildtreetech/explore-open-data/master

And get the latest build.

What if I have a cabe28b branch ? Or tag ? The behavior is likely undefined, or at least non obvious. Having a commit that happen to be an already existing branch name is improbable, and naming a branch after a commit is likely an extrem behavior. I still want to point that out for potential v3 improvement ?

minrk commented 6 years ago

This doesn't seem to be possible (on GitHub, at least). While I can create a local branch with a hash (after ignoring a bunch of warnings), GitHub won't let me push it:

~/dev/mine/junk (b8949e38ab36d82981e1332adc56689810675db3) $ git push
warning: refname 'b8949e38ab36d82981e1332adc56689810675db3' is ambiguous.
Git normally never creates a ref that ends with 40 hex characters
because it will be ignored when you just specify 40-hex. These refs
may be created by mistake. For example,

  git checkout -b $br $(git rev-parse ...)

where "$br" is somehow empty and a 40-hex ref is created. Please
examine these refs and maybe delete them. Turn this message off by
running "git config advice.objectNameWarning false"
Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 420 bytes | 420.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
remote: error: GH002: Sorry, branch or tag names consisting of 40 hex characters are not allowed.
remote: error: Invalid branch or tag name "b8949e38ab36d82981e1332adc56689810675db3"
To github.com:minrk/junk.git
 ! [remote rejected] b8949e38ab36d82981e1332adc56689810675db3 -> b8949e38ab36d82981e1332adc56689810675db3 (pre-receive hook declined)
error: failed to push some refs to 'git@github.com:minrk/junk.git'

I don't think it's a problem to have it be unspecified whether a ref is a commit/tag/branch. Git already has this behavior, and we're reflecting that. If we deviate from git, then we have to start requiring people to specify whether the ref is a branch/tag/commit and then additionally validate that this is what we got, which git itself doesn't do because they all work interchangeably.

But if we do discover a case where this might be an issue, adding the ref type (e.g. tag/1.2.3 or branch/master) to the v3 spec may be warranted. GitHub's own URLs have never needed this, so I would be surprised if ours will.

Carreau commented 6 years ago

Fair enough, should we then enforce that commit ref in binder urls are 40 char long ? There might also be a conflict tag/branchnames. As most of the URLs are autogenerated by entering informations in the MyBinder Form, I would be inclined to be explicit, potentially having a .../auto/<tag|hash|branch> that would redirect.

I do not thing this is really an issue for now, but can see how it could be a use case, like if any of these were to allow slashes in branchnames.

minrk commented 6 years ago

Resolving abbreviated commits so that if you type abc123, the copy-able link is actually for abc123456.... seems like a good idea.

mariusvniekerk commented 6 years ago

i think at least for the github / gitlab repoproviders they can already resolve short hashes