jupyterhub / nbgitpuller

Jupyter server extension to sync a git repository one-way to a local path
https://nbgitpuller.readthedocs.io
BSD 3-Clause "New" or "Revised" License

ValueError: Problem accessing HEAD branch #206

Open fmaussion opened 3 years ago

fmaussion commented 3 years ago

Hi! After updating my images today, nbgitpuller started to throw this error when starting from a URL that worked previously.

Here is the URL (generated with the generator) and the error message:

https://mybinder.org/v2/gh/OGGM/binder/stable?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252FOGGM%252Ftutorials%26urlpath%3Dlab%252Ftree%252Ftutorials%252F%26branch%3Dstable
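The link above is percent-encoded twice, which makes it hard to read. As a reading aid, here is a minimal sketch that unpacks it with Python's standard library. The helper name `decode_nbgitpuller_link` is made up for this illustration and is not part of nbgitpuller.

```python
from urllib.parse import parse_qs, urlsplit

LINK = (
    "https://mybinder.org/v2/gh/OGGM/binder/stable?urlpath=git-pull"
    "%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252FOGGM%252Ftutorials"
    "%26urlpath%3Dlab%252Ftree%252Ftutorials%252F%26branch%3Dstable"
)


def decode_nbgitpuller_link(link):
    """Unpack the nested query string of a Binder + nbgitpuller link.

    Hypothetical helper for illustration only.
    """
    # Outer layer: Binder's own query string carries a single `urlpath`.
    outer = parse_qs(urlsplit(link).query)
    # Inner layer: "git-pull?repo=...&urlpath=...&branch=..."
    handler, _, inner_query = outer["urlpath"][0].partition("?")
    inner = parse_qs(inner_query)
    decoded = {"handler": handler}
    decoded.update({key: values[0] for key, values in inner.items()})
    return decoded


params = decode_nbgitpuller_link(LINK)
for key, value in params.items():
    print(f"{key}: {value}")
```

Decoded, the link asks the `git-pull` handler to pull `https://github.com/OGGM/tutorials` on the `stable` branch and then open `lab/tree/tutorials/`.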

Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/nbgitpuller/pull.py", line 120, in resolve_default_branch
    head_branch = subprocess.run(
  File "/srv/conda/envs/notebook/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['git', 'ls-remote', '--symref', '--', 'https://github.com/OGGM/tutorials', 'HEAD']' returned non-zero exit status 128.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/nbgitpuller/handlers.py", line 76, in get
    gp = GitPuller(repo, repo_dir, branch=branch, depth=depth, parent=self.settings['nbapp'])
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/nbgitpuller/pull.py", line 76, in __init__
    self.branch_name = self.resolve_default_branch()
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/nbgitpuller/pull.py", line 136, in resolve_default_branch
    raise ValueError(m)
ValueError: Problem accessing HEAD branch: https://github.com/OGGM/tutorials

fmaussion commented 3 years ago

After looking into this a bit more, it turns out that the machines we now receive from mybinder ask for a password when one runs the command:

git ls-remote --symref -- https://github.com/OGGM/tutorials HEAD

That is not the case when I run the command locally. Is this a change on the GitHub side, or a git configuration issue?
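For anyone reproducing this from a script rather than a terminal, the sketch below runs the same command with `GIT_TERMINAL_PROMPT=0`, so git fails instead of waiting for a password, and then reads the default branch from the `--symref` output. It is a rough illustration, not nbgitpuller's actual implementation: the function names are hypothetical and the sample output uses a made-up commit hash.

```python
import os
import subprocess


def parse_symref_head(output):
    """Return the branch HEAD points to, or None if no symref line is present.

    Expects `git ls-remote --symref <url> HEAD` output, whose first line
    looks like "ref: refs/heads/<branch><TAB>HEAD".
    """
    prefix = "ref: refs/heads/"
    for line in output.splitlines():
        ref, _, name = line.partition("\t")
        if name == "HEAD" and ref.startswith(prefix):
            return ref[len(prefix):]
    return None


def default_branch(repo_url, cwd=None):
    """Ask the remote for its default branch without ever prompting.

    Hypothetical helper; raises CalledProcessError if git exits non-zero,
    for example when the remote rejects the credentials git sends.
    """
    env = dict(os.environ, GIT_TERMINAL_PROMPT="0")  # fail instead of prompting
    result = subprocess.run(
        ["git", "ls-remote", "--symref", "--", repo_url, "HEAD"],
        capture_output=True, text=True, check=True, env=env, cwd=cwd,
    )
    return parse_symref_head(result.stdout)


# Sample output with a made-up commit hash, to exercise the parser offline.
SAMPLE = (
    "ref: refs/heads/master\tHEAD\n"
    "0123456789abcdef0123456789abcdef01234567\tHEAD\n"
)
print(parse_symref_head(SAMPLE))
```

Calling `default_branch("https://github.com/OGGM/tutorials")` needs network access and a `git` binary, which is why the example only exercises the parser.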

consideRatio commented 3 years ago

Does it make a difference if you do...

git ls-remote --symref https://github.com/OGGM/tutorials HEAD

instead?

I assume the upgrade took you from 0.10.1 to 0.10.2. The only difference between them is https://github.com/jupyterhub/nbgitpuller/commit/f25d3f2685035c11bd668d48e71caf4fc245ba68, which concerns the -- in the middle of the command and was added for security reasons.

fmaussion commented 3 years ago

Thanks @consideRatio

No, it also asks for a password. But now I wonder whether the problem is related not to an update of nbgitpuller, but rather to an update of the git config or the Linux images used by mybinder?

fmaussion commented 3 years ago

To reproduce, open a terminal in https://mybinder.org/v2/gh/OGGM/binder/stable and run the above commands. Here is a screen recording:

https://user-images.githubusercontent.com/10050469/131307279-9e5425e8-f55e-4a4b-8c75-7e54db21763a.mp4

consideRatio commented 3 years ago

@fmaussion ah, it seems you were asked for credentials either way. That is good, because it means the change about -- is unrelated, as I hoped it was: we can't just revert that security fix.

I observe that I can also run git ls-remote --symref -- https://github.com/OGGM/tutorials HEAD locally, with git configured with me as an author etc. I'm not sure at all why this happens yet, but I think you have pinpointed a clear, reproducible situation to look into! Nice work!

consideRatio commented 3 years ago

After updating my images today,

Do you know what version of nbgitpuller you had before?

fmaussion commented 3 years ago

Do you know what version of nbgitpuller you had before?

I was using a tagged image from 2021 05 16

You can still open it on binder with:

https://mybinder.org/v2/gh/OGGM/binder/5959f23f1cdcff95384bbe0897b8290150969b2f

In this image, git ls-remote --symref -- https://github.com/OGGM/tutorials HEAD returns properly (i.e. this has nothing to do with nbgitpuller).

TimoRoth commented 3 years ago

Immediate observation from my side: this only happens if you're currently in ~, which is a git repo in that new image. If I cd to / before running the ls-remote command (or rm -rf .git), it runs just fine without prompting for a password.

~ is also a git repo in the old image, but for some reason that does not bother git (which is the same version in both images) there.

fmaussion commented 3 years ago

Thanks @TimoRoth - an immediate fix for us would be to delete ~/.git in repo2docker's PostBuild script, but maybe this is something nbgitpuller needs to address still?
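As a rough sketch of that workaround: repo2docker's postBuild is an executable script, so it could be written in Python with a suitable shebang line. The helper name `remove_git_dir` is hypothetical, and the demo below deliberately runs against a temporary directory rather than the real home directory.

```python
#!/usr/bin/env python3
import shutil
import tempfile
from pathlib import Path


def remove_git_dir(root):
    """Delete `<root>/.git` if it exists; return True if something was removed."""
    git_dir = Path(root) / ".git"
    if git_dir.is_dir():
        shutil.rmtree(git_dir)
        return True
    return False


# Demo on a throwaway directory. In an actual postBuild script the call
# would presumably be remove_git_dir(Path.home()) instead.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / ".git").mkdir()
    (Path(tmp) / ".git" / "config").write_text("[core]\n")
    print(remove_git_dir(tmp))   # first call removes the directory
    print(remove_git_dir(tmp))   # second call finds nothing to remove
```

Removing the whole directory also discards the image repository's history inside the container, which seems acceptable here because notebook content arrives separately through nbgitpuller.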

TimoRoth commented 3 years ago

Ok, found what's causing it: The new image has

[http "https://github.com/"]
        extraheader = AUTHORIZATION: basic <censored token>

in .git/config. So while inside that git repo, git passes that header along with every request to github.com. The token is apparently invalid, so GitHub asks for valid credentials, and git presents a login prompt.
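To check whether an image carries such a header, one option is to scan the repository's config file for `extraheader` entries under `http` sections. The sketch below is a simplified line-based scan, not a full git config parser, and `find_extraheaders` is a hypothetical name. It reports only the URL each header applies to, never the token itself.

```python
import re

# Matches section headers such as [core] or [http "https://github.com/"].
SECTION = re.compile(r'^\s*\[\s*([A-Za-z0-9.-]+)(?:\s+"((?:[^"\\]|\\.)*)")?\s*\]')


def find_extraheaders(config_text):
    """List the URLs of http sections that set `extraheader`.

    An entry of None means the header is set in a plain [http] section
    and therefore applies to every remote.
    """
    hits = []
    section = subsection = None
    for line in config_text.splitlines():
        match = SECTION.match(line)
        if match:
            section, subsection = match.group(1).lower(), match.group(2)
            continue
        key, sep, _value = line.partition("=")
        if sep and section == "http" and key.strip().lower() == "extraheader":
            hits.append(subsection)
    return hits


SAMPLE_CONFIG = """\
[core]
        repositoryformatversion = 0
[http "https://github.com/"]
        extraheader = AUTHORIZATION: basic <censored token>
"""
print(find_extraheaders(SAMPLE_CONFIG))
```

With git available, `git config --local --get-regexp extraheader` run inside the repository should give a similar answer, though it prints the header value as well.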

fmaussion commented 3 years ago

Thanks - excellent. Is this something repo2docker should fix? Ping @yuvipanda just in case.

consideRatio commented 3 years ago

@TimoRoth wait, which image has that, and why? These are potentially sensitive credentials that shouldn't be embedded in an image, whether by binderhub, mybinder.org, or an end user letting a repo be built with mybinder.org.

manics commented 3 years ago

~Maybe it's due to the switch to building the Dockerfile ourselves?~ ~https://github.com/jupyterhub/repo2docker/blob/6e2a6af959a366ea9cd5e268450122d0f7064afd/Dockerfile#L7~

Edit: Ignore this, it's the container used for running the build not the output. In any case find / -name .git returns nothing.

fmaussion commented 3 years ago

@consideRatio for the context:

I don't know if this is relevant, but we actually use CI to build the images in https://github.com/OGGM/r2d, and then pull them from dockerhub in https://github.com/OGGM/binder

manics commented 3 years ago

@fmaussion If you pass https://github.com/OGGM/r2d to mybinder.org (instead of https://github.com/OGGM/binder) and set nbgitpuller to pull do you still have a problem?

fmaussion commented 3 years ago

@manics no I don't - this therefore has to do with the tokens being for the r2d repository and then the use of the dockerfile trick in the reproducible repository: https://github.com/OGGM/binder/blob/master/binder/Dockerfile

I'm not sure anymore if this is something you need to address, as we are misusing repo2docker a little bit here (note that this used to work a few months ago still).

Context for our reasons of having two repositories: https://discourse.jupyter.org/t/reproducible-binder-environments-with-repo2docker-dockerhub-and-nbgitpuller

manics commented 3 years ago

I understand what you're doing, it's a good solution and it's not misusing repo2docker. In case you weren't aware there's a GitHub action that supports your pre-build use-case: https://github.com/jupyterhub/repo2docker-action

I asked about passing https://github.com/OGGM/r2d to mybinder.org to help narrow down the problem as there are several components (mybinder, repo2docker, nbgitpuller, your notebook repo, your docker environment repo, your CI build).

note that this used to work a few months ago still

Your GitHub workflow was only added 2 days ago (https://github.com/OGGM/r2d/commits/master/.github), how were you building the image before? If you go back to the old process does everything work again?

fmaussion commented 3 years ago

Your GitHub workflow was only added 2 days ago

Oh right.

how were you building the image before?

With Travis-CI! @TimoRoth can comment more on the rest. I didn't think that this change would make a difference :see_no_evil:

fmaussion commented 3 years ago

I am making changes in the repos to delete the .git folder in postBuild.

For future reference, here are the permanent mybinder builds to check what happened: