Open erl-hpe opened 1 week ago
It's hard for me to reproduce this, since it's a private repository. It is probably a regression from the new v2 support.
Can you try passing protocol_version=0 to clone and seeing if that improves things?
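For anyone following along, that suggestion might look like the sketch below. It assumes a dulwich release whose porcelain clone() accepts a protocol_version argument, as the releases with v2 support discussed here do. The helper name clone_and_list_branches and its clone_fn parameter are made up for illustration; clone_fn only exists so the helper can be exercised without network access.

```python
from typing import Callable, List, Optional


def clone_and_list_branches(
    url: str,
    target: str,
    protocol_version: Optional[int] = None,
    clone_fn: Optional[Callable] = None,
) -> List[bytes]:
    """Clone url into target and return the local branch names.

    clone_fn is a hypothetical injection point; by default the real
    dulwich.porcelain.clone is used.
    """
    if clone_fn is None:
        # Imported lazily so this module loads even without dulwich installed.
        from dulwich.porcelain import clone as clone_fn

    kwargs = {}
    if protocol_version is not None:
        # protocol_version=0 asks for the pre-v2 protocol.
        kwargs["protocol_version"] = protocol_version
    repo = clone_fn(url, target, **kwargs)
    # Names relative to refs/heads, e.g. b"main".
    return sorted(repo.refs.keys(base=b"refs/heads"))
```

Calling clone_and_list_branches(url, "foo-v0", protocol_version=0) and then clone_and_list_branches(url, "foo-default") with the same url gives two branch lists that can be compared directly.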
That's odd. It is listed as a public repository.
Ah, I've managed now - not sure what went wrong last time, unrelated issue with my SSH key.
Cool. FYI - I was able to replicate it from another GitHub account using Ubuntu as well as from my MacBook.
That said, I have trouble reproducing the bug:
% PYTHONPATH=. python3
Python 3.12.7 (main, Oct 3 2024, 15:15:22) [GCC 14.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from dulwich import __version__
>>> __version__
(0, 22, 3)
>>> from dulwich.porcelain import clone
>>> url='git@github.com:Cray-HPE/vtds-configs.git'
>>> target='foo'
>>> clone(url, target)
Enumerating objects: 207, done.
Counting objects: 100% (207/207), done.
Compressing objects: 100% (78/78), done.
Total 207 (delta 94), reused 204 (delta 93), pack-reused 0 (from 0)
copied 206 pack entries06/207
<Repo at 'foo'>x: 189/207
>>> rating index: 206/207
% cd foo
../foo% git branch -l
* main
Huh. That's weird. I get the same behavior I described from two different systems using two different GitHub users. Setting protocol_version=0 does "fix" the behavior, though.
What is funny about the output you pasted is that I see:
Total 207 (delta 94), reused 204 (delta 93), pack-reused 0 (from 0)
copied 206 pack entries06/207
when it is going to fail and
Total 248 (delta 109), reused 247 (delta 109), pack-reused 0 (from 0)
copied 247 pack entries47/248
when it is going to work, but you saw it work with the former output.
I have seen it with Python 3.12.3 and Python 3.11.6, if that helps at all. I haven't tried 3.12.7.
Is there anything else you would like me to look at, since I can get it to happen?
I tried it from a cloned version of dulwich too (checking out the dulwich-0.22.3 tag and doing a pip install . from there) and got the same behavior.
I doubt this is significant, but on both systems I am working in a venv, in case that sheds any light.
It's probably not relevant, but do you have the rust extensions compiled?
The other thing you could try is see if things still fail at commit 436ce9f94137e46d527d8260fe79242e1c82e00b, which was just before the v2 improvements landed.
I don't believe I have the Rust extensions. If I do, it would be because of something I don't know about. I hate to say "no" absolutely without checking. Any idea how I would check that? I will check 436ce9f for you.
Not seeing the problem at 436ce9f.
If there are files starting with _ in your dulwich package directory, you've got the Rust extensions.
A big help would be if you could bisect to find the problematic commit, since I don't have a way to reproduce this. There shouldn't be too many in between 0.22.1 and 436ce9f.
Except for the normal Python files (__init__.py, __pycache__ and the like) and some C files and their related .so files, I don't see anything interesting starting with _, so I think: no Rust extensions.
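The directory check can be scripted. The sketch below only lists candidate files; as noted, C sources and their .so files also start with _, so the output still needs a human to read it. The function name underscore_files is made up for illustration.

```python
import os
from typing import List


def underscore_files(package_dir: str) -> List[str]:
    """Return entries in package_dir that start with a single underscore.

    Dunder entries such as __init__.py and __pycache__ are skipped, since
    every Python package has them.
    """
    found = []
    for name in sorted(os.listdir(package_dir)):
        if name.startswith("_") and not name.startswith("__"):
            found.append(name)
    return found
```

To point it at the installed package, something like os.path.dirname(dulwich.__file__) after import dulwich should give the directory to pass in.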
I will try to track down the specific commit later this morning.
Looks like it broke between 325de7d3 and 2400933b. I am thinking that there is something not working in the lower level implementation of the v2 protocol because the diffs between those commits are not very interesting in themselves.
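Narrowing a regression down to a pair of commits like this is a binary search over the history. A minimal sketch of the idea follows; it is not how git bisect is implemented, and the commit labels and the is_bad callback are placeholders. In practice is_bad would check out the commit, reinstall dulwich and run the clone test.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad is true.

    commits must be ordered oldest to newest, with commits[0] known good and
    commits[-1] known bad, and the history must flip from good to bad once.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # regression is at mid or earlier
        else:
            good = mid  # regression is after mid
    return commits[bad]
```

Each step halves the remaining range, so the number of builds to test grows only with the logarithm of the number of commits.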
I am curious about why you can't reproduce it. I have managed to get it reliably with two very different GitHub users on two very different systems. I am going to see if someone else who is not me (physically) can reproduce it. I will report back shortly.
So, just to see whether my GitHub repo is somehow corrupt or weird, I turned on packet tracing and version 2 protocol with my local git clone command. I don't see any references to master in there:
Cloning into 'foo'...
13:25:50.768043 pkt-line.c:86 packet: clone< version 2
13:25:50.769334 pkt-line.c:86 packet: clone< agent=git/github-d37f7b990c25
13:25:50.769375 pkt-line.c:86 packet: clone< ls-refs=unborn
13:25:50.769397 pkt-line.c:86 packet: clone< fetch=shallow wait-for-done filter
13:25:50.769418 pkt-line.c:86 packet: clone< server-option
13:25:50.769437 pkt-line.c:86 packet: clone< object-format=sha1
13:25:50.769454 pkt-line.c:86 packet: clone< 0000
13:25:50.769472 pkt-line.c:86 packet: clone> command=ls-refs
13:25:50.769506 pkt-line.c:86 packet: clone> agent=git/2.46.1
13:25:50.769529 pkt-line.c:86 packet: clone> object-format=sha1
13:25:50.769546 pkt-line.c:86 packet: clone> 0001
13:25:50.769564 pkt-line.c:86 packet: clone> peel
13:25:50.769581 pkt-line.c:86 packet: clone> symrefs
13:25:50.769599 pkt-line.c:86 packet: clone> unborn
13:25:50.769617 pkt-line.c:86 packet: clone> ref-prefix HEAD
13:25:50.769636 pkt-line.c:86 packet: clone> ref-prefix refs/heads/
13:25:50.769657 pkt-line.c:86 packet: clone> ref-prefix refs/tags/
13:25:50.769675 pkt-line.c:86 packet: clone> 0000
13:25:50.924390 pkt-line.c:86 packet: clone< 7e26270b75cadd0d38b17847af98c275858c5800 HEAD symref-target:refs/heads/main
13:25:50.924506 pkt-line.c:86 packet: clone< 9f42e0005c18e0de8de9aa72248252b74c693de4 refs/heads/VSHA-599
13:25:50.924557 pkt-line.c:86 packet: clone< 9f42e0005c18e0de8de9aa72248252b74c693de4 refs/heads/VSHA-622
13:25:50.924586 pkt-line.c:86 packet: clone< 848b1c382e1866ff09994156ebbd5ad117a987b0 refs/heads/another-test-branch
13:25:50.924613 pkt-line.c:86 packet: clone< 7e26270b75cadd0d38b17847af98c275858c5800 refs/heads/main
13:25:50.924639 pkt-line.c:86 packet: clone< ec8c69499125b56de957619a651383e7982acd71 refs/heads/test-branch
13:25:50.924666 pkt-line.c:86 packet: clone< 716b5f7702e0846286a3c45bbbb342cc7b6cfa88 refs/tags/0.0.0 peeled:5b168b4d6421f3d8b248753900b516441f708b9c
13:25:50.924701 pkt-line.c:86 packet: clone< 5b168b4d6421f3d8b248753900b516441f708b9c refs/tags/0.0.1
13:25:50.924721 pkt-line.c:86 packet: clone< 0000
want 7e26270b75cadd0d38b17847af98c275858c5800 (HEAD)
want 9f42e0005c18e0de8de9aa72248252b74c693de4 (refs/heads/VSHA-599)
want 9f42e0005c18e0de8de9aa72248252b74c693de4 (refs/heads/VSHA-622)
want 848b1c382e1866ff09994156ebbd5ad117a987b0 (refs/heads/another-test-branch)
want 7e26270b75cadd0d38b17847af98c275858c5800 (refs/heads/main)
want ec8c69499125b56de957619a651383e7982acd71 (refs/heads/test-branch)
want 716b5f7702e0846286a3c45bbbb342cc7b6cfa88 (refs/tags/0.0.0)
want 5b168b4d6421f3d8b248753900b516441f708b9c (refs/tags/0.0.1)
13:25:50.933744 pkt-line.c:86 packet: clone> command=fetch
13:25:50.933774 pkt-line.c:86 packet: clone> agent=git/2.46.1
13:25:50.933784 pkt-line.c:86 packet: clone> object-format=sha1
13:25:50.933792 pkt-line.c:86 packet: clone> 0001
13:25:50.933801 pkt-line.c:86 packet: clone> thin-pack
13:25:50.933808 pkt-line.c:86 packet: clone> no-progress
13:25:50.933816 pkt-line.c:86 packet: clone> ofs-delta
13:25:50.933830 pkt-line.c:86 packet: clone> want 7e26270b75cadd0d38b17847af98c275858c5800
13:25:50.933851 pkt-line.c:86 packet: clone> want 9f42e0005c18e0de8de9aa72248252b74c693de4
13:25:50.933859 pkt-line.c:86 packet: clone> want 9f42e0005c18e0de8de9aa72248252b74c693de4
13:25:50.933870 pkt-line.c:86 packet: clone> want 848b1c382e1866ff09994156ebbd5ad117a987b0
13:25:50.933879 pkt-line.c:86 packet: clone> want 7e26270b75cadd0d38b17847af98c275858c5800
13:25:50.933913 pkt-line.c:86 packet: clone> want ec8c69499125b56de957619a651383e7982acd71
13:25:50.933925 pkt-line.c:86 packet: clone> want 716b5f7702e0846286a3c45bbbb342cc7b6cfa88
13:25:50.933933 pkt-line.c:86 packet: clone> want 5b168b4d6421f3d8b248753900b516441f708b9c
13:25:50.933945 pkt-line.c:86 packet: clone> done
13:25:50.933952 pkt-line.c:86 packet: clone> 0000
13:25:51.119064 pkt-line.c:86 packet: clone< packfile
13:25:51.258701 pkt-line.c:86 packet: sideband< PACK ...
13:25:51.258911 pkt-line.c:86 packet: sideband< 0000
and the command retrieved the repo the way I was expecting it to (i.e. with a main branch, not a master branch). I don't see a way to turn on packet tracing with the porcelain clone operation, or I would do that and see what I get for comparison.
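To make the trace easier to compare against what a clone ends up with, the ref advertisement can be pulled out of it mechanically. The sketch below assumes the line layout shown in the trace above (a "packet:" marker, a "clone<" direction tag, a 40-hex SHA, a ref name, then optional attributes); the function name parse_ls_refs_trace is made up for illustration.

```python
import re
from typing import Dict, Iterable, Tuple

# Matches server-to-client ref lines such as
# "... packet: clone< <40-hex sha> HEAD symref-target:refs/heads/main".
# Client-to-server lines ("clone>") and non-ref lines do not match.
_REF_LINE = re.compile(r"packet:\s+\S+<\s+([0-9a-f]{40})\s+(\S+)(.*)$")


def parse_ls_refs_trace(lines: Iterable[str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (refs, symrefs) found in packet trace lines."""
    refs: Dict[str, str] = {}
    symrefs: Dict[str, str] = {}
    for line in lines:
        match = _REF_LINE.search(line)
        if match is None:
            continue
        sha, name, attrs = match.groups()
        refs[name] = sha
        for attr in attrs.split():
            if attr.startswith("symref-target:"):
                symrefs[name] = attr[len("symref-target:"):]
    return refs, symrefs
```

Run over the trace above, this would report HEAD as a symref to refs/heads/main and no refs/heads/master entry at all, which matches what I see by eye.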
There are two things I am seeing here that seem interesting. You seem to be able to reproduce one of them but not the other:
1. A disparity in objects and pack entries compared with the git command (any version) or the version 1 (i.e. version=0) porcelain clone. You seem to get this same result.
2. A master branch instead of a main branch, and I don't see anywhere in the repo data where a master branch would be coming from -- it seems to come out of thin air. You don't appear to be getting that result.
It might be useful to focus on the disparity in objects and pack entries, since you are able to reproduce that (check whether you see 248/247 running with version=0; I expect you will). I expect that the issue I am seeing is related to / caused by whatever causes that.
I am getting a Raspberry Pi I have lying around set up to try this too so I can see if the problem exists there. That is taking some time because I have to update Python on it. I will update when I have results there.
I freshly installed Python 3.12 on my Pi, cloned dulwich and checked out the dulwich-0.22.3 branch, then ran my test and got the same result there. That was with my non-HPE GitHub user (like what I used on Ubuntu) and nothing special in my environment. So far, I have not seen an environment in which this doesn't happen. I am going to stop looking for that.
I think I know what's happening; with v2 there are two separate fetches of capabilities, and the "symref" capability is only extracted from the first one, while it appears in the second.
(The symref capability provides information as to where HEAD will point, with dulwich defaulting to "master" if none is provided by the server)
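A simplified model of that failure mode, for readers who want to see it concretely. This is not dulwich's actual code: the function names are hypothetical, and both capability sets are modelled as lists of b"symref=<name>:<target>" style entries purely for illustration.

```python
from typing import Dict, Iterable

DEFAULT_HEAD = b"refs/heads/master"  # fallback when the server names no target


def extract_symrefs(capabilities: Iterable[bytes]) -> Dict[bytes, bytes]:
    """Collect b"symref=<name>:<target>" entries into a dict."""
    symrefs = {}
    for cap in capabilities:
        if cap.startswith(b"symref="):
            name, _, target = cap[len(b"symref="):].partition(b":")
            symrefs[name] = target
    return symrefs


def head_target(*capability_sets: Iterable[bytes]) -> bytes:
    """Pick HEAD's target from every capability set passed in."""
    symrefs = {}
    for caps in capability_sets:
        symrefs.update(extract_symrefs(caps))
    return symrefs.get(b"HEAD", DEFAULT_HEAD)
```

With first = [b"fetch=shallow"] and second = [b"symref=HEAD:refs/heads/main"], head_target(first) falls back to b"refs/heads/master", while head_target(first, second) returns b"refs/heads/main". Looking only at the first set is the kind of mistake that would produce a master branch out of thin air.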
@erl-hpe Would you be able to double check that this fixes the issue for you as well? I'm a bit hesitant to declare it fixed since I had trouble reproducing it earlier.
I just tried it on my Pi, and now I am seeing no branches at all (i.e. no main and no master).
(venv) pi@softrock:~/dulwich $ git log
commit b1287d36edf7f54d38a0ed93021f2dc84f6db027 (HEAD -> master, origin/master, origin/HEAD)
Merge: 15d6c817 f1075d25
Author: Jelmer Vernooij <jelmer@jelmer.uk>
Date: Sun Oct 20 13:11:57 2024 +0100
Fix handling of symrefs with protocol v2. Fixes #1389 (#1392)
commit f1075d25f3a1ab7011f77290dfc3dbd3fb3f29c6
Author: Jelmer Vernooij <jelmer@jelmer.uk>
Date: Sun Oct 20 02:08:56 2024 +0100
Fix handling of symrefs with protocol v2. Fixes #1389
commit c6abf72bc691981e6468dd8b8e60151d6cdeb9df
Author: Jelmer Vernooij <jelmer@jelmer.uk>
Date: Sun Oct 20 00:34:12 2024 +0000
Factor out extraction of symrefs
commit 15d6c817bc5e2628a9a8eb3d8c4326f1bd86eb24
Merge: cd30df4e 7b881b38
Author: Jelmer Vernooij <jelmer@jelmer.uk>
..... <SNIP> ....
(venv) pi@softrock:~/dulwich $ pip install .
Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
Processing /home/pi/dulwich
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: urllib3>=1.25 in /home/pi/venv/lib/python3.12/site-packages (from dulwich==0.22.3) (2.2.3)
Building wheels for collected packages: dulwich
Building wheel for dulwich (pyproject.toml) ... done
Created wheel for dulwich: filename=dulwich-0.22.3-cp312-cp312-linux_armv7l.whl size=262055 sha256=338069918e97e8c931d5fc8f58274e06dde382da054205350cb46a1d75254fd1
Stored in directory: /tmp/pip-ephem-wheel-cache-a302jopd/wheels/11/aa/07/b3f3284398e3f6df0f82c9b0189035fbf66bbec7d43ca54c0b
Successfully built dulwich
Installing collected packages: dulwich
Attempting uninstall: dulwich
Found existing installation: dulwich 0.22.3
Uninstalling dulwich-0.22.3:
Successfully uninstalled dulwich-0.22.3
Successfully installed dulwich-0.22.3
(venv) pi@softrock:~/dulwich $ cat do_clone.py
from dulwich.porcelain import clone
url='git@github.com:Cray-HPE/vtds-configs.git'
target='foo'
clone(url, target)
(venv) pi@softrock:~/dulwich $ python do_clone.py
Enumerating objects: 207, done.
Counting objects: 100% (207/207), done.
Compressing objects: 100% (78/78), done.
Total 207 (delta 94), reused 204 (delta 93), pack-reused 0 (from 0)
copied 206 pack entries06/207
(venv) pi@softrock:~/dulwich $ cd foo
(venv) pi@softrock:~/dulwich/foo $ git branch -l
(venv) pi@softrock:~/dulwich/foo $
You're right of course. It looks like the tests don't actually check any of this for the v2 protocol :-/
Ah, maybe I was only seeing this because I was using the https protocol rather than git+ssh when trying to reproduce it - I have URL rewrite rules in ~/.git/config.
So I'm a little bit further and have narrowed it down to the code that populates refs after the fetch operation.
Hi, I was just going to take a look at this. It seems you already found and fixed it in b1287d36edf7f54d38a0ed93021f2dc84f6db027? Anything else that needs to be done?
Ah. Yes, the git+ssh protocol thing sounds like the likely difference. Glad you were finally able to repro it. I was beginning to wonder if I had lost my mind (not a foregone conclusion). Based on what I am seeing this morning, with a fresh update of my clone of dulwich, I don't think this is quite fixed yet. It still doesn't show any branches or remote refs using the git+SSH protocol on v2. Could be you are still looking into it? No worries if you are, just want to make sure you aren't thinking it is fixed (I see a lot of commits that suggest maybe you think that).
Using dulwich.porcelain clone() at the 0.22.3 release level to clone
I wind up with refs that contain a single branch named 'master'. Unfortunately, the remote repository has no 'master' branch, but it does have a 'main' branch. In other words, 'master' in the clone should be 'main' but is not. When I do the same thing with 0.22.1 I get what I expect.
Here is a transcript of the (correct) 0.22.1 behavior:
and the list of branches in the resulting repo:
Here is what I see with 0.22.3:
and the branches: