eli-schwartz / aurpublish

PKGBUILD management framework for the Arch User Repository
GNU General Public License v2.0

Subtree history including irrelevant commit #28

Open alerque opened 11 months ago

alerque commented 11 months ago

Somehow for the last few weeks I've been having trouble pushing most (but strangely not quite all) of the packages in my aurpublish-managed repository to the AUR. This started about the time https://gitlab.archlinux.org/archlinux/devtools/-/commit/be5f54c95cbbcf46598e23aa456075cbb26806c0 would have landed in devtools, which seems to be similar but orthogonal to the AUR hosting or aurpublish tooling. Most of them fail to push being blocked by the hook. They seem to be trying to push the initial commit to my repo which is just the readme, nothing relevant to the subtrees for each package.

alerque commented 11 months ago

I've been playing around with git subtree manually, and on the remote end of a push I'm getting nearly the whole Git history, including all of the other packages, even though the working directory ends up being just the exported package. I think something must have changed in Git itself.

eli-schwartz commented 9 months ago

When a git subtree gets pushed, there are really two things to look at. Under the hood:

At the remote end, the remote receives:

The main thing that matters is the requested ref updates. There really should be only one: the printed sha1 -> master. Which objects get transferred comes down to how git negotiates a set that roughly balances criteria such as being fast, cheap to compute, and small in network impact. This negotiation differs depending on whether you push via the filesystem (copying a file is really cheap) or over an ssh/https connection (size is important).

What this means is that running it manually may show you a lot of objects available in the pushed location, but the actual refs "shouldn't" change or have anything extra.
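To make the distinction between transferred objects and ref updates concrete, here is a toy model in Python. It is only a sketch: `receive_push`, `changed_refs`, and the short object ids are hypothetical names invented for illustration, and real git negotiation is far more involved. The point it shows is that extra objects can land in the remote's object store while only the one requested ref moves.

```python
# Toy model (all names hypothetical): a remote is an object store plus a ref table.
def receive_push(remote_objects, remote_refs, sent_objects, ref_updates):
    """Store every object that was sent, then apply only the requested ref updates."""
    objects = set(remote_objects) | set(sent_objects)
    refs = dict(remote_refs)
    for ref, (old, new) in ref_updates.items():
        # A ref moves only if the sender asked for it and the old value matches.
        if refs.get(ref) != old:
            raise ValueError(f"stale ref {ref}")
        refs[ref] = new
    return objects, refs


def changed_refs(before, after):
    """Return the names of refs whose value differs between two ref tables."""
    return {r for r in set(before) | set(after) if before.get(r) != after.get(r)}


remote_objects = {"a1"}
remote_refs = {"refs/heads/master": "a1"}

# The sender ships more objects than strictly needed, but asks for one ref update.
sent = {"b2", "c3", "d4", "readme0"}
objects, refs = receive_push(
    remote_objects, remote_refs, sent, {"refs/heads/master": ("a1", "b2")}
)
changed = changed_refs(remote_refs, refs)
```

Here `objects` grows by four entries, including the unrelated `readme0`, yet `changed` contains only `refs/heads/master`.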

The AUR server-side receiving hook is based on a pygit2 commit tree walker that is installed as a githooks(5) "update" hook. The only information it receives is the ref name and the old and new commit sha1s. The walker processes the new commit sha1 and verifies that every ancestor fulfills the criteria (e.g. has a PKGBUILD and a .SRCINFO, and the latter is well formed), and it also handles administrative stuff like denying force pushes.
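The kind of check described above can be sketched in plain Python over an in-memory commit graph. This is not the AUR hook's actual code and does not use pygit2: `check_update`, `ancestors`, the dict layout, and the short commit ids are all hypothetical. It also assumes that commits already reachable from the old sha1 are skipped, and it leaves out the .SRCINFO well-formedness check.

```python
# Hypothetical sketch of an "update"-style check; not the real AUR hook.
ZERO = "0" * 40  # git passes an all-zero old sha1 when a ref is being created
REQUIRED_FILES = {"PKGBUILD", ".SRCINFO"}


def ancestors(commits, start):
    """Return every commit id reachable from start, including start itself."""
    seen, stack = set(), [start]
    while stack:
        sha = stack.pop()
        if sha in seen:
            continue
        seen.add(sha)
        stack.extend(commits[sha]["parents"])
    return seen


def check_update(commits, refname, old, new):
    """Return a list of error strings; an empty list means the push is accepted."""
    errors = []
    reachable_new = ancestors(commits, new)
    already_known = ancestors(commits, old) if old != ZERO else set()
    if old != ZERO and old not in reachable_new:
        errors.append(f"{refname}: denying non-fast-forward")
    # Assumption: only commits not already reachable from old are inspected.
    for sha in sorted(reachable_new - already_known):
        missing = REQUIRED_FILES - commits[sha]["files"]
        if missing:
            errors.append(f"{sha}: missing {', '.join(sorted(missing))}")
    return errors


# A split history that still contains a readme-only root commit.
bad = {
    "readme": {"parents": [], "files": {"README.md"}},
    "pkg1": {"parents": ["readme"], "files": {"PKGBUILD", ".SRCINFO"}},
    "pkg2": {"parents": ["pkg1"], "files": {"PKGBUILD", ".SRCINFO"}},
}
bad_errors = check_update(bad, "refs/heads/master", ZERO, "pkg2")

# A clean history where every commit carries both files.
clean = {
    "c1": {"parents": [], "files": {"PKGBUILD", ".SRCINFO"}},
    "c2": {"parents": ["c1"], "files": {"PKGBUILD", ".SRCINFO"}},
}
clean_errors = check_update(clean, "refs/heads/master", "c1", "c2")
rewind_errors = check_update(clean, "refs/heads/master", "c2", "c1")
```

In this model a single readme-only ancestor is enough to reject the whole push, even though the tip commit itself is fine, which matches the symptom reported in this issue.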

So what it comes down to is this:

Most of them fail to push being blocked by the hook. They seem to be trying to push the initial commit to my repo which is just the readme, nothing relevant to the subtrees for each package.

Why is git subtree split emitting a subtree history that contains the initial commit not present in subdirectories? Can you show an error log for one of these failures? The hook is supposed to tell you details about which precise commit sha1 it didn't like (and git tells you the sha1 that it tried to push). I took a quick look and couldn't reproduce the issue with random packages in your pkgbuilds tree, but I may be missing something.