kpcyrd / what-the-src

Source code of https://whatsrc.org/
GNU General Public License v3.0

Some Arch sources not getting logged #28

Closed by alerque 1 month ago

alerque commented 1 month ago

We just experienced a tag getting yanked upstream at Electron (again; it's an endemic problem over there), and I came by to look for clues. It turns out some sources aren't getting logged. Notably, the main Electron source (a tagged Git source) isn't getting picked up, e.g. for electron29. All sorts of other sources are there, including other Git sources (tags and pinned commits), but not the star of that particular show.

Any idea what's up?

kpcyrd commented 1 month ago

Copying over my response from the Arch Linux bug tracker:

I checked and there are two reasons:

  • The maximum number of search results is 150, but electron29 29.4.3-1 has 154(!) build inputs
    • I've decided to bump this to 250; the build pipeline is running and I'm going to deploy it in a few minutes
  • The git tag was deleted by upstream before it was imported by whatsrc
what-the-src=# select * from tasks where key like '%git+https://github.com/electron/electron.git%';
    id    |                                key                                 |                                         data                                         | retries |            error             
----------+--------------------------------------------------------------------+--------------------------------------------------------------------------------------+---------+------------------------------
 30905714 | git-clone:git+https://github.com/electron/electron.git#tag=v29.4.3 | {"GitSnapshot": {"url": "git+https://github.com/electron/electron.git#tag=v29.4.3"}} |       6 | Error in git fetch operation
(1 row)
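
For anyone reading rows like this one, the task key can be split into its parts with a few lines of Python. This is only a sketch: the `<kind>:<url>#<ref_type>=<ref>` layout is inferred from the single row shown above, and `TaskKey` and `parse_task_key` are hypothetical names, not part of what-the-src.

```python
from typing import NamedTuple


class TaskKey(NamedTuple):
    kind: str      # e.g. "git-clone"
    url: str       # e.g. "git+https://github.com/electron/electron.git"
    ref_type: str  # e.g. "tag" (empty string if the key has no fragment)
    ref: str       # e.g. "v29.4.3"


def parse_task_key(key: str) -> TaskKey:
    """Split a task key, assuming the layout '<kind>:<url>#<ref_type>=<ref>'."""
    # The first ':' separates the task kind from the source URL.
    kind, sep, rest = key.partition(":")
    if not sep:
        raise ValueError(f"no task kind in key: {key!r}")
    # The '#' fragment carries the pinned ref, e.g. 'tag=v29.4.3'.
    url, _, fragment = rest.partition("#")
    ref_type, _, ref = fragment.partition("=")
    return TaskKey(kind, url, ref_type, ref)


row_key = "git-clone:git+https://github.com/electron/electron.git#tag=v29.4.3"
task = parse_task_key(row_key)
print(task.kind, task.ref_type, task.ref)
```

With the row above, this identifies a `git-clone` task pinned to the tag `v29.4.3`, which is the tag that no longer exists upstream.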

PS: instead of using the search, you can use the hash from sha256sums= directly. From the PKGBUILD:

        # END managed sources
        )
sha256sums=('5a4318f8afcf5f55762b72e9dff0c00b14962582c1b51b8721ec6a0e3d01d222'
            '7916b80d801bcc5c23cb9dd1ae820d939af3ef640dbcb2a3c8d6780dcf6ba7a3'
            '8c256b2a9498a63706a6e7a55eadbeb8cc814be66a75e49aec3716c6be450c6c'

which means the link for the first source= entry is:

https://whatsrc.org/artifact/sha256:5a4318f8afcf5f55762b72e9dff0c00b14962582c1b51b8721ec6a0e3d01d222

Since the import failed, this link returns a 404, though.
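
The checksum-to-link step above can be sketched in Python. Only the URL pattern is taken from the link in this comment; `artifact_url` and `extract_sha256_hexes` are hypothetical helper names, and the regex is a naive assumption that picks up every quoted 64-character hex string, so it skips `SKIP` entries and does not tell different checksum arrays apart.

```python
import re

# URL prefix taken from the artifact link quoted above.
ARTIFACT_BASE = "https://whatsrc.org/artifact/sha256:"


def artifact_url(sha256_hex: str) -> str:
    """Build the whatsrc artifact link for one sha256 checksum."""
    digest = sha256_hex.strip().lower()
    if not re.fullmatch(r"[0-9a-f]{64}", digest):
        raise ValueError(f"not a sha256 hex digest: {sha256_hex!r}")
    return ARTIFACT_BASE + digest


def extract_sha256_hexes(pkgbuild_text: str) -> list[str]:
    """Naively collect single-quoted 64-character hex strings, in order."""
    return re.findall(r"'([0-9a-f]{64})'", pkgbuild_text)


snippet = """
sha256sums=('5a4318f8afcf5f55762b72e9dff0c00b14962582c1b51b8721ec6a0e3d01d222'
            '7916b80d801bcc5c23cb9dd1ae820d939af3ef640dbcb2a3c8d6780dcf6ba7a3'
"""

for digest in extract_sha256_hexes(snippet):
    print(artifact_url(digest))
```

A link built this way only resolves if the import succeeded, so a 404 is itself a hint that the source never made it into whatsrc.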

PPS: the search limit is now 250.

If an artifact is missing, there's a chance the import failed. For chromium specifically, there are exclusion rules: their tarballs are ridiculously large and were causing all kinds of issues, so I don't want to deal with them. Their tarballs are bigger (both in size and in number of files) than Firefox and the Linux kernel combined. I didn't need to add exclusion rules for any other software project.