go-gitea / gitea

Git with a cup of tea! Painless self-hosted all-in-one software development service, including Git hosting, code review, team collaboration, package registry and CI/CD
https://gitea.com
MIT License

Debian Package Upload Issues #32037

Open nephatrine opened 5 days ago

nephatrine commented 5 days ago

Description

I've started trying to publish my deb packages in Gitea's package registry for Ubuntu and Debian using curl as documented in the Gitea package registry docs. The curl command takes a very long time to complete (~3 minutes for a 3 KB file) and Gitea is totally unresponsive while it is running. That's a pretty big issue for me in itself, but I could deal with it if that was the only issue. After the command completes, the .deb file does appear to have been successfully uploaded. I see it under Packages in the web UI and can download it manually from there.

When I follow the instructions on adding the repo to an actual system to pull down the packages with apt though, I get this error:

E: The repository 'https://code.nephatrine.net/api/packages/nephatrine/debian noble Release' does not have a Release file.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.

In my gitea access logs I see:

172.17.0.1 - - [12/Sep/2024:17:12:33 -0400] "GET /api/packages/nephatrine/debian/dists/noble/InRelease HTTP/1.0" 404 27 "" "Debian APT-HTTP/1.3 (2.7.14)"
172.17.0.1 - - [12/Sep/2024:17:12:33 -0400] "GET /api/packages/nephatrine/debian/dists/noble/Release HTTP/1.0" 404 27 "" "Debian APT-HTTP/1.3 (2.7.14)"

If I try to navigate to that path with a web browser, it tells me "package file does not exist", so it does seem that the Release files aren't being properly generated?

On the demo site, I uploaded a couple of debian packages and did not run into this issue and http://demo.gitea.com/api/packages/nephatrine/debian/dists/noble/Release is accessible and works. So this seems specific to my setup in some way.
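For anyone comparing two instances the same way, here is a minimal Python sketch that requests the two index paths apt asked for in the access log above and reports the HTTP status of each. The helper names `release_urls` and `http_status` are made up for illustration and are not part of Gitea; only the URL layout is taken from the log lines in this report.

```python
import urllib.error
import urllib.request


def release_urls(base_url, owner, distribution):
    """Build the InRelease and Release URLs that apt requested in the access log."""
    root = f"{base_url.rstrip('/')}/api/packages/{owner}/debian/dists/{distribution}"
    return [f"{root}/InRelease", f"{root}/Release"]


def http_status(url, timeout=10):
    """Return the HTTP status code of a GET request, including 4xx/5xx codes."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on error statuses; the code is what we want to see (e.g. 404)
        return err.code
```

Calling `http_status(url)` for each entry of `release_urls("https://code.nephatrine.net", "nephatrine", "noble")` should show 404 for both while the index files are missing, matching the log above.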

On my own site, I've been using the Alpine, RPM, and Container registries successfully - it's just Debian that seems to have major issues for me. I am struggling to determine what is causing the failure. Just from the enormous amount of time that curl command takes and the whole application becoming unresponsive, I figure Gitea is trying to do some sort of processing of the uploaded file to generate the Release/InRelease and other repository files and whatnot but that step is failing.

I do see lines like this in the logs:

2024/09/12 13:52:44 ...ges/debian/debian.go:27:apiError() [E] context canceled

I can tell by the number of Slow SQL Query that my server is kinda on the weak side it seems, but I would expect things should still generally work even if hobbled a little. Is there some sort of runtime requirement for the debian repositories to work that maybe I am missing?

Gitea Version

1.22.2

Can you reproduce the bug on the Gitea demo site?

No

Log Gist

https://gist.github.com/nephatrine/6e18a5cf2a05e006572e55b223cc9ee3

Screenshots

No response

Git Version

2.45.2

Operating System

Alpine 3.20

How are you running Gitea?

I build Gitea myself and run it from my own docker container on an unRAID server. I'm the only actual user on it.

Database

SQLite

nephatrine commented 5 days ago

I set the logging to Trace so I could get a better insight and this time it created the Release/InRelease files without issue. I updated my gist to include that too.

They're still taking 3-4 minutes to complete and the curl command doesn't complete until that finishes. I have to curl it directly to the container's IP because if I go through my reverse proxy I get a 504 gateway timed out before the command finishes.

2024/09/13 22:27:29 ...s/process/manager.go:188:Add() [T] Start 66e4f491: PUT: /api/packages/nephatrine/debian/pool/bookworm/main/upload (request)
2024/09/13 22:27:29 ...packages/packages.go:123:createPackageAndVersion() [T] Creating package: 1, 1, debian, libhello-test, 0.0.1+15.git7166d1f2, map[], map[], true
2024/09/13 22:27:29 ...packages/packages.go:263:addFileToPackageVersionUnchecked() [T] Adding package file: 37627, libhello-test_0.0.1+15.git7166d1f2_amd64.deb
...
2024/09/13 22:31:01 ...packages/packages.go:263:addFileToPackageVersionUnchecked() [T] Adding package file: 37602, Release
2024/09/13 22:31:01 ...packages/packages.go:263:addFileToPackageVersionUnchecked() [T] Adding package file: 37602, Release.gpg
2024/09/13 22:31:01 ...packages/packages.go:263:addFileToPackageVersionUnchecked() [T] Adding package file: 37602, InRelease
2024/09/13 22:31:01 ...s/process/manager.go:231:remove() [T] Done 66e4f491: PUT: /api/packages/nephatrine/debian/pool/bookworm/main/upload
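To put a number on the request duration, the timestamps of the Start and Done trace lines above can be subtracted. This is a small Python sketch that assumes every log line begins with a 19-character `YYYY/MM/DD HH:MM:SS` timestamp, as in the excerpt. The function names are illustrative only.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y/%m/%d %H:%M:%S"


def log_timestamp(line):
    """Parse the leading 'YYYY/MM/DD HH:MM:SS' timestamp of a log line."""
    return datetime.strptime(line[:19], LOG_TIME_FORMAT)


def elapsed_seconds(start_line, done_line):
    """Seconds between two log lines, based on their leading timestamps."""
    return (log_timestamp(done_line) - log_timestamp(start_line)).total_seconds()


start = "2024/09/13 22:27:29 ...s/process/manager.go:188:Add() [T] Start 66e4f491: PUT: /api/packages/nephatrine/debian/pool/bookworm/main/upload (request)"
done = "2024/09/13 22:31:01 ...s/process/manager.go:231:remove() [T] Done 66e4f491: PUT: /api/packages/nephatrine/debian/pool/bookworm/main/upload"

print(elapsed_seconds(start, done))  # 212.0, i.e. 3 minutes 32 seconds
```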

I'm going to try again with my actions workflow and see if it works with that. It could be that maybe publishing a whole bunch of different packages for different distributions and architectures in quick succession is just too much for my poor server to handle.

(I do think it is still ludicrous how long it's taking to generate the Release/InRelease/etc. files in my setup though. Is that just a CPU and/or SQLite issue? Is there a way to have it defer that Release/InRelease processing until a scheduled task or something runs? If I upload like 10 packages I have to wait 3-4 extra minutes for each one, tying up the system for a lot longer than it really needs to be.)

KN4CK3R commented 4 days ago

Your GetRunnerByUUID queries are really slow. This table should only contain a handful of entries...

[Slow SQL Query] SELECT id, uuid, name, version, owner_id, repo_id, description, base, repo_range, token_hash, token_salt, last_online, last_active, agent_labels, created, updated, deleted FROM action_runner WHERE (uuid=?) AND (deleted=? OR deleted IS NULL) LIMIT 1 [4a26520e-bf24-4b5a-9661-a341858f0a26 0] - 3m40.858735127s

Are the package files stored on the local filesystem or remote via min.io? Some virus scanner checking file access?

nephatrine commented 4 days ago

The action_runner table has 35 entries. Only 5 runners are actually set up currently, but it looks like it keeps the information for all the old ones I've removed. Still, that's not many entries. The runners seem to fetch and process their tasks very snappily.

The only time I've ever noticed a performance issue is when I go to the Packages tab on my user or organization page. Loading the packages page is sluggish. The packages are stored locally, though they are on slower HDDs instead of SSD just because the container images end up taking up a lot of space. I've got quite a lot of container images stored and many of those are quite large. Wish I could store the container stuff separately from all the other package types tbh.

But still I don't have any problems uploading other package types - just Debian. Is uploading a DEB significantly different than uploading an RPM? I assume both would need to rebuild their specific package index/metadata in a similar way, but uploading an RPM doesn't hang curl for 3 minutes.

lunny commented 4 days ago

The database disk might have an issue, as it’s performing too slowly for such a small table.
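One rough way to test that suspicion is to time the same kind of lookup outside Gitea. The Python sketch below is only an illustration: it builds an in-memory stand-in for `action_runner` with 35 rows and three of the columns from the slow query, so it shows the technique and not the real schema. The `time_query` and `demo` names are invented for this example.

```python
import sqlite3
import time


def time_query(conn, sql, params=()):
    """Run one query and return (rows, elapsed_seconds)."""
    started = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return rows, time.perf_counter() - started


def demo():
    """Time a uuid lookup against a tiny in-memory stand-in table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE action_runner (id INTEGER PRIMARY KEY, uuid TEXT, deleted INTEGER)"
    )
    conn.executemany(
        "INSERT INTO action_runner (uuid, deleted) VALUES (?, ?)",
        [(f"runner-{i}", 0) for i in range(35)],
    )
    rows, elapsed = time_query(
        conn,
        "SELECT id, uuid FROM action_runner "
        "WHERE uuid = ? AND (deleted = ? OR deleted IS NULL) LIMIT 1",
        ("runner-7", 0),
    )
    conn.close()
    return rows, elapsed
```

To run `time_query` against a copy of the real database instead, the connection could be opened read-only with `sqlite3.connect("file:/path/to/gitea.db?mode=ro", uri=True)`, where the path is a placeholder. A lookup on a table this small that takes more than a few milliseconds there would point at storage or lock contention and not at the query itself.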

nephatrine commented 3 days ago

So performance of the action_runner table is impeding the upload of debian (and only debian) packages? I'm not going to claim that this is the most performant server ever - it's an old PC repurposed to run a couple of servers in my home that mostly just builds docker container images at this point. I don't need it to act as if it's running on some big performant actual server hardware. But something is clearly not right if uploading the same number and size of RPMs takes just over a minute and uploading the DEB versions both takes 46 minutes and locks up gitea so hard it stops responding to other requests for nearly that entire duration. Even uploading multi-gigabyte container images takes less time than a couple of absolutely minuscule DEB packages and I can keep using Gitea for other things concurrently during that.

I'd just like to understand what is different about the debian registry. Is it expected to have significantly higher server requirements to host? If it's simply a case of my server not being beefy enough and the above situation is expected in such instances, then so be it. It just doesn't read that way to me.

btw I do know that asking people to help troubleshoot an issue you are clearly not experiencing yourself and maybe can't reproduce at all for free is asking a whole lot so please excuse me if I am coming off as sounding entitled. just frustrated by the prospect of having to continue to run my own artifact repository alongside gitea on this server just for debian lol. I just don't even know where to start trying to troubleshoot myself since I'm not really sure what gitea is doing with the debs that it wouldn't also have to be doing for the rpms.

KN4CK3R commented 2 days ago

There is nothing special about the deb registry. Performance should be similar to RPM. How many deb packages does your database contain? Probably not many, given such long upload times?

nephatrine commented 2 days ago

I am still in the testing phase trying to make sure this works before sending real production packages to Gitea, so it's just a set of test hello world packages right now. The debian package does build into multiple packages though (bin/lib/doc/dbg, etc.) and across multiple architectures and distributions so there's probably about 44 actual .deb files there though they're split between 7 packages as Gitea sees it. Even if I delete every single one of them and upload just a single deb though I still have the same issues.

It's not at all uncommon for a single repository to build into 3-7 split .deb files, multiplied by 4-5 architectures (for the architecture-dependent packages at least), and then multiplied again for at least three different debian-style distros. With each individual .deb tying up the system for no less than 3 whole minutes, publishing the full suite of debs for just one program or library that we want to build and use can take down the entire website (well, the gitea subdomain at least) for a considerable amount of time.
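As a back-of-the-envelope check of those figures, here is the multiplication written out in Python. It assumes every split package is architecture-dependent, which overstates the total somewhat since doc-style packages are usually built once, so treat the result as a rough range and not a measurement.

```python
MINUTES_PER_DEB = 3  # "no less than 3 whole minutes" per uploaded .deb


def blocked_minutes(split_packages, architectures, distros, minutes_per_deb=MINUTES_PER_DEB):
    """Total minutes the instance is tied up if every .deb blocks it in turn."""
    return split_packages * architectures * distros * minutes_per_deb


low = blocked_minutes(split_packages=3, architectures=4, distros=3)
high = blocked_minutes(split_packages=7, architectures=5, distros=3)

print(low, high)  # 108 315
```

So one release of a single project would block the instance for somewhere between roughly 1.8 and 5.25 hours under these assumptions.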

I'm telling you, it locks up the entirety of Gitea. It's as if something that should be awaiting/asynchronous or similar is blocking instead, which might be why it's not noticeable on a very fast system. I mean, Gitea won't even serve html while a debian package is uploading for me (or more correctly during the time between the actual .deb being received and the Release/InRelease file being generated) - the site can't even be browsed or used at a basic level. And this does not happen for alpine, rpm, or container packages, which at times are themselves sluggish to access through the user/organization Packages page for me, but have no perceptible effect on the general operation of the website or other tasks being performed.

I have been doing some digging trying to determine potential causes of slowness on my side and have opened a separate issue for what I think is the culprit for my general performance issues (https://github.com/go-gitea/gitea/issues/32053). I am hoping that if I can resolve that, it will solve both the sluggishness accessing my user/org Packages page and whatever the heck the Debian packages are doing. Still, based on my experience above, I think there is a genuine issue with how Debian packages are handled that the other package types clearly do not have. Correcting my general performance issues wrt packages might clear my symptoms without getting at the actual issue.