go-gitea / gitea

Git with a cup of tea! Painless self-hosted all-in-one software development service, including Git hosting, code review, team collaboration, package registry and CI/CD
https://gitea.com
MIT License

RPM Registry: primary.xml <location> should not url encode caret (^) characters #32021

Closed nephatrine closed 1 day ago

nephatrine commented 1 week ago

Description

When uploading a file with a name like hello-test-0.0.1^15.git7166d1f2-1.el9.x86_64.rpm to the RPM registry, I am able to see the file correctly in the Gitea packages UI and can manually download the RPM from there, but DNF cannot download it on an actual system. DNF does see that the package exists and tries to download it, but gets 404 errors:

Downloading Packages:
[MIRROR] hello-test-0.0.1%5E15.git7166d1f2-1.el9.x86_64.rpm: Status code: 404 for https://example.com/api/packages/testuser/rpm/almalinux/el9/package/hello-test/0.0.1%255E15.git7166d1f2-1.el9/x86_64/hello-test-0.0.1%255E15.git7166d1f2-1.el9.x86_64.rpm (IP: 136.56.234.199)
[MIRROR] hello-test-0.0.1%5E15.git7166d1f2-1.el9.x86_64.rpm: Status code: 404 for https://example.com/api/packages/testuser/rpm/almalinux/el9/package/hello-test/0.0.1%255E15.git7166d1f2-1.el9/x86_64/hello-test-0.0.1%255E15.git7166d1f2-1.el9.x86_64.rpm (IP: 136.56.234.199)
[FAILED] hello-test-0.0.1%5E15.git7166d1f2-1.el9.x86_64.rpm: No more mirrors to try - All mirrors were already tried without success

In my Gitea logs, I can see those 404 attempts:

172.17.0.1 - - [10/Sep/2024:09:40:55 -0400] "GET /api/packages/testuser/rpm/almalinux/el9/repodata/repomd.xml HTTP/1.0" 200 1244 "" "libdnf (AlmaLinux 9.4; generic; Linux.x86_64)"
172.17.0.1 - - [10/Sep/2024:09:40:57 -0400] "GET /api/packages/testuser/rpm/almalinux/el9/package/hello-test/0.0.1%255E15.git7166d1f2-1.el9/x86_64/hello-test-0.0.1%255E15.git7166d1f2-1.el9.x86_64.rpm HTTP/1.0" 404 22 "" "libdnf (AlmaLinux 9.4; generic; Linux.x86_64)"
172.17.0.1 - - [10/Sep/2024:09:40:57 -0400] "GET /api/packages/testuser/rpm/almalinux/el9/package/hello-test/0.0.1%255E15.git7166d1f2-1.el9/x86_64/hello-test-0.0.1%255E15.git7166d1f2-1.el9.x86_64.rpm HTTP/1.0" 404 22 "" "libdnf (AlmaLinux 9.4; generic; Linux.x86_64)"

If I put one of those URLs into my web browser, for example https://example.com/api/packages/testuser/rpm/almalinux/el9/package/hello-test/0.0.1%255E15.git7166d1f2-1.el9/x86_64/hello-test-0.0.1%255E15.git7166d1f2-1.el9.x86_64.rpm, I do get the message "package does not exist".

If, however, I change those instances of %255E in the URL to %5E, the URL works, so it seems the caret is being URL-encoded twice. Looking at the repodata/primary.xml.gz that Gitea produces, I see that the <location> field already has the caret encoded as %5E. In major RPM repositories like EPEL (https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/repodata/) this is not the case: the location does not have carets URL-encoded. In my own testing, a version of the Gitea repository files that does not pre-encode the caret in the <location> field works: the package manager can download the packages, and this seems to match the behaviour of other RPM repositories.
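To illustrate the double encoding described above, here is a minimal Go sketch. It assumes the repository percent-encodes the version once before writing it into primary.xml, and that the client percent-encodes the value again when building the request URL. The helper names (encodeOnce, encodeTwice, serverSees) are hypothetical and are not taken from Gitea or DNF.

```go
package main

import (
	"fmt"
	"net/url"
)

// encodeOnce models the repository percent-encoding a path segment
// before writing it into primary.xml (hypothetical helper).
func encodeOnce(segment string) string {
	return url.PathEscape(segment)
}

// encodeTwice models a client that percent-encodes the already
// encoded segment again when it builds the download URL (assumption).
func encodeTwice(segment string) string {
	return url.PathEscape(encodeOnce(segment))
}

// serverSees models the server decoding the request path exactly once.
func serverSees(requested string) (string, error) {
	return url.PathUnescape(requested)
}

func main() {
	version := "0.0.1^15.git7166d1f2-1.el9"

	requested := encodeTwice(version)
	decoded, err := serverSees(requested)
	if err != nil {
		panic(err)
	}

	fmt.Println("in metadata:", encodeOnce(version))
	fmt.Println("requested:  ", requested)
	fmt.Println("server sees:", decoded)
	// After a single decode the server is left with a literal "%5E",
	// which does not match the stored version containing "^".
	fmt.Println("matches stored version:", decoded == version)
}
```

Under these assumptions, the value the server decodes still contains %5E rather than ^, which is consistent with the 404 responses in the logs.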

I have reproduced the issue on the demo site here: https://demo.gitea.com/nephatrine/-/packages/rpm/hello-test/0.0.1%5E15.git7166d1f2-1.el9

Gitea Version

1.22.2

Can you reproduce the bug on the Gitea demo site?

Yes

Log Gist

No response

Screenshots

No response

Git Version

2.45.2

Operating System

Alpine 3.20

How are you running Gitea?

I build Gitea myself and run it from my own docker container.

Database

SQLite

nephatrine commented 1 week ago

This is the Fedora documentation on the usage of the caret in the package version (https://docs.fedoraproject.org/en-US/packaging-guidelines/Versioning/#_handling_non_sorting_versions_with_tilde_dot_and_caret) just to state that this is a real use case. There are packages on EPEL8 and EPEL9 that contain such characters as well.

I tested that the issue occurs using the standard DNF package manager on AlmaLinux 8 and AlmaLinux 9. Presumably actual RHEL and Rocky Linux 8/9 would have the same issue, as they all use DNF.

As RHEL7 does not support the caret operator in the version to begin with, I did not test CentOS 7 or anything else that old, as it's a moot point.

It looks like openSUSE has different versioning guidelines for post-versions, so a caret theoretically wouldn't appear in packages intended for it. I did test on an openSUSE system, though, and Zypper has the same encoding behaviour as DNF, so it 404s trying to use a URL containing %255E instead of just %5E. It works fine if the <location> field has the unencoded ^.

It might be that there's some other bizarre RPM-based package manager that both allows a caret in the package version and requires it to be URL-encoded in the <location> field, but I am not aware of any. I do not think correcting this would break any extant RPM package manager, and it brings Gitea in line with how other RPM repos appear to function.

KN4CK3R commented 5 days ago

I encode the names in the link: https://github.com/go-gitea/gitea/blob/f05d9c98c4cb95e3a8a71bf3e2f8f4529e09f96f/services/packages/rpm/repository.go#L440-L442

I could not find any information on whether this field should be URL-safe. It's rather unusual not to escape the URL. But it looks like the clients parse the URL and encode the parts (again)?!

wxiaoguang commented 5 days ago

It seems that the field "location" is just a relative path, not a URL, so it doesn't (shouldn't) need to be encoded.
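As a rough sketch of what treating "location" as a plain relative path could look like, the Go snippet below builds the href from the raw name, version and architecture without any percent-encoding and serialises it with encoding/xml. The path layout is inferred from the request URLs in the logs above. The type and function names (location, locationHref, renderLocation) are hypothetical, and this is not Gitea's actual implementation.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// location is a hypothetical, minimal model of the <location> element
// in primary.xml; the real metadata has more fields around it.
type location struct {
	XMLName xml.Name `xml:"location"`
	Href    string   `xml:"href,attr"`
}

// locationHref builds the relative path without percent-encoding,
// leaving any URL escaping to the client that requests the file.
func locationHref(name, version, arch string) string {
	return fmt.Sprintf("package/%s/%s/%s/%s-%s.%s.rpm",
		name, version, arch, name, version, arch)
}

// renderLocation serialises the element; encoding/xml escapes only
// XML-special characters, so the caret is written as-is.
func renderLocation(name, version, arch string) (string, error) {
	out, err := xml.Marshal(location{Href: locationHref(name, version, arch)})
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	out, err := renderLocation("hello-test", "0.0.1^15.git7166d1f2-1.el9", "x86_64")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

With the href left unencoded, a client that percent-encodes the path itself would request %5E once, which is the form reported to work earlier in this issue.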

See the official example:

image

wxiaoguang commented 5 days ago

One more example, no encoding for the "location" relative path:

image