rpm-software-management / dnf5

Next-generation RPM package management system

Invalid mirror reported with repoquery --location #1673

Open praiskup opened 2 weeks ago

praiskup commented 2 weeks ago

Note this output:

[root@ip-172-30-2-126 /]# dnf5 repoquery --location bash-0:5.2.32-1.fc41.x86_64
Updating and loading repositories:
 fedora    100% | 236.2 KiB/s |  27.2 KiB |  00m00s
Repositories loaded.
https://d2lzkl7pfhq30w.cloudfront.net/pub/fedora/linux/development/rawhide/Everything/x86_64/os/Packages/b/bash-5.2.32-1.fc41.x86_64.rpm

bash with Release=1 is installed there, so I asked where it would come from. DNF reports the cloudfront.net URL, but that mirror already provides the Release=2 package.

It seems that DNF picked a wrong (outdated) mirror to load metadata from, and then assumed that every package listed in that stale metadata is also provided by the up-to-date cloudfront mirror, which is no longer the case.

See also: https://pagure.io/fedora-infrastructure/issue/12163

This happens with:

rpm-sequoia-1.7.0-2.fc41.x86_64
rpm-libs-4.19.92-6.fc41.x86_64
rpm-build-libs-4.19.92-6.fc41.x86_64
libsolv-0.7.30-1.fc41.x86_64
libdnf5-5.2.5.0-2.fc41.x86_64
libdnf5-cli-5.2.5.0-2.fc41.x86_64
dnf5-5.2.5.0-2.fc41.x86_64
dnf5-plugins-5.2.5.0-2.fc41.x86_64
rpm-4.19.92-6.fc41.x86_64

It also happens with DNF4 on F40 in the same location (EC2 box, us-east-1).

ppisar commented 2 weeks ago

I don't understand what bug you are reporting.

"dnf5 repoquery --location" does not report where a package was installed from. It reports where a package would be downloaded from. DNF does not validate whether the file exists there. It simply picks a file path from cached repository data and appends it to a mirror URL returned in a cached mirror manager response.

Naturally if the content of the mirror has changed in the meantime, then the printed URL will be invalid. DNF cannot know until it tries to request that URL from the server.
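As a rough illustration of the behavior described above, here is a minimal Python sketch of joining a cached package path onto a mirror base URL. It is not DNF's actual code; `build_location` and its parameters are hypothetical names, and the values come from the output in the report. No request is made, which is why the resulting URL can point at a file the mirror no longer has.

```python
def build_location(mirror_base: str, location_href: str) -> str:
    """Join a mirror base URL and a package path without validating the result."""
    # Hypothetical helper for illustration only; not part of DNF or librepo.
    return mirror_base.rstrip("/") + "/" + location_href.lstrip("/")


# Values taken from the report above; no network request is made.
mirror = "https://d2lzkl7pfhq30w.cloudfront.net/pub/fedora/linux/development/rawhide/Everything/x86_64/os/"
href = "Packages/b/bash-5.2.32-1.fc41.x86_64.rpm"
print(build_location(mirror, href))
```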

What behavior do you expect?

praiskup commented 2 weeks ago

This is not a race condition :)

> "dnf5 repoquery --location" does not report where a package was installed from. It reports where a package would be downloaded from.

Yes, except that it would not be downloaded from there, and the report was wrong. The reported package was available on a different mirror, not on the one reported.

> Naturally if the content of the mirror has changed in the meantime, then the printed URL will be invalid. DNF cannot know until it tries to request that URL from the server.

No content has changed in the meantime. Both before and after asking dnf5, d2lzkl7pfhq30w.cloudfront.net did not provide the reported URL; it had moved to a different metadata version a long time before. I also ran this query:

[root@ip-172-30-2-126 ~]# dnf repoquery --disablerepo '*' --enablerepo=hell --repofrompath=hell,https://d2lzkl7pfhq30w.cloudfront.net/pub/fedora/linux/development/rawhide/Everything/x86_64/os/ -a | grep bash
Added hell repo from https://d2lzkl7pfhq30w.cloudfront.net/pub/fedora/linux/development/rawhide/Everything/x86_64/os/
Last metadata expiration check: 0:00:59 ago on Tue 03 Sep 2024 01:03:30 PM UTC.
argbash-0:2.10.0-15.fc41.noarch
augeas-bash-completion-0:1.14.1-2.fc41.noarch
autorandr-bash-completion-0:1.13.3-5.fc41.noarch
bash-0:5.2.32-2.fc42.x86_64
bash-argsparse-0:1.8-5.fc41.noarch
bash-color-prompt-0:0.5-2.fc41.noarch
bash-completion-1:2.13-2.fc41.noarch
bash-completion-devel-1:2.13-2.fc41.noarch
bash-devel-0:5.2.32-2.fc42.x86_64

At that point in time, the cloudfront repo was correctly providing newer metadata and DNF did not know that.

> What behavior do you expect?

DNF should do some basic validation of mirrors, and provide a valid URL (race conditions are acceptable of course). If one mirror is chosen for reading the metadata, --location results shouldn't be mixed up with mirrors that provide a different version of metadata (newer or older).

praiskup commented 2 weeks ago

One more example:

[root@ip-172-30-2-126 /]# curl https://d2lzkl7pfhq30w.cloudfront.net/pub/fedora/linux/development/rawhide/Everything/x86_64/os/repodata/repomd.xml https://mirror.slu.cz/fedora/linux/development/rawhide/Everything/x86_64/os/repodata/repomd.xml 2>/dev/null | grep revision
  <revision>1725345239</revision>
  <revision>1725258838</revision>

These mirrors are obviously desynced. RPMs provided by the first mirror may not necessarily be part of the second one.
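The comparison done with curl and grep above can be sketched in Python as follows. This is only an illustration: `repomd_revision` and `in_sync` are hypothetical helpers, the sample documents are trimmed down to the one element of interest, and the revision is assumed to be an integer timestamp as in the output above.

```python
import xml.etree.ElementTree as ET


def repomd_revision(repomd_xml: str) -> int:
    """Return the <revision> value of a repomd.xml document as an integer."""
    root = ET.fromstring(repomd_xml)
    for element in root.iter():
        # Compare the local tag name so the check works with or without a namespace.
        if element.tag.rsplit("}", 1)[-1] == "revision":
            return int((element.text or "").strip())
    raise ValueError("repomd.xml has no <revision> element")


def in_sync(repomd_a: str, repomd_b: str) -> bool:
    """Two mirrors are considered in sync when their revisions are equal."""
    return repomd_revision(repomd_a) == repomd_revision(repomd_b)


# Trimmed-down sample documents carrying the two revisions shown above.
SAMPLE = '<repomd xmlns="http://linux.duke.edu/metadata/repo"><revision>{}</revision></repomd>'
cloudfront = SAMPLE.format(1725345239)
slu = SAMPLE.format(1725258838)
print(in_sync(cloudfront, slu))
```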

There might come other questions like

But these are orthogonal to this ticket; these problems happen from time to time and DNF should deal with them.

ppisar commented 2 weeks ago

DNF obtains a list of up-to-date mirrors from a mirror manager. If there are outdated mirrors on the list, it's a bug in the mirror manager. DNF relies on the list and exploits it for parallel fetching from multiple mirrors. If a download fails, DNF retries from another mirror.
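The retry behavior mentioned above can be pictured with a small Python sketch. It is an assumption-laden illustration, not librepo's algorithm: `fetch_with_fallback` is a hypothetical function, the downloader is injected as a callable, and the mirror names are made-up `.example` hosts.

```python
from typing import Callable, Iterable


def fetch_with_fallback(
    mirrors: Iterable[str],
    path: str,
    fetch: Callable[[str], bytes],
) -> tuple[str, bytes]:
    """Try each mirror in order and return (url, payload) for the first success."""
    errors = []
    for base in mirrors:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return url, fetch(url)
        except OSError as exc:  # urllib's URLError and HTTPError derive from OSError
            errors.append(f"{url}: {exc}")
    raise OSError("all mirrors failed: " + "; ".join(errors))


def fake_fetch(url: str) -> bytes:
    # Stand-in for a real downloader: only the second (made-up) mirror has the file.
    if url.startswith("https://mirror-b.example/"):
        return b"rpm payload"
    raise OSError("404 Not Found")


url, data = fetch_with_fallback(
    ["https://mirror-a.example/os/", "https://mirror-b.example/os/"],
    "Packages/b/bash.rpm",
    fake_fetch,
)
print(url)
```

A consumer that only receives a single URL from --location has no such list to fall back on.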

So yes, "dnf repoquery --location" can return the URL of a nonexistent document.

I think we could change "dnf repoquery --location" to always use the mirror it downloaded the repository from. The original URL is cached somewhere, I believe.
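A minimal Python sketch of that idea, assuming the URL of the downloaded repomd.xml is what gets cached: derive the repository base from it and build the package URL on the same mirror. `location_on_metadata_mirror` is a hypothetical name, and using mirror.slu.cz as the metadata source is purely an assumption for the example.

```python
REPOMD_SUFFIX = "repodata/repomd.xml"


def location_on_metadata_mirror(repomd_url: str, location_href: str) -> str:
    """Build a package URL on the same mirror the metadata was read from."""
    if not repomd_url.endswith(REPOMD_SUFFIX):
        raise ValueError(f"not a repomd.xml URL: {repomd_url}")
    # Strip the metadata suffix to recover the repository base URL.
    base = repomd_url[: -len(REPOMD_SUFFIX)]
    return base + location_href.lstrip("/")


# Assumption for illustration: metadata was read from the second mirror in the curl example.
cached_repomd = "https://mirror.slu.cz/fedora/linux/development/rawhide/Everything/x86_64/os/repodata/repomd.xml"
print(location_on_metadata_mirror(cached_repomd, "Packages/b/bash-5.2.32-1.fc41.x86_64.rpm"))
```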

But I don't believe that DNF should actively recheck that a given mirror contains the same revision of the repository. It would be too expensive. Or would you expect "dnf repoquery --location" to do a GET/HEAD request on every invocation?

praiskup commented 2 weeks ago

> I think we could change "dnf repoquery --location" to always use the mirror it downloaded the repository from. The original URL is cached somewhere, I believe.

That sounds better! By the way, how are the particular repositories picked (the one to read metadata and the one reported)? If there's no GET/HEAD checking, is there some intentional round-robin mechanism?

> Or would you expect "dnf repoquery --location" to do a GET/HEAD request on every invocation?

For particular RPMs? Probably not (even though our use case would be OK with that). For repomd.xml? Maybe. DNF could do something here for potential --location consumers: such a tool has no information about other mirrors, so it cannot implement a fallback itself.

ppisar commented 2 weeks ago

> how are the particular repositories picked (the one to read metadata and the one reported)? If there's no GET/HEAD checking, is there some intentional round-robin mechanism?

If I remember correctly, metadata are fetched from the first item on the list returned by the mirror manager. The mirror manager sorts the list based on the location of the client, excluding outdated mirrors. The mirror manager's reply also contains the current repository revision. DNF then checks that the mirror contains that revision. If it doesn't, the next mirror on the list is tried.
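The selection as recalled above could look roughly like this in Python. It is a sketch of the description only, not of librepo's implementation: `pick_metadata_mirror` is a hypothetical function, the revision lookup is injected as a callable, and the shortened mirror URLs and their revisions reuse the values from the curl example.

```python
from typing import Callable, Iterable, Optional


def pick_metadata_mirror(
    mirrors: Iterable[str],
    expected_revision: int,
    get_revision: Callable[[str], Optional[int]],
) -> Optional[str]:
    """Return the first mirror whose revision matches the expected one."""
    for mirror in mirrors:
        if get_revision(mirror) == expected_revision:
            return mirror
    # No mirror on the list carries the expected revision.
    return None


# Revisions from the curl example; the dict order stands in for the mirror list order.
revisions = {
    "https://mirror.slu.cz/fedora/linux": 1725258838,
    "https://d2lzkl7pfhq30w.cloudfront.net/pub/fedora/linux": 1725345239,
}
chosen = pick_metadata_mirror(list(revisions), 1725345239, revisions.get)
print(chosen)
```

Under this reading, the outdated mirror is skipped for metadata, which says nothing yet about which mirror a package URL is later built from.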

Regarding downloading packages, I don't know. I believe that the mirrors are tried in the same order as on the list. I have no idea if there is a kind of round robin employed. Maybe the list is already randomized by the mirror manager. But I really don't know. If you are interested, read librepo sources.

praiskup commented 2 weeks ago

The way you describe how the mirroring works, it seems robust. If DNF checks the revision, I'm curious how the problem could appear.

praiskup commented 1 week ago

A similar problem was reported by @kdudka for the epel-7 (yum-utils!) build chroot: occasionally, an outdated EPEL 7 mirror is chosen for loading metadata, and the corresponding RPMs are not available (e.g. failed attempts to install epel-rpm-macros-7.23 when epel-rpm-macros-7.38 should be installed).