coreos / rpm-ostree

⚛📦 Hybrid image/package system with atomic upgrades and package layering
https://coreos.github.io/rpm-ostree

Repo priority ignored/inconsistent with dnf #5031

Open mtalexan opened 1 month ago

mtalexan commented 1 month ago

Describe the bug

When multiple *.repo files are provided and some include the priority= field, the priority value is either completely ignored or handled inconsistently with how dnf specifies it when processing packages: from the treefile during rpm-ostree compose tree.

Reproduction steps

  1. Create your own RPM repo
  2. Pick a package for which the standard public repos offer multiple versions, choose a version older than any currently available there (e.g. a kernel from update-archives that's no longer present in update, without including update-archives in your treefile), and add it to your custom RPM repo
  3. Create a *.repo file for your custom RPM repo with priority=1 (the highest)
  4. Verify that your *.repo file for the custom RPM repo works by running dnf provides for the package you added to it
  5. Get a working treefile configuration and ensure it builds successfully
  6. Make sure any lockfiles are removed/cleaned
  7. Add the custom *.repo file to it, and the matching string to the repos: key in the treefile
  8. Add the package name to the packages: key of the treefile for the RPM you have in your custom repo
  9. Do the rpm-ostree compose tree --downloadonly and check the version of the package that was selected/downloaded
  10. Modify the treefile to explicitly select the exact version of the RPM in your custom repo
  11. Make sure any lockfiles are removed/cleaned
  12. Do the rpm-ostree compose tree --downloadonly and check the version of the package that was selected/downloaded

Expected behavior

The version of the RPM present in the custom RPM repo is found in steps 9 and 12

Actual behavior

The version of the RPM found in step 9 is the latest found across all repos, not the version found in the highest priority repo.
Step 12 may or may not work; it seems to depend on some arbitrary sorting/searching order of the repos that is unaffected by their priority= settings.

System details

Additional information

Per the dnf documentation and implementation, priority= is an optional field with lower numbers being greater priority. A package search effectively searches all the RPM repos, but only pays attention to the returned list(s) from the greatest priority (lowest priority number) repo(s) that have any results, ignoring all other repo results. If multiple results are returned from the greatest priority repo(s), the "latest"/"best" match is selected from that returned list only.

For example, if two repos have priority 1, and one repo has priority 90, and a search for kernel is performed, all three repos are queried. If the latest version available on either of the priority 1 repos is 6.9.4, but the priority 90 repo has version 6.9.11, the 6.9.4 version from the priority 1 repos is considered the latest/best: because the priority 1 repos returned results matching the search, the results from the priority 90 repo are completely ignored.
Similarly, if a version query search is performed, that version query search occurs on each of the repos. If only the priority 90 repo has results matching the version query, it doesn't matter whether or not the higher priority repos have packages matching the provider name, they had no results matching the version query limitations.
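
The selection rule described above can be sketched in Python. This is only a model of the behavior as this comment describes it, not dnf's or rpm-ostree's actual code. The names Pkg and select, the repo names, and the extra 6.9.3 version are hypothetical, and versions are simplified to integer tuples rather than real RPM epoch:version-release comparison.

```python
from dataclasses import dataclass

DEFAULT_PRIORITY = 99  # dnf's default when a repo sets no priority=


@dataclass(frozen=True)
class Pkg:
    name: str
    version: tuple  # simplified, e.g. (6, 9, 4); not real RPM version comparison
    repo: str


def select(pkgs, priorities, name, version=None):
    """Pick one package using the priority rule described in the comment above."""
    # Query every repo, applying the optional exact-version constraint.
    matches = [
        p for p in pkgs
        if p.name == name and (version is None or p.version == version)
    ]
    if not matches:
        return None
    # Lower number means greater priority; keep only results from the
    # greatest-priority repo(s) that returned anything at all.
    best = min(priorities.get(p.repo, DEFAULT_PRIORITY) for p in matches)
    candidates = [
        p for p in matches
        if priorities.get(p.repo, DEFAULT_PRIORITY) == best
    ]
    # Among those, take the latest version.
    return max(candidates, key=lambda p: p.version)


# Hypothetical repos mirroring the example: two at priority 1, one at 90.
priorities = {"custom-a": 1, "custom-b": 1, "public": 90}
pkgs = [
    Pkg("kernel", (6, 9, 4), "custom-a"),
    Pkg("kernel", (6, 9, 3), "custom-b"),
    Pkg("kernel", (6, 9, 11), "public"),
]

latest = select(pkgs, priorities, "kernel")              # unversioned query
pinned = select(pkgs, priorities, "kernel", (6, 9, 11))  # exact-version query
print(latest)
print(pinned)
```

Under this model the unversioned query resolves to 6.9.4 from a priority 1 repo, while the exact-version query for 6.9.11 resolves to the priority 90 repo because no other repo returned a match.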

mtalexan commented 1 month ago

A specific example:

The public NVIDIA CUDA repo only lists the latest 2 versions of their nvidia-driver and nvidia-driver-cuda packages in the repomd.xml, even though they retain older versions as well.

If you follow their instructions for adding the CUDA repo you'll find yourself directed here for F39 https://developer.download.nvidia.com/compute/cuda/repos/fedora39/x86_64/

If you add the recommended repo file for F39 to a vanilla Fedora instance, and query for the nvidia-driver-cuda providers:

$ dnf provides --repo=cuda-fedora39-x86_64 'nvidia-driver-cuda'
cuda-fedora39-x86_64                                                                                                                                              284 kB/s | 366 kB     00:01    
Last metadata expiration check: 0:00:01 ago on Mon 29 Jul 2024 04:28:35 PM EDT.
nvidia-driver-cuda-3:555.42.02-1.fc39.x86_64 : CUDA integration for nvidia-driver
Repo        : nvidia-cuda
Matched from:
Provide    : nvidia-driver-cuda = 3:555.42.02-1.fc39

nvidia-driver-cuda-3:555.42.06-1.fc39.x86_64 : CUDA integration for nvidia-driver
Repo        : nvidia-cuda
Matched from:
Provide    : nvidia-driver-cuda = 3:555.42.06-1.fc39

If you look at the actual folder the repo points to, you'll see there are also some older versions available, like nvidia-driver-cuda-550.54.14-1.fc39.x86_64.rpm. If you manually download this older RPM and its dependencies, create a local repo to contain them, and then include a *.repo file for your local repo in your rpm-ostree treefile config, you should be able to find that older version by specifying its exact name or version in your treefile:

packages:
  - nvidia-driver-cuda-550.54.14-1.fc39

or

packages:
  - "'nvidia-driver-cuda = 3:550.54.14'"

I'm finding that you not only can't find that older RPM from the local repo, but even if you set the priority of the local repo to be greater in the .repo file than the public repos, you still can't get it. To double check, I even copied that same local repo .repo file into my vanilla Fedora instance that already has the cuda *.repo file, and confirmed that a dnf provides 'nvidia-driver-cuda = 3:550.54.14' finds it from my local repo successfully.

cgwalters commented 1 week ago

There may be bugs here indeed. My recommendation though is to use repo-packages which I think has a nicer UX and is definitely tested and more widely used.

(Also, this is something that ideally we drive into dnf...once it gains some sort of actual declarative package input format)