openSUSE / libsolv

Library for solving packages and reading repositories
http://en.opensuse.org/openSUSE:Libzypp_satsolver
Other
509 stars 151 forks source link

addfileprovides_queue: return queues containing all added provides #531

Closed kontura closed 11 months ago

kontura commented 1 year ago

Change addfileprovides_queue() to return queues containing all added provides.

(They can be stored in repository meta section REPOSITORY_ADDEDFILEPROVIDES like usual: https://github.com/openSUSE/libsolv/blob/master/examples/solv/fileprovides.c#L73)

Before this change it returned file dependencies and storing those can lead to problems:

  1. first load metadata without filelists
  2. call pool_addfileprovides_queue(...)
  3. and store queues in REPOSITORY_ADDEDFILEPROVIDES
  4. then on the next run filelists are loaded but during adding fileprovides none are added because stored file dependecies match required file dependencies.

This can be solved by returning added file provides instead. This also matches with the REPOSITORY_ADDEDFILEPROVIDES name.

This is changing API. I am not sure if it is a good solution to the problem.

mlschroe commented 11 months ago

I don't really understand the problem you're trying to solv. The REPOSITORY_ADDEDFILEPROVIDES list should contain all the file list provides that have been added to the package provides. I.e. they need to include all the old entries.

I don't understand why dnf5 repoclosure does not work correctly in the dnf5 issue. If a file is in the addedfileprovides list of a repository, it has been added to the provides of the packages that contain it and thus should not be reported as missing. So the old on-disk solv file seems to be in an inconsistent state.

(Btw you shouldn't need to mess with the considered map in make_provides_ready() as you set the POOL_FLAG_WHATPROVIDESWITHDISABLED flag. Just saying...)

kontura commented 11 months ago

I don't really understand the problem you're trying to solv. The REPOSITORY_ADDEDFILEPROVIDES list should contain all the file list provides that have been added to the package provides. I.e. they need to include all the old entries.

It is quite possible we have somehow messed up the solvfile which is causing the issues. However the way I understand the problem and what I am trying to solve is this: First we load just primary.xml, for example:

<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://linux.duke.edu/metadata/common" xmlns:rpm="http://linux.duke.edu/metadata/rpm" packages="4">
<package type="rpm">
  <name>agua</name>
  <arch>x86_64</arch>
  <version epoch="0" ver="1.0" rel="1.fc29"/>
  <checksum type="sha256" pkgid="YES">ade558f47b9c4ca4de9a53d90c48a84ef6c6007908e86bfb71f2e576b055ddb9</checksum>
  <summary>Made up package</summary>
  <description>planta description</description>
  <packager></packager>
  <url>None</url>
  <time file="1690201478" build="1690201478"/>
  <size package="6627" installed="0" archive="252"/>
  <location href="x86_64/agua-1.0-1.fc29.x86_64.rpm"/>
  <format>
    <rpm:license>GPLv3+</rpm:license>
    <rpm:vendor></rpm:vendor>
    <rpm:group>Unspecified</rpm:group>
    <rpm:buildhost>7e6d7c9478f2</rpm:buildhost>
    <rpm:sourcerpm>agua-1.0-1.fc29.src.rpm</rpm:sourcerpm>
    <rpm:header-range start="4504" end="6525"/>
    <rpm:provides>
      <rpm:entry name="agua" flags="EQ" epoch="0" ver="1.0" rel="1.fc29"/>
      <rpm:entry name="agua(x86-64)" flags="EQ" epoch="0" ver="1.0" rel="1.fc29"/>
    </rpm:provides>
  </format>
</package>
<package type="rpm">
  <name>planta</name>
  <arch>x86_64</arch>
  <version epoch="0" ver="1.0" rel="1.fc29"/>
  <checksum type="sha256" pkgid="YES">7686cdfd416a33058736d4887ef7632dd653133f1643efe90967866c05849a5d</checksum>
  <summary>Made up package</summary>
  <description>planta description</description>
  <packager></packager>
  <url>None</url>
  <time file="1690201478" build="1690201478"/>
  <size package="6146" installed="0" archive="124"/>
  <location href="x86_64/planta-1.0-1.fc29.x86_64.rpm"/>
  <format>
    <rpm:license>GPLv3+</rpm:license>
    <rpm:vendor></rpm:vendor>
    <rpm:group>Unspecified</rpm:group>
    <rpm:buildhost>7e6d7c9478f2</rpm:buildhost>
    <rpm:sourcerpm>planta-1.0-1.fc29.src.rpm</rpm:sourcerpm>
    <rpm:header-range start="4504" end="6097"/>
    <rpm:provides>
      <rpm:entry name="planta" flags="EQ" epoch="0" ver="1.0" rel="1.fc29"/>
      <rpm:entry name="planta(x86-64)" flags="EQ" epoch="0" ver="1.0" rel="1.fc29"/>
    </rpm:provides>
    <rpm:requires>
      <rpm:entry name="/a/path/to/water"/>
    </rpm:requires>
  </format>
</package>
</metadata>

During loading we call addfileprovides_queue and it fills the queue with /a/path/to/water which is stored in REPOSITORY_ADDEDFILEPROVIDES. (From my point of view this is a file dependency not a provide. Because only planta requires the file but nothing provides it.)

Then on the next run we also load filelists:

<?xml version="1.0" encoding="UTF-8"?>
<filelists xmlns="http://linux.duke.edu/metadata/filelists" packages="4">
<package pkgid="ade558f47b9c4ca4de9a53d90c48a84ef6c6007908e86bfb71f2e576b055ddb9" name="agua" arch="x86_64">
  <version epoch="0" ver="1.0" rel="1.fc29"/>
  <file>/a/path/to/water</file>
</package>
<package pkgid="7686cdfd416a33058736d4887ef7632dd653133f1643efe90967866c05849a5d" name="planta" arch="x86_64">
  <version epoch="0" ver="1.0" rel="1.fc29"/>
</package>

Here is the missing file provide but once addfileprovides_queue is called again it doesn't add it. I think because here its already included. Which results in not satisfied dependency breaking the repoclosure.

(Btw you shouldn't need to mess with the considered map in make_provides_ready() as you set the POOL_FLAG_WHATPROVIDESWITHDISABLED flag. Just saying...)

Thanks, I will look into it.

mlschroe commented 11 months ago

So it may happen that the full filelist is not available when you call pool_addfileprovides()? That won't work. You have to options:

1) do not set REPOSITORY_ADDEDFILEPROVIDES in that case as the input data was not complete.

2) set POOL_FLAG_ADDFILEPROVIDESFILTERED when creating the pool. This will make pool_addfileprovides only add the filtered file provides of the primary file, so the complete file list is not needed. It will also setup the provides data so that the entries not included in the primary file are added lazily on demand.

Option 2 makes sense if the filelist is only needed in rare cases. I don't know if this is true for your use cases.

j-mracek commented 11 months ago

2. FILEPROVIDESFILTERED when creating the pool. This will make pool_addfileprovides only add the filtered file provides of the primary file, so the complete file list is not needed. It will also setup the provides data so that the entries not included in the primary file are added lazily on demand.

May I ask you what you mean by on demand? When transaction fails because file provide not available or the demand is triggered by API call?

Additionally I have to mention that DNF5 does not even download file lists by default. The download of file lists is optional and configurable.

mlschroe commented 11 months ago

On demand means when libsolv needs to know what packages contain the file provides. I.e. when it is asked to resolve a file dependency.

The same mechanism is used when something like foo > 2.3 is resolved, it first checks if the result is cached and if not it goes through the packages providing 'foo' and matches the version part.

We could do that all the time for file provides, but it is much faster to first go through all packages and collect the needed file dependencies and then do one search through the filelist. That's exactly what pool_addfileprovides() does.

Now if only a few packages contain file dependencies outside of the primary data, it makes sense to use a hybrid scheme: resolve all file dependencies from primary right away and do searches for the missing ones. That way the complete filelist is not needed most of the time.

This can be used to solve the problem you have, because in this mode it does not matter if the complete filelist is available or not when pool_addfileprovides() is called.

j-mracek commented 11 months ago

@mlschroe Thank you for the nice explanation

kontura commented 11 months ago

Yes, thanks for the help! I made a dnf5 PR: https://github.com/rpm-software-management/dnf5/pull/777 to add the POOL_FLAG_ADDFILEPROVIDESFILTERED.