nroi / flexo

a central pacman cache
MIT License

Local packages are newer than in a repository #73

Open Zebradil opened 2 years ago

Zebradil commented 2 years ago

After switching to Flexo, I started seeing warning messages like these during system upgrades:

warning: python-more-itertools: local (8.10.0-1) is newer than community (8.9.0-1)
warning: python-poetry: local (1.1.10-1) is newer than community (1.1.9-1)
warning: python-traitlets: local (5.1.0-1) is newer than community (5.0.5-2)

The list of warnings changes if I re-run an upgrade. Sometimes it mentions a few packages, sometimes it mentions a lot of packages. I also noticed that so far those warnings appear only for a particular database (extra or community).

As far as I understand, there is currently no way to tell Flexo to prioritize more up-to-date mirrors. I can manually select mirrors for the predefined list, but I'm concerned that this will require manual maintenance in the future.

Does it make sense to amend the mirror selection logic so that it takes delay values into account?

Update: I checked the mirror (https://mirror.moson.org/arch/) that was used last time and emitted the warnings for the community database. Its delay value is 0:04, which is already quite small. However, I'm not sure that this particular mirror was used to download the community database. In the logs I see an attempt to download the community.db.sig file from it, which is why I assumed it was also used to download the database file.

Any ideas what could be the problem?

Zebradil commented 2 years ago

For now, I have set predefined mirrors generated with reflector -f5 --delay 0.1 --sort age -p https. I'll try this for a while to see if there are any issues.

nroi commented 2 years ago

Thanks for reporting this!

Does it make sense to amend the mirror selection logic so that it takes delay values into account?

I think so. Flexo already makes use of the score attribute to filter out unusable mirrors, but it seems that this does not always exclude out-of-date mirrors. For example, I have just now (2021-09-26T09:10:00Z) examined the mirror list from https://archlinux.org/mirrors/status/json/ and found the following mirror:

{
    "url": "https://arlm.tyzoid.com/",
    "protocol": "https",
    "last_sync": "2021-09-26T03:08:30Z",
    "completion_pct": 0.759493670886076,
    "delay": 114,
    "duration_avg": 0.50416832168897,
    "duration_stddev": 0.2286094097319996,
    "score": 1.0065184574820543,
    "active": true,
    "country": "United States",
    "country_code": "US",
    "isos": true,
    "ipv4": true,
    "ipv6": true,
    "details": "https://archlinux.org/mirrors/arlm.tyzoid.com/1028/"
}

Notice that the score is very decent (lower is better), but the last_sync is already about six hours old, much older than that of most mirrors. So this is definitely something that needs to be improved in Flexo: it should not be using mirrors that are far more out of date than most other mirrors.
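To illustrate the idea of filtering on last_sync, here is a minimal Python sketch. It is not Flexo's implementation: the function names, the one-hour threshold, and the second sample mirror (mirror.example.org) are assumptions made up for this example. It compares each mirror's last_sync against the freshest mirror in the list and drops those that lag too far behind, as well as entries that have no last_sync value.

```python
from datetime import datetime, timedelta, timezone


def parse_sync(ts):
    """Parse a last_sync value such as '2021-09-26T03:08:30Z' into an aware datetime."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def fresh_mirrors(mirrors, max_lag=timedelta(hours=1)):
    """Keep mirrors whose last_sync is within max_lag of the freshest mirror.

    max_lag is a hypothetical threshold chosen for illustration only.
    """
    # Entries without a last_sync value cannot be compared, so drop them.
    synced = [m for m in mirrors if m.get("last_sync")]
    if not synced:
        return []
    newest = max(parse_sync(m["last_sync"]) for m in synced)
    return [m for m in synced if newest - parse_sync(m["last_sync"]) <= max_lag]


# The first entry is taken from the JSON above; the other two are invented.
sample = [
    {"url": "https://arlm.tyzoid.com/", "last_sync": "2021-09-26T03:08:30Z", "score": 1.0065184574820543},
    {"url": "https://mirror.example.org/", "last_sync": "2021-09-26T08:55:00Z", "score": 1.2},
    {"url": "https://unsynced.example.org/", "last_sync": None, "score": 0.9},
]

selected_urls = [m["url"] for m in fresh_mirrors(sample)]
```

With this sample, the mirror with the good score but a six-hour-old last_sync is excluded, while the recently synced one is kept. Measuring lag relative to the freshest mirror rather than the current time avoids rejecting every mirror during periods when no new packages have been published.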

I checked the mirror (https://mirror.moson.org/arch/) that was used last time and emitted the warnings for the community database.

According to https://archlinux.org/mirrors/moson.org/1566/, this mirror has been consistently up to date over the last few days, so I'm not entirely sure what to make of this. Maybe Flexo used another mirror to serve the database file. In any case, using the last_sync attribute for mirror selection is a good idea, so I'm going to implement this and see if it solves the issue.

Zebradil commented 2 years ago

Thank you for the great tool and for the support!

According to https://archlinux.org/mirrors/moson.org/1566/, this mirror has been consistently up to date over the last few days, so I'm not entirely sure what to make of this. Maybe Flexo used another mirror to serve the database file.

Yes, at the info level the logs don't explicitly say which mirror a particular *.db file was downloaded from. But we can see attempts to download *.db.sig files from particular mirrors. That's why I made the assumption regarding the source of community.db. In this context I have a question: shouldn't *.db.sig files be downloaded from the same mirror as the corresponding *.db files? Otherwise, I can imagine a situation where two mirrors are not in sync, and getting *.db from one mirror and *.db.sig from the other leads to a signature validation error.

nroi commented 2 years ago

In this context I have a question: shouldn't *.db.sig files be downloaded from the same mirror as the corresponding *.db files? Otherwise, I can imagine a situation where two mirrors are not in sync, and getting *.db from one mirror and *.db.sig from the other leads to a signature validation error.

Yes, that's correct, but it's not a problem at the moment because the mirrors don't provide db.sig files anyway. With or without Flexo, pacman just receives 404s when it attempts to fetch those files and ignores them.

But I need to put some thought into how to improve the downloads of those database files to avoid those kinds of errors. I currently imagine something like a "mirror-stickiness" for database files where it chooses one primary mirror to fetch all database files (and db.sig files), and then doesn't change the mirror unless there's a good reason to do so.
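As a rough illustration of what such "mirror-stickiness" could look like, here is a small Python sketch. It is purely hypothetical and not taken from Flexo: the class name, its methods, and the example.org URLs are invented for this example. All *.db and *.db.sig requests go to one primary mirror, and the primary changes only after a failure is reported for it.

```python
class StickyMirrorSelector:
    """Hypothetical selector that pins database files to one primary mirror."""

    def __init__(self, mirrors):
        if not mirrors:
            raise ValueError("at least one mirror is required")
        self.mirrors = list(mirrors)
        self.primary = self.mirrors[0]

    @staticmethod
    def is_database_file(path):
        # Treat both the database and its signature as "database files".
        return path.endswith((".db", ".db.sig"))

    def mirror_for(self, path, fallback):
        """Database files use the primary mirror; other files use `fallback`."""
        return self.primary if self.is_database_file(path) else fallback

    def report_failure(self, mirror):
        """Move to the next mirror only if the current primary has failed."""
        if mirror == self.primary and len(self.mirrors) > 1:
            idx = self.mirrors.index(mirror)
            self.primary = self.mirrors[(idx + 1) % len(self.mirrors)]


selector = StickyMirrorSelector(["https://a.example.org/", "https://b.example.org/"])
db_mirror = selector.mirror_for("community/os/x86_64/community.db", "https://b.example.org/")
sig_mirror = selector.mirror_for("community/os/x86_64/community.db.sig", "https://b.example.org/")
```

Here db_mirror and sig_mirror are the same mirror, so a database and its signature can never come from two mirrors that are out of sync with each other. What counts as "a good reason" to switch is reduced to an explicit failure report in this sketch; a real implementation would likely also consider staleness.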

nroi commented 2 years ago

I was also able to reproduce this issue by running pacman -Syu twice. I guess I never had this issue before because I update my system only once a week or so.