anatol / pacoloco

Caching proxy server for Arch Linux pacman
MIT License
216 stars 30 forks source link

Breaking compatibility with old db? #86

Closed Focshole closed 1 year ago

Focshole commented 1 year ago

I just tried to update to the new master branch. I see those errors:

lug 29 15:03:13 testmachine pacoloco[1083687]: downloader.go:104: downloading https://mirror.ubrco.de/archlinux//core/os/x86_64/core.db
lug 29 15:03:13 testmachine pacoloco[1083687]: downloader.go:104: downloading https://mirror.ubrco.de/archlinux//multilib/os/x86_64/multilib.db
lug 29 15:03:13 testmachine pacoloco[1083687]: pacoloco.go:168: repo archlinux has no urls
lug 29 15:03:13 testmachine pacoloco[1083687]: downloader.go:104: downloading https://mirror.ubrco.de/archlinux//extra/os/x86_64/extra.db
lug 29 15:03:13 testmachine pacoloco[1083687]: pacoloco.go:270: serving cached file for archlinux/multilib/os/x86_64/multilib.db
lug 29 15:03:13 testmachine pacoloco[1083687]: pacoloco.go:270: serving cached file for archlinux/core/os/x86_64/core.db
lug 29 15:03:13 testmachine pacoloco[1083687]: pacoloco.go:270: serving cached file for archlinux/extra/os/x86_64/extra.db
lug 29 15:03:13 testmachine pacoloco[1083687]: pacoloco.go:168: unable to download file archlinux/multilib/os/x86_64/multilib.db.sig
lug 29 15:03:13 testmachine pacoloco[1083687]: pacoloco.go:168: unable to download file archlinux/extra/os/x86_64/extra.db.sig
lug 29 15:03:13 testmachine pacoloco[1083687]: pacoloco.go:168: unable to download file archlinux/core/os/x86_64/core.db.sig

Does the recent commits break compatibility with the older db? Is it correct to serve cached db files when they exist? Is there something that can be done to update the db?

anatol commented 1 year ago

The downloader functionality has changed a lot recently. It might affect the way prefetcher works.

As of your question Is it correct to serve cached db files when they exist? - for *.db filespacoloco checks upstream modification time, if the upstream has newer database it doesnloads it. Otherwise pacoloco serves locally cached database file.

Is there something that can be done to update the db?

Once upstream has a newer database file then pacoloco will download it. Let me know if you see different behaviour.

Focshole commented 1 year ago

Let me know if you see different behaviour.

The behaviour I see is that all downloads fail on the client, which then discards pacoloco for further downloads. Is there any migration that should be done on the db to work with the newer version?

anatol commented 1 year ago

It might be easier to debug the issue if you provide a way to reproduce the problem. What is the step by step instructions to get this problem?

Focshole commented 1 year ago

Suddenly today after a reboot it worked. I write here a small sample of logs that show the issue. It is much more a log readibility issue rather than a real issue. I restarted pacoloco after the update

lug 29 15:01:52 testmachine systemd[1]: Stopping Pacoloco caching proxy server...
lug 29 15:01:52 testmachine systemd[1]: pacoloco.service: Deactivated successfully.
lug 29 15:01:52 testmachine systemd[1]: Stopped Pacoloco caching proxy server.
lug 29 15:01:52 testmachine systemd[1]: pacoloco.service: Consumed 1min 16.964s CPU time.
lug 29 15:01:52 testmachine systemd[1]: Started Pacoloco caching proxy server.
lug 29 15:01:52 testmachine pacoloco[1083687]: pacoloco.go:98: Reading config file from /etc/pacoloco.yaml
lug 29 15:01:52 testmachine pacoloco[1083687]: prefetch.go:34: The prefetching routine will be run on .....
lug 29 15:01:52 testmachine pacoloco[1083687]: pacoloco.go:141: Starting server at port 61637...

Then the real issue warning came out:

lug 29 15:02:56 testmachine pacoloco[1083687]: downloader.go:338: repo archlinux has no urls
lug 29 15:02:56 testmachine pacoloco[1083687]: pacoloco.go:168: repo archlinux has no urls

Weird things happened afterwards as I requested a db update on a pacman client:


lug 29 15:02:56 testmachine pacoloco[1083687]: downloader.go:104: downloading https://mirror.ubrco.de/archlinux//multilib/os/x86_64/multilib.db
lug 29 15:02:56 testmachine pacoloco[1083687]: downloader.go:104: downloading https://mirror.ubrco.de/archlinux//extra/os/x86_64/extra.db
lug 29 15:02:56 testmachine pacoloco[1083687]: downloader.go:104: downloading https://mirror.ubrco.de/archlinux//core/os/x86_64/core.db
lug 29 15:02:58 testmachine pacoloco[1083687]: pacoloco.go:270: serving cached file for archlinux/core/os/x86_64/core.db
lug 29 15:02:58 testmachine pacoloco[1083687]: pacoloco.go:270: serving cached file for archlinux/multilib/os/x86_64/multilib.db

An url for download was available even if there were no urls for that repository. How could this be possible?

The same thing happened with the other dbs:


lug 29 15:03:13 testmachine pacoloco[1083687]: downloader.go:104: downloading https://mirror.ubrco.de/archlinux//core/os/x86_64/core.db
lug 29 15:03:13 testmachine pacoloco[1083687]: downloader.go:104: downloading https://mirror.ubrco.de/archlinux//multilib/os/x86_64/multilib.db
lug 29 15:03:13 testmachine pacoloco[1083687]: pacoloco.go:168: repo archlinux has no urls
lug 29 15:03:13 testmachine pacoloco[1083687]: downloader.go:104: downloading https://mirror.ubrco.de/archlinux//extra/os/x86_64/extra.db
lug 29 15:03:13 testmachine pacoloco[1083687]: pacoloco.go:270: serving cached file for archlinux/multilib/os/x86_64/multilib.db
lug 29 15:03:13 testmachine pacoloco[1083687]: pacoloco.go:270: serving cached file for archlinux/core/os/x86_64/core.db
lug 29 15:03:13 testmachine pacoloco[1083687]: pacoloco.go:270: serving cached file for archlinux/extra/os/x86_64/extra.db

But, something even harder to understand is how could have possibly completed a prefetch with no urls! Here are the logs:

lug 30 03:00:00 testmachine pacoloco[1091120]: prefetch.go:258: Starting prefetching routine...
lug 30 03:00:00 testmachine pacoloco[1091120]: prefetch.go:182: Cleaning the db...
lug 30 03:00:00 testmachine pacoloco[1091120]: prefetch.go:216: Db cleaned.
lug 30 03:00:00 testmachine pacoloco[1091120]: prefetch.go:263: Starting prefetching packages...
lug 30 03:00:00 testmachine pacoloco[1091120]: repo_db_mirror.go:234: An error occurred for mirror {/repo/archlinux/core/os/x86_64/core.db archlinux 2023-07-29 15:03:13.320854111 +0200 +0200} :repo archlinux has no urls
lug 30 03:00:00 testmachine pacoloco[1091120]: repo_db_mirror.go:234: An error occurred for mirror {/repo/archlinux/extra/os/x86_64/extra.db archlinux 2023-07-29 15:03:13.323998469 +0200 +0200} :repo archlinux has no urls
lug 30 03:00:00 testmachine pacoloco[1091120]: repo_db_mirror.go:234: An error occurred for mirror {/repo/archlinux/community/os/x86_64/community.db archlinux 2023-07-16 06:00:01.198244598 +0200 +0200} :repo archlinux has no urls
lug 30 03:00:00 testmachine pacoloco[1091120]: downloader.go:104: downloading https://mirror.ubrco.de/archlinux//multilib/os/x86_64/multilib.db
lug 30 03:00:00 testmachine pacoloco[1091120]: repo_db_mirror.go:159: Extracting /opt/pcache/tmp-db/multilib.db...
lug 30 03:00:00 testmachine pacoloco[1091120]: repo_db_mirror.go:171: Parsing /opt/pcache/tmp-db/multilib.db.tar...
lug 30 03:00:00 testmachine pacoloco[1091120]: repo_db_mirror.go:173: Parsed /opt/pcache/tmp-db/multilib.db.tar.
lug 30 03:00:00 testmachine pacoloco[1091120]: repo_db_mirror.go:180: Adding entries to db...
lug 30 03:00:00 testmachine pacoloco[1091120]: repo_db_mirror.go:216: Added entries to db.
lug 30 03:00:00 testmachine pacoloco[1091120]: downloader.go:104: downloading https://mirror.ubrco.de/archlinux//multilib/os/x86_64/lib32-curl-8.2.1-1-x86_64.pkg.tar.zst
lug 30 03:00:00 testmachine pacoloco[1091120]: prefetch.go:241: Successfully prefetched lib32-curl-x86_64 package
lug 30 03:00:00 testmachine pacoloco[1091120]: downloader.go:104: downloading https://mirror.ubrco.de/archlinux//multilib/os/x86_64/lib32-curl-8.2.1-1-x86_64.pkg.tar.zst.sig
lug 30 03:00:00 testmachine pacoloco[1091120]: prefetch.go:239: Successfully prefetched lib32-curl-x86_64 signature
....
....
lug 30 03:00:16 testmachine pacoloco[1091120]: prefetch.go:265: Finished prefetching packages!
lug 30 03:00:16 testmachine pacoloco[1091120]: prefetch.go:266: Finished prefetching routine!
lug 30 03:00:16 testmachine pacoloco[1091120]: prefetch.go:45: On 2023-07-31 02:59:59.999999892 +0200 CEST m=+128737.435440949 the prefetching routine will be run again

Now, after a reboot I use reflector to update mirrorlists. I noticed that reflector failed on the previous run, thus providing an empty mirrorlist (my fault ofc). After the reboot, everything went smoothly, no issues whatsoever.

Imho I'd suggest to replace the repo _name_ has no urls error message to mirrorlist for repo _name_ is empty or unreadeable if a mirrorlist is being used.

It is just a minor issue but it would have helped a lot finding the issue earlier imho.

anatol commented 1 year ago

To prevent issues with invalid mirrolist I thing we need to update getMirrorlistURLs() and change the code to

urls, err := parseMirrorlistURLs(file)
        if err == nil {
                r.URLs = urls
        }
        if len(urls) == 0 {
          return nil, fmt.Errorf("mirrolist file %v contains no urls")
        }
        return urls, err