anatol / pacoloco

Caching proxy server for Arch Linux pacman
MIT License
216 stars 30 forks

add doc about CacheServer #105

Open solsticedhiver opened 6 months ago

chennin commented 6 months ago

What's the benefit of using CacheServer instead? I'd appreciate some words in the doc on why one might use CacheServer or Server.

If a client requests a package via CacheServer, I assume pacoloco will still go get it and download it and return it? So there would be no difference from the client POV?

My setup is 3 Arch PCs that are not always on, and another (docker container running pacoloco) that is always on. The 3 clients have pacoloco as a Server. I'm trying to figure out if I should use CacheServer instead.

solsticedhiver commented 6 months ago

Well, it's a pacman.conf option, so I would have expected anyone wondering about it to look at the pacman.conf man page to find out about it.

I can add something like "Please refer to the pacman.conf man page to learn more about this option", if you really want to...

And if you want to know, the benefit is that if the cache server fails, pacman uses the main mirror specified in Server instead. And it always downloads the db and sig files from the mirror in Server, I think. So you'd better use the same mirror in pacman and pacoloco...
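As a rough sketch of the setup described above, a repository section could look like the fragment below. The host name `pacoloco.lan`, the pacoloco repo name `archlinux`, and the mirror URL are placeholders rather than values from this thread, and 9129 is assumed to be the port pacoloco listens on. `CacheServer` also needs a pacman release that supports the option (6.1 or later, as far as I know).

```ini
# /etc/pacman.conf (fragment, illustrative values only)
[core]
# Tried first for packages; per the man page it is not used for database files.
CacheServer = http://pacoloco.lan:9129/repo/archlinux/$repo/os/$arch
# Regular mirror: serves the db and is the fallback when the cache server fails.
# Ideally this is the same mirror pacoloco itself is configured to use.
Server = https://geo.mirror.pkgbuild.com/$repo/os/$arch
```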

chennin commented 6 months ago

But AIUI if a regular Server fails, then pacman also tries the next one, which is why we can specify multiple mirrors in pacman.conf. And the db/sig files are small, right?

When setting up clients to use pacoloco, why would I prefer CacheServer over Server? Why do we want to add this to the pacoloco documentation? What does the pacoloco project recommend? That's what I think is valuable to add to the documentation, not only the fact that it's possible.

solsticedhiver commented 6 months ago

Well, a server specified as CacheServer is not removed from the server pool for 404 download errors (as the man page says).

If you use a cache server and the file is not there, pacman falls back to the main mirror instead. That's what it is meant for.

This is less relevant for pacoloco, as pacoloco will download the missing file and never return a 404.

So this is only useful if your pacoloco is on your LAN, like mine: an upgrade with pacman -Syu will not fail when you are away from your LAN, and you don't have to change the Server option.

But you might find other scenarios.

krameler commented 6 months ago

> When setting up clients to use pacoloco, why would I prefer CacheServer over Server?

I agree that that is a question that should be answered in the doc when presenting this alternative.

What isn't in the man pages is that a CacheServer still gets dropped like a normal Server if pacman encounters a hard error such as a failure to connect. Pacman also doesn't seem to print any errors regarding a CacheServer, even when removing it.

I'd suggest roughly: Set everything up using the Server option so you can use the errors for troubleshooting. And when everything is configured correctly, switch to CacheServer for robustness and error-free output for when Pacoloco isn't available (because you're not at home or your server is down).
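The two stages suggested above might look roughly like this. The host name, pacoloco repo name, and mirror URL are again hypothetical placeholders; only the `Server` and `CacheServer` directives themselves come from pacman.conf.

```ini
# /etc/pacman.conf (fragment, illustrative values only)
[extra]
# Stage 1 (troubleshooting): pacoloco is a regular Server listed first,
# so pacman reports any errors it hits while talking to it.
Server = http://pacoloco.lan:9129/repo/archlinux/$repo/os/$arch
Server = https://geo.mirror.pkgbuild.com/$repo/os/$arch

# Stage 2 (once everything works): replace the first Server line above
# with the CacheServer line below and keep the regular mirror as Server.
#CacheServer = http://pacoloco.lan:9129/repo/archlinux/$repo/os/$arch
```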

> This is less relevant for pacoloco, as pacoloco will download the missing file and never return a 404.

I actually see two cases where pacoloco sends a 404:

  1. If all of the mirrors Pacoloco is using are lagging behind the mirror the client uses for the db and aren't yet able to serve the file. Which should be a rare case, I think.
  2. If the client requests a repository that Pacoloco isn't configured for. Which might be a configuration mistake that needs to be rectified.

But one could also set Pacoloco as a blanket CacheServer for everything without it getting removed because it doesn't serve everything. This might simplify pacman configuration.
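One way the blanket idea could be sketched, assuming the stock Arch layout where every official repo section pulls in `/etc/pacman.d/mirrorlist` via `Include`, and assuming `CacheServer` is accepted in an included file the same way `Server` is (not verified in this thread). Host name, repo name, and mirror URL are placeholders.

```ini
# /etc/pacman.d/mirrorlist (fragment, illustrative values only)
# Shared by every repo section that has
#   Include = /etc/pacman.d/mirrorlist
# A 404 from pacoloco for a repo it does not serve should not remove it
# from the pool; pacman then falls back to the Server entries below.
CacheServer = http://pacoloco.lan:9129/repo/archlinux/$repo/os/$arch
Server = https://geo.mirror.pkgbuild.com/$repo/os/$arch
```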

solsticedhiver commented 6 months ago

I have just added the minimum for now.

I tried to write something, but I don't really know what to write without it turning into a full-length paragraph.

Maybe getting the pacman devs to be more explicit about what CacheServer does in the pacman.conf man page would be better?

krameler commented 6 months ago

The problem isn't that the man page isn't informative enough, it's that a user of pacoloco would want to know how this setting interacts with pacoloco, which the pacman devs won't care about.

And I don't think it's bad to have a full paragraph to properly present an option.

solsticedhiver commented 6 months ago

Changed the wording and added a note about pacoloco's log.

Please give me an example of what you want to see, if you want more explanation.

anatol commented 5 months ago

Another difference between CacheServer and Server is that the former is not used for database files. If CacheServer is used for pacoloco, then database files will always be fetched from the upstream servers. This reduces cacheability somewhat.

balki commented 1 month ago

I think the idea of pacman's CacheServer is as follows:

Server: Slower but trustworthy
CacheServer: Faster but less trustworthy

Since the db is not downloaded from the CacheServer, the client would not miss any critical security update even if the CacheServer is not up to date or is malicious. This is useful, for example, when a cache server just hosts the pacman cache of another Arch Linux machine.

I don't think this option makes sense for self-hosted pacoloco as it actively fetches all the files.