anatol / pacoloco

Caching proxy server for Arch Linux pacman
MIT License

Only one db file can exist for a repo with multiple arches, and incorrect db would be sent to client #110

Closed 7Ji closed 2 months ago

7Ji commented 2 months ago

When a repo provides multiple architectures (e.g. archlinuxcn, arch4edu, Arch Linux ARM's official repos, and my own 7Ji repo) and uses the same db name for each of them, pacoloco has only one path for the db file. For example, /var/cache/pacoloco/pkgs/archlinuxarm/core.db is used for both aarch64/core/core.db and armv7h/core/core.db, and it is impossible to tell which architecture the cached file belongs to.
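To make the collision concrete, here is a minimal Python sketch that models the behaviour described above. It is not pacoloco's actual code: the `cache_path` helper and the rule "keep only the file name under the repo directory" are assumptions inferred from the paths quoted in this issue.

```python
import posixpath

# Cache root as quoted in this issue.
CACHE_ROOT = "/var/cache/pacoloco/pkgs"


def cache_path(repo: str, upstream_path: str) -> str:
    """Hypothetical model: only the file name of the upstream path survives,
    so the architecture component of the path is lost."""
    return posixpath.join(CACHE_ROOT, repo, posixpath.basename(upstream_path))


aarch64_db = cache_path("archlinuxarm", "aarch64/core/core.db")
armv7h_db = cache_path("archlinuxarm", "armv7h/core/core.db")

# Both architectures end up with the same cache file.
print(aarch64_db)
print(armv7h_db)
print("collision:", aarch64_db == armv7h_db)
```

Under this model, whichever architecture refreshed core.db last decides what every other client receives.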

If clients of different architectures use such repos through pacoloco, a client may be served a db (and therefore packages) that was previously requested by another architecture and cached locally. This happened to me a couple of times and broke some of my ALARM boxes, so I had to resort to a config like the following:

repos:
  # All repos are configured with an arch suffix, as pacoloco only caches one [repo].db per repo
  # e.g. if users fetch both archlinuxcn x86_64 and archlinuxcn aarch64, both would be
  #    stored under /var/cache/pacoloco/pkgs/archlinuxcn, and their db files would replace
  #    each other, as both are /var/cache/pacoloco/pkgs/archlinuxcn/archlinuxcn.db
  # Arch Linux: x86_64
  archlinux:x86_64: # In case Arch Linux decides to support more arches some day
    urls: &urls_archlinux
      - http://mirrors.wsyu.edu.cn/archlinux
      - http://mirrors.ustc.edu.cn/archlinux
      - http://mirrors.tuna.tsinghua.edu.cn/archlinux

  # Arch Linux ARM: aarch64, armv7h
  archlinuxarm:aarch64:
    urls: &urls_archlinuxarm
      - http://mirror.archlinuxarm.org/archlinuxarm
      - http://mirrors.ustc.edu.cn/archlinuxarm
      - http://mirrors.tuna.tsinghua.edu.cn/archlinuxarm
  archlinuxarm:armv7h:
    urls: *urls_archlinuxarm

  # Arch Linux CN: aarch64, any, arm, armv6h, armv7h, i686, x86_64
  archlinuxcn:aarch64:
    urls: &urls_archlinuxcn
      - http://mirrors.wsyu.edu.cn/archlinuxcn
      - http://mirrors.ustc.edu.cn/archlinuxcn
      - http://mirrors.tuna.tsinghua.edu.cn/archlinuxcn
  archlinuxcn:any:
    urls: *urls_archlinuxcn
  archlinuxcn:arm:
    urls: *urls_archlinuxcn
  archlinuxcn:armv6h:
    urls: *urls_archlinuxcn
  archlinuxcn:armv7h:
    urls: *urls_archlinuxcn
  archlinuxcn:i686:
    urls: *urls_archlinuxcn
  archlinuxcn:x86_64:
    urls: *urls_archlinuxcn

  # Arch 4 Edu: aarch64, any, x86_64
  arch4edu:aarch64:
    urls: &urls_arch4edu
      - http://mirrors.ustc.edu.cn/arch4edu
      - http://mirrors.tuna.tsinghua.edu.cn/arch4edu
  arch4edu:any:
    urls: *urls_arch4edu
  arch4edu:x86_64:
    urls: *urls_arch4edu

This is combined with the following nginx config, which rewrites the request URLs:

server {
    listen 80;
    charset UTF-8;
    server_name repo.rz5.lan;

    # for ALARM and third-party repos
    rewrite ^/(archlinuxarm|archlinuxcn|arch4edu)/([^/]+)/(.+)$ /.repo/$1:$2/$2/$3 last;
    # for arch official
    rewrite ^/(archlinux)/([^/]+)/os/([^/]+)/(.+)$ /.repo/$1:$3/$2/os/$3/$4 last;

    # locally available repos
    location / {
        autoindex on;
        autoindex_exact_size off;
        autoindex_localtime on;

        root /srv/http/repo;
        add_after_body /.res/footer.html;
    }

    # pacoloco
    location /.repo {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass http://127.0.0.1:9129/repo;
    }
}
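For readers who want to check what the two rewrite rules do, the following Python sketch mirrors them with the `re` module. The `rewrite` helper and the sample URIs are illustrative assumptions; only the two patterns and their replacements come from the nginx config above.

```python
import re

# The two server-level rewrite rules from the nginx config, in order.
RULES = [
    # ALARM and third-party repos: /<repo>/<arch>/<rest>
    (re.compile(r"^/(archlinuxarm|archlinuxcn|arch4edu)/([^/]+)/(.+)$"),
     r"/.repo/\1:\2/\2/\3"),
    # Arch official: /archlinux/<repo>/os/<arch>/<rest>
    (re.compile(r"^/(archlinux)/([^/]+)/os/([^/]+)/(.+)$"),
     r"/.repo/\1:\3/\2/os/\3/\4"),
]


def rewrite(uri: str) -> str:
    """Apply the first matching rule, like `rewrite ... last;` in nginx."""
    for pattern, replacement in RULES:
        if pattern.match(uri):
            return pattern.sub(replacement, uri)
    return uri  # no rule matched: served from the local repo directory


print(rewrite("/archlinuxarm/aarch64/core/core.db"))
print(rewrite("/archlinuxarm/armv7h/core/core.db"))
print(rewrite("/archlinux/core/os/x86_64/core.db"))
```

The architecture ends up inside the repo name (for example `archlinuxarm:aarch64`), which matches the keys in the pacoloco config above, so each architecture gets its own cache directory.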

This works, but it's very hacky. Maybe we could name the cached files under /var/cache/pacoloco/pkgs/[top repo] by hash instead of by raw file name to avoid such issues?
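A rough Python sketch of that idea, under stated assumptions: the cached file name is the SHA-256 of the full upstream path relative to the repo, and `hashed_cache_path` is a hypothetical helper, not something pacoloco provides.

```python
import hashlib
import posixpath

# Cache root as quoted in this issue.
CACHE_ROOT = "/var/cache/pacoloco/pkgs"


def hashed_cache_path(repo: str, upstream_path: str) -> str:
    """Hypothetical naming scheme: hash the whole upstream path so that
    files with the same name under different arches never collide."""
    digest = hashlib.sha256(upstream_path.encode("utf-8")).hexdigest()
    return posixpath.join(CACHE_ROOT, repo, digest)


aarch64_db = hashed_cache_path("archlinuxarm", "aarch64/core/core.db")
armv7h_db = hashed_cache_path("archlinuxarm", "armv7h/core/core.db")

# Different upstream paths now map to different cache files.
print(aarch64_db != armv7h_db)
```

One trade-off of this approach is that identical files reachable under several paths (such as `any` packages) would be cached once per path, and the cache directory would no longer be human-readable.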