aptly-dev / aptly

aptly - Debian repository management tool
https://www.aptly.info/
MIT License

Possibility to omit some package checksum in the Packages file #1297

Closed RadxaYuntian closed 2 months ago

RadxaYuntian commented 3 months ago

apt combines all supported checksums for a given version of a package across all sources, which can cause issues when the sources do not all provide an identical package.

Detailed Description

Debian 12's apt supports 3 checksums: MD5Sum, SHA256, and SHA512. However, not every package repository provides all 3 checksums. For example, the Debian 12 ARM64 repo only has MD5sum and SHA256:

$ curl "https://ftp.debian.org/debian/dists/bookworm/main/binary-arm64/Packages.xz" 2>/dev/null | xzcat | grep -A 17 "Package: gir1.2-gstreamer-1.0"
Package: gir1.2-gstreamer-1.0
Source: gstreamer1.0
Version: 1.22.0-2
Installed-Size: 380
Maintainer: Maintainers of GStreamer packages <gstreamer1.0@packages.debian.org>
Architecture: arm64
Depends: gir1.2-glib-2.0 (>= 0.9.12-4~), libgstreamer1.0-0 (>= 1.22.0)
Description: GObject introspection data for the GStreamer library
Multi-Arch: same
Homepage: https://gstreamer.freedesktop.org
Description-md5: 690d41f7ae6f89096e0ae65e4d4ffe68
Section: introspection
Priority: optional
Filename: pool/main/g/gstreamer1.0/gir1.2-gstreamer-1.0_1.22.0-2_arm64.deb
Size: 105260
MD5sum: 647cbd10708f8b7f8e6f6eb919ca992f
SHA256: 27fdf38a261cdedc1c1c1acb6482ccc9619d92812234356e7be9e7b399334b6b

However, another package repository could provide a package with the same name and version number but different content and checksums, and apt treats the two as the same package. For example, these are the checksums in our own apt repository, generated by aptly:

$ curl "https://radxa-repo.github.io/rk3588-bookworm-test/dists/rk3588-bookworm-test/main/binary-arm64/Packages" 2>/dev/null | grep -A 15 "Package: gir1.2-gstreamer-1.0"
Package: gir1.2-gstreamer-1.0
Priority: optional
Section: introspection
Installed-Size: 380
Maintainer: Maintainers of GStreamer packages <gstreamer1.0@packages.debian.org>
Architecture: arm64
Source: gstreamer1.0
Version: 1.22.0-2
Depends: gir1.2-glib-2.0 (>= 0.9.12-4~), libgstreamer1.0-0 (>= 1.22.0)
Filename: pool/main/g/gstreamer1.0/gir1.2-gstreamer-1.0_1.22.0-2_arm64.deb
Size: 105260
MD5sum: e90a96151f083a5848f69c280edcb333
SHA1: faf16796fec42aa59a1b18012e13ae6a5447fb7c
SHA256: a1c30d1828d33edce098625a79ce951968a728a0b7db554bdecf3a9b004d313f
SHA512: e5b869633ed3a5e53f794ed348ce452b6395fd7450903e01cec7e7b660847632d1a377dce7c1cefb646c92b00d20be22f683098136e4c280c5c4a4d736bb00b3
Description: GObject introspection data for the GStreamer library
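
The two stanzas above differ in which checksum fields they carry. A minimal sketch for listing those fields, assuming a POSIX shell with awk; `checksum_fields` is a hypothetical helper name, and the sample stanza is abbreviated from the Debian output above:

```shell
#!/bin/sh
# Print the checksum field names found in a Packages stanza read from stdin,
# in order of appearance. Only the four field names seen above are matched.
checksum_fields() {
    awk -F': ' '
        /^(MD5sum|SHA1|SHA256|SHA512): / { printf "%s%s", sep, $1; sep = " " }
        END { print "" }'
}

# Demo with an abbreviated copy of the Debian stanza.
checksum_fields <<'EOF'
Package: gir1.2-gstreamer-1.0
Version: 1.22.0-2
Size: 105260
MD5sum: 647cbd10708f8b7f8e6f6eb919ca992f
SHA256: 27fdf38a261cdedc1c1c1acb6482ccc9619d92812234356e7be9e7b399334b6b
EOF
```

Piping the output of either `curl ... | grep -A ...` command above into `checksum_fields` should show which fields that repository publishes for the package.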

If the package is not pinned and Debian's source (/etc/apt/sources.list.d/50-bookworm.list) is listed before our own source (/etc/apt/sources.list.d/80-radxa-rk3588.list), apt downloads the package from Debian but combines our SHA512 with Debian's MD5Sum and SHA256, which causes a hash mismatch failure:

1306 upgraded, 367 newly installed, 27 to remove and 0 not upgraded. 
Need to get 105 kB/1161 MB of archives. 
After this operation, 1336 MB of additional disk space will be used. 
W: Sources disagree on hashes for supposely identical version '1.22.0-2' of 'gir1.2-gstreamer-1.0:arm64'. 
Do you want to continue? [Y/n]  
Get:1 https://deb.debian.org/debian bookworm/main arm64 gir1.2-gstreamer-1.0 arm64 1.22.0-2 [105 kB] 
Err:1 https://deb.debian.org/debian bookworm/main arm64 gir1.2-gstreamer-1.0 arm64 1.22.0-2 
  Hash Sum mismatch 
  Hashes of expected file: 
   - SHA256:27fdf38a261cdedc1c1c1acb6482ccc9619d92812234356e7be9e7b399334b6b 
   - MD5Sum:647cbd10708f8b7f8e6f6eb919ca992f [weak] 
   - Filesize:105260 [weak] 
   - SHA512:e5b869633ed3a5e53f794ed348ce452b6395fd7450903e01cec7e7b660847632d1a377dce7c1cefb646c92b00d20be22f683098136e4c280c5c4a4d736bb00b3 
  Hashes of received file: 
   - SHA512:67def12c3b6c060dfb50db8c9d9c5d52a6c6a35fad7734b0c9da58d8921bc6893ac121aa3a93287f0823829c91734a42f2bb56055df11a70f16f7286ccbb8e46 
   - SHA256:27fdf38a261cdedc1c1c1acb6482ccc9619d92812234356e7be9e7b399334b6b 
   - MD5Sum:647cbd10708f8b7f8e6f6eb919ca992f [weak] 
   - Filesize:105260 [weak] 
  Last modification reported: Sat, 28 Jan 2023 18:30:01 +0000 
Fetched 105 kB in 8s (12.4 kB/s)                                                
E: Failed to fetch https://deb.debian.org/debian/pool/main/g/gstreamer1.0/gir1.2-gstreamer-1.0_1.22.0-2_arm64.deb  Hash Sum mismatch 
   Hashes of expected file: 
    - SHA256:27fdf38a261cdedc1c1c1acb6482ccc9619d92812234356e7be9e7b399334b6b 
    - MD5Sum:647cbd10708f8b7f8e6f6eb919ca992f [weak] 
    - Filesize:105260 [weak] 
    - SHA512:e5b869633ed3a5e53f794ed348ce452b6395fd7450903e01cec7e7b660847632d1a377dce7c1cefb646c92b00d20be22f683098136e4c280c5c4a4d736bb00b3 
   Hashes of received file: 
    - SHA512:67def12c3b6c060dfb50db8c9d9c5d52a6c6a35fad7734b0c9da58d8921bc6893ac121aa3a93287f0823829c91734a42f2bb56055df11a70f16f7286ccbb8e46 
    - SHA256:27fdf38a261cdedc1c1c1acb6482ccc9619d92812234356e7be9e7b399334b6b 
    - MD5Sum:647cbd10708f8b7f8e6f6eb919ca992f [weak] 
    - Filesize:105260 [weak] 
   Last modification reported: Sat, 28 Jan 2023 18:30:01 +0000 
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

If we reverse the source order, apt takes all checksums from our source and also downloads the package from our source, so everything works fine.
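
The order dependence is consistent with a merge in which each hash type is kept from the first source that supplies it and later sources only fill in missing types. A toy model of that assumption follows; it is not apt's code, `merge_hashes` is a hypothetical name, the values `debian` and `radxa` stand in for the real digests, and the list of supported types is taken from the description above:

```shell
#!/bin/sh
# Toy model only; this is NOT apt's implementation.
# Input (stdin): one "TYPE:VALUE" line per hash, in sources.list order.
# Output: the first value seen for each supported hash type.
merge_hashes() {
    awk -F: '
        $1 == "MD5Sum" || $1 == "SHA256" || $1 == "SHA512" {
            if (!($1 in seen)) { seen[$1] = 1; print $1 ":" $2 }
        }'
}

# Debian listed first: its two hashes win, and our SHA512 fills the gap.
printf '%s\n' \
    'MD5Sum:debian' 'SHA256:debian' \
    'MD5Sum:radxa' 'SHA1:radxa' 'SHA256:radxa' 'SHA512:radxa' | merge_hashes
```

With Debian first, the model yields Debian's MD5Sum and SHA256 plus our SHA512, which is the mixed set in the "Hashes of expected file" list. With our source first, all three values come from our repository.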

As such, we'd like to see an option to omit some checksums when creating the package repository, so we can have better compatibility with Debian archives.

Context

We are a single board computer manufacturer. Our SoC vendor (Rockchip) provides a software SDK based on Debian, which is what we use for our system. They provide prebuilt packages with their own patches to work with their hardware.

Rockchip is kind enough to keep the package version number and package name exactly the same as the upstream package, and even across their own SDK updates. We have to use various tricks to force apt to pick the right packages. Needless to say we try to get rid of their stuff whenever possible.

One such thing we got rid of recently is the GPU driver, so we can now use the upstream graphics stack and have removed some hacks. This is when we discovered this issue.

We provide basic instructions for adding our package repository on a Debian system for anyone to test. You can see they still use the old 20 numbering, which sorts before the 50 we assigned to the Debian archives. We now use 70 and 80 so that more packages are installed from the Debian archive, which triggered the issue.

Possible Implementation

Your Environment

The issue is purely on the client side, so it shouldn't matter. But for what it's worth, the client is running Debian 12.

aptly comes directly from repo.aptly.info and is running on ubuntu-latest GitHub runner.

RadxaYuntian commented 3 months ago

Currently this hack fixes the issue for us: https://github.com/RadxaOS-SDK/aptly/commit/64381b48c0b00f2793dcfe6caa1043c9109a2189

neolynx commented 3 months ago

Hi !

Thanks for reporting, this is indeed an interesting issue.

But I wonder if this isn't more of an apt problem than an aptly problem ? Of course we can make the selection of hashes configurable somehow, as your change proves, but this also kinda degrades the security.

A few questions about your setup:

RadxaYuntian commented 3 months ago

is this an aptly mirror of the official debian/bookworm: https://radxa-repo.github.io/rk3588-bookworm-test ?

No it is not. It is mostly used to host our suppliers' patched packages.

the apt sources of the system contain the official debian repo and the radxa-repo.github.io/rk3588-bookworm-test ?

Correct, since some hardware features are not supported in the upstream packages, our suppliers patched them and provided the build output in the form of deb packages to be installed on the system instead.

why do you think the checksum fails ? it should be the same file and have the same checksums ? (is this an apt bug?)

See below.

But I wonder if this isn't more of an apt problem than an aptly problem ?

I think the cause is that apt tries to record every checksum from every source without checking whether the version already has checksums, while each checksum type can only be added once, so a later source only fills in the types that are still missing. This explains the behavior we observed when changing the source order. However, the documented behavior for multiple instances of the same package version is to treat them as fallbacks:

Several instances of the same version of a package may be available when the sources.list(5) file contains references to more than one source. In this case apt-get downloads the instance listed earliest in the sources.list(5) file. The APT preferences do not affect the choice of instance, only the choice of version.

My understanding is that this implicitly requires the fallback to be an exact copy for this policy (and the checksum handling) to make sense, which is not the case for us.

To rephrase, the issue is caused by our supplier providing us packages that pretend to be the upstream packages (since they have the same version) when they are not. This is obviously not a use case apt itself should consider.

However, we have found that once we remove the extra checksums that are missing from the upstream archive, apt correctly installs upstream packages without the above checksum error. We can still use apt_preferences to pin specific packages to our suppliers' variant, and sudo apt install cheese/rk3588-bookworm-test installs correctly from our vendor archive when the user wants to manually override the package selection. This is good enough for us, which is why we want aptly to support this odd option.

neolynx commented 2 months ago

Thanks for the explanation. This sounds more like a workaround I would prefer to not implement in aptly. Also it should probably be selective per repo/mirror or even package, which sounds a bit complicated.

I would rather suggest fixing the source of the problem. I would assume the proper way of providing such a package would be to choose a different package name (i.e. including the vendor name) and have that package Provide the original package, so it can be replaced. Also the version should include the vendor name:

gir1.2-gstreamer-1.0-vendor 1.22.0-2-vendor

In case the vendor is not able to provide more compatible packages, I wonder if it is possible for you to download all affected files and repackage them before uploading to aptly ?

This could be done with a simple script vendorize-deb:

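#!/bin/sh
# vendorize-deb: rename a .deb so it no longer collides with the upstream
# package of the same name and version. Requires GNU sed, ar, tar and xz.
set -e

usage()
{
    echo "$0: <deb package> [vendor name]" >&2
    exit 1
}

DEB=$1
if [ -z "$DEB" ]; then
    usage
fi
DEB=$(realpath "$DEB")

# Vendor suffix; defaults to "vendor" when not given.
VEN=${2:-vendor}

# Work in a temporary directory that is removed on exit.
t=$(mktemp -d /tmp/vendorize-deb-XXXXX)
finish()
{
    rm -rf "$t"
}
trap finish EXIT

cd "$t"
ar x "$DEB"
# Assumes the control and data archives are xz-compressed.
tar xf control.tar.xz

# Provide the original name, then suffix Package, Source and Version.
PKG=$(sed -n 's/^Package: \(.\+\)/\1/p' control)
sed -i "2 i\\Provides: $PKG" control
sed -i "/^Package: / s/$/-$VEN/" control
sed -i "/^Source: / s/$/-$VEN/" control
sed -i "/^Version: / s/$/-$VEN/" control

# Only control and md5sums are repacked; maintainer scripts, if any, are dropped.
tar cJf control.tar.xz control md5sums
rm -f control md5sums
dname=$(basename "$DEB" | cut -d_ -f1)
dver=$(basename "$DEB" | cut -d_ -f2)
drest=$(basename "$DEB" | cut -d_ -f3-)
ddir=$(dirname "$DEB")
OUT=$ddir/$dname-${VEN}_$dver-${VEN}-$drest
ar rcs "$OUT" debian-binary control.tar.xz data.tar.xz
echo "created $OUT"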

Example:

$ ./vendorize-deb ../gir1.2-gstreamer-1.0_1.22.0-2_arm64.deb test
created /home/test/Downloads/gir1.2-gstreamer-1.0-test_1.22.0-2-test-arm64.deb

$ dpkg -I /home/test/Downloads/gir1.2-gstreamer-1.0-test_1.22.0-2-test-arm64.deb
 new Debian package, version 2.0.
 size 105260 bytes: control archive=1072 bytes.
    1039 bytes,    25 lines      control              
     737 bytes,     8 lines      md5sums              
 Package: gir1.2-gstreamer-1.0-test
 Provides: gir1.2-gstreamer-1.0
 Source: gstreamer1.0-test
 Version: 1.22.0-2-test
[...]

Would this be a feasible way for you ?

RadxaYuntian commented 2 months ago

Thanks for the suggestion. We do patch selected packages directly when absolutely necessary, but broadly changing every package's name would need more planning to ensure it won't break existing users and that the inter-package dependencies are correct (so the vendor package won't pull in upstream packages as its dependencies).

We have wanted to patch the packages' control files for a while for a different reason: the vendor doesn't update the package version when a new SDK is released, so patching Version is the only proper way for our users to get SDK package updates. By doing this, at least apt won't treat both packages as the same and give us a checksum error.
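
A minimal sketch of what such Version patching could look like, assuming GNU or POSIX sed. The `patch_version` helper and the `radxa1` suffix are hypothetical and not the scheme we have settled on; a `+suffix` sorts above the unsuffixed version under Debian version ordering:

```shell
#!/bin/sh
# Append "+<suffix>" to the Version field of a control file read from stdin.
# The suffix must not contain "/" or "&", which are special to sed here.
patch_version() {
    suffix=$1
    sed "/^Version: / s/\$/+${suffix}/"
}

# Demo with a two-line control excerpt.
printf 'Package: gir1.2-gstreamer-1.0\nVersion: 1.22.0-2\n' | patch_version radxa1
```

The patched control file would then be repacked into the .deb in the same way as in the vendorize-deb script above.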

For now we will use the patched aptly until we implement the version patching scheme.