rpm-software-management / rpm

The RPM package manager
http://rpm.org

Checksum test failure on Ubuntu #2874

Open nwalfield opened 8 months ago

nwalfield commented 8 months ago

I'm noticing a test suite failure when building 4.18.2. See this CI output:

  273. rpmsigdig.at:157: testing rpmkeys -Kv <unsigned> 2 ...
  ../../rpm/tests/rpmsigdig.at:159:

  if ! [ -d testing/ ]; then
      cp -aP "${RPMTEST}" .
      chmod -R u+w testing/
      mkdir -p testing/build
      ln -s ../data/SOURCES testing/build/
  fi
  export RPMTEST="${PWD}/testing"
  export TOPDIR="${RPMTEST}/build"
  export HOME="${RPMTEST}"

  rm -rf "${RPMTEST}"`rpm --eval '%_dbpath'`/*
  runroot rpm --initdb

  runroot rpmbuild -bb --quiet \
    --define "optflags -O2 -g" \
    --define "_target_platform noarch-linux" \
    --define "_binary_payload w.ufdio" \
    --define "_buildhost localhost" \
    --define "use_source_date_epoch_as_buildtime 1" \
    --define "source_date_epoch_from_changelog 1" \
    --define "clamp_mtime_to_source_date_epoch 1" \
    /data/SPECS/attrtest.spec
  for v in SHA256HEADER SHA1HEADER SIGMD5 PAYLOADDIGEST PAYLOADDIGESTALT; do
      runroot rpm -q --qf "${v}: %{${v}}\n" /build/RPMS/noarch/attrtest-1.0-1.noarch.rpm
  done
  runroot rpmkeys -Kv /build/RPMS/noarch/attrtest-1.0-1.noarch.rpm

  --- - 2024-01-22 14:50:58.866214647 +0000
  +++ /home/runner/work/rpm-sequoia/rpm-sequoia/rpm-build/tests/rpmtests.dir/at-groups/273/stdout   2024-01-22 14:50:58.859174628 +0000
  @@ -1,8 +1,8 @@
  -SHA256HEADER: eb512d3d8c282d0249701032591c53ffb5904c54c95de04783028387b224d8fe
  -SHA1HEADER: a42c611d67870c1937623f0da2631eabdf33e948
  -SIGMD5: 88d1037686ed3f5f6b67618b02cc47ef
  -PAYLOADDIGEST: 116ce41ebb72f1877cda3d7dedaf5b78770e202d6389ade4e415d78548d703a8
  -PAYLOADDIGESTALT: 116ce41ebb72f1877cda3d7dedaf5b78770e202d6389ade4e415d78548d703a8
  +SHA256HEADER: 94d13620f7058c14f24605c1461a9ef89b5b50b80c421a0a0eb7f0c62fe0f638
  +SHA1HEADER: 8036a9b66aa7781e4000a441e695bb076acfc450
  +SIGMD5: 98d3343d19052974392ed389e121f4f8
  +PAYLOADDIGEST: 91438332ac8fe92e4d4fcd45edb64b659323b893d9496a339f8587d19d00531a
  +PAYLOADDIGESTALT: 91438332ac8fe92e4d4fcd45edb64b659323b893d9496a339f8587d19d00531a
   /build/RPMS/noarch/attrtest-1.0-1.noarch.rpm:
       Header SHA256 digest: OK
       Header SHA1 digest: OK
  273. rpmsigdig.at:157: 273. rpmkeys -Kv <unsigned> 2 (rpmsigdig.at:157): FAILED (rpmsigdig.at:159)

I'm seeing the same failure when building locally. (I'm following the instructions here).

I'd appreciate any tips on how to debug this.

dmnks commented 8 months ago

When you build locally, do you see the same failure also without the patch associated with the PR in that CI job?

This test has hardcoded checksums to test build reproducibility (with SOURCE_DATE_EPOCH clamping) so whenever the RPM version changes in configure.ac (or CMakeLists.txt in 4.19 and later), the checksums change as well since the RPM version is baked into the header when building packages. We typically adjust these checksums manually as part of a release bumping commit.

There probably are other factors that also cause these checksums to change but I can't think of anything right now.

dmnks commented 8 months ago

One thing I noticed is that, in this case, even the payload checksums have changed. An RPM version bump would only affect the header checksums... So there must be something else at play here.

nwalfield commented 8 months ago

> When you build locally, do you see the same failure also without the patch associated with the PR in that CI job?
>
> This test has hardcoded checksums to test build reproducibility (with SOURCE_DATE_EPOCH clamping) so whenever the RPM version changes in configure.ac (or CMakeLists.txt in 4.19 and later), the checksums change as well since the RPM version is baked into the header when building packages. We typically adjust these checksums manually as part of a release bumping commit.
>
> There probably are other factors that also cause these checksums to change but I can't think of anything right now.

I reset the rpm-sequoia branch to the last release (v1.5.0), and I see the same error.

us@alice:/tmp/rpm/rpm-sequoia$ git reset --hard v1.5.0
HEAD is now at f2e5429 Release 1.5.0.
us@alice:/tmp/rpm/rpm-sequoia$ PREFIX=/usr LIBDIR="\${prefix}/lib64"   cargo build --release && cargo test --release
...
us@alice:/tmp/rpm/rpm/b/tests$ export PKG_CONFIG_PATH=/tmp/rpm/rpm-sequoia/target/release
us@alice:/tmp/rpm/rpm/b/tests$ export LD_LIBRARY_PATH=/tmp/rpm/rpm-sequoia/target/release
us@alice:/tmp/rpm/rpm/b/tests$ ../../tests/rpmtests 273
## ---------------------- ##
## rpm 4.18.2 test suite. ##
## ---------------------- ##
273: rpmkeys -Kv <unsigned> 2                        FAILED (rpmsigdig.at:159)

## ------------- ##
## Test results. ##
## ------------- ##

ERROR: 1 test was run,
1 failed unexpectedly.
## ------------------------- ##
## rpmtests.log was created. ##
## ------------------------- ##

Please send `tests/rpmtests.log' and all information you think might help:

   To: <rpm-maint@lists.rpm.org>
   Subject: [rpm 4.18.2] rpmtests: 273 failed

You may investigate any problem if you feel able to do so, in which
case the test suite provides a good starting point.  Its output may
be found below `tests/rpmtests.dir'.

us@alice:/tmp/rpm/rpm/b/tests$ cat rpmtests.dir/273/rpmtests.log
#                             -*- compilation -*-
273. rpmsigdig.at:157: testing rpmkeys -Kv <unsigned> 2 ...
../../tests/rpmsigdig.at:159:

if ! [ -d testing/ ]; then
    cp -aP "${RPMTEST}" .
    chmod -R u+w testing/
    mkdir -p testing/build
    ln -s ../data/SOURCES testing/build/
fi
export RPMTEST="${PWD}/testing"
export TOPDIR="${RPMTEST}/build"
export HOME="${RPMTEST}"

rm -rf "${RPMTEST}"`rpm --eval '%_dbpath'`/*
runroot rpm --initdb

runroot rpmbuild -bb --quiet \
        --define "optflags -O2 -g" \
        --define "_target_platform noarch-linux" \
        --define "_binary_payload w.ufdio" \
        --define "_buildhost localhost" \
        --define "use_source_date_epoch_as_buildtime 1" \
        --define "source_date_epoch_from_changelog 1" \
        --define "clamp_mtime_to_source_date_epoch 1" \
        /data/SPECS/attrtest.spec
for v in SHA256HEADER SHA1HEADER SIGMD5 PAYLOADDIGEST PAYLOADDIGESTALT; do
    runroot rpm -q --qf "${v}: %{${v}}\n" /build/RPMS/noarch/attrtest-1.0-1.noarch.rpm
done
runroot rpmkeys -Kv /build/RPMS/noarch/attrtest-1.0-1.noarch.rpm

--- -   2024-01-25 09:03:34.282017503 +0000
+++ /tmp/rpm/rpm/b/tests/rpmtests.dir/at-groups/273/stdout      2024-01-25 09:03:34.277556707 +0000
@@ -1,8 +1,8 @@
-SHA256HEADER: eb512d3d8c282d0249701032591c53ffb5904c54c95de04783028387b224d8fe
-SHA1HEADER: a42c611d67870c1937623f0da2631eabdf33e948
-SIGMD5: 88d1037686ed3f5f6b67618b02cc47ef
-PAYLOADDIGEST: 116ce41ebb72f1877cda3d7dedaf5b78770e202d6389ade4e415d78548d703a8
-PAYLOADDIGESTALT: 116ce41ebb72f1877cda3d7dedaf5b78770e202d6389ade4e415d78548d703a8
+SHA256HEADER: 94d13620f7058c14f24605c1461a9ef89b5b50b80c421a0a0eb7f0c62fe0f638
+SHA1HEADER: 8036a9b66aa7781e4000a441e695bb076acfc450
+SIGMD5: 98d3343d19052974392ed389e121f4f8
+PAYLOADDIGEST: 91438332ac8fe92e4d4fcd45edb64b659323b893d9496a339f8587d19d00531a
+PAYLOADDIGESTALT: 91438332ac8fe92e4d4fcd45edb64b659323b893d9496a339f8587d19d00531a
 /build/RPMS/noarch/attrtest-1.0-1.noarch.rpm:
     Header SHA256 digest: OK
     Header SHA1 digest: OK
273. rpmsigdig.at:157: 273. rpmkeys -Kv <unsigned> 2 (rpmsigdig.at:157): FAILED (rpmsigdig.at:159)

dmnks commented 8 months ago

Hmm, that's indeed strange. I tried to reproduce the same myself (using the steps you provided, thanks!) but could not - the test passed for me.

Could you try doing this against a new, fresh build of RPM in another directory?

pmatilai commented 8 months ago

If guessing and trying fails, 'runroot rpm -qp --xml /build/RPMS/noarch/attrtest-1.0-1.noarch.rpm' is your friend, one can then diff the output of that between working and non-working files. We should probably dump that output anyway to make it more debuggable.
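
A minimal sketch of that comparison, assuming two copies of the package are at hand (one from a passing build, one from a failing one). The function name `diff_rpm_headers` and the example paths are hypothetical; only `rpm -qp --xml` comes from the comment above, and inside the test suite it would be run through `runroot`.

```shell
# Dump the headers of two packages as XML and print a unified diff.
# Hypothetical helper; assumes the `rpm` on PATH understands --xml.
diff_rpm_headers() {
    good_pkg=$1
    bad_pkg=$2
    tmpdir=$(mktemp -d) || return 2
    rpm -qp --xml "$good_pkg" > "$tmpdir/good.xml"
    rpm -qp --xml "$bad_pkg" > "$tmpdir/bad.xml"
    # diff exits 0 when the dumps match and 1 when they differ
    diff -u "$tmpdir/good.xml" "$tmpdir/bad.xml"
    status=$?
    rm -rf "$tmpdir"
    return "$status"
}

# Example call (paths are placeholders):
# diff_rpm_headers fedora/attrtest-1.0-1.noarch.rpm ubuntu/attrtest-1.0-1.noarch.rpm
```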

nwalfield commented 8 months ago

> Hmm, that's indeed strange. I tried to reproduce the same myself (using the steps you provided, thanks!) but could not - the test passed for me.
>
> Could you try doing this against a new, fresh build of RPM in another directory?

I did one better: I tried it in a fresh Fedora 39 VM :D.

There it works for me.

Perhaps the issue is that rpm-sequoia's CI is running the test suite from Ubuntu. Do you think that could be a cause of the problem?

pmatilai commented 8 months ago

I remember a case or two where the checksums mismatch due to different libmagic versions producing different strings. I also remember tweaking the test to avoid relying on libmagic stuff there, but don't remember when exactly. But yes, it's fragile. Very.

pmatilai commented 8 months ago

4.18.2 would be missing at least this: 7ec148c1d61e0b526ae5c917f0ddc2b4a3222146 which could affect it.

dmnks commented 8 months ago

Yup, and the CI running on Ubuntu could indeed have an effect on the checksums. I can't see any obvious candidates in the tag list but it would be useful if you could get us the output of --xml as @pmatilai suggested above. That way we could compare it to what we get on a Fedora-built version of that package.

nwalfield commented 8 months ago

My built rpm doesn't understand --xml:

us@alice:/tmp/rpm/rpm/b$ ./rpm -qp --xml ./tests/rpmtests.dir/273/testing/build/RPMS/noarch/attrtest-1.0-1.noarch.rpm
rpm: --xml: unknown option

Instead I've attached the generated rpm file (found in tests/rpmtests.dir/273/testing/build/RPMS/noarch). If that is not enough, let me know.

attrtest-1.0-1.noarch.rpm.zip

dmnks commented 8 months ago

Thanks! Meanwhile, I did a build of RPM in an Ubuntu 22.04 LTS container and got the exact same package as you, the diff follows:

diff --git a/fedora.xml b/ubuntu.xml
index c4f4df7..6546d2e 100644
--- a/fedora.xml
+++ b/ubuntu.xml
@@ -6,14 +6,14 @@
        <integer>5925</integer>
   </rpmTag>
   <rpmTag name="Sigmd5">
-       <base64>iNEDdobtP19rZ2GLAsxH7w==
+       <base64>mNM0PRkFKXQ5LtOJ4SH0+A==
 </base64>
   </rpmTag>
   <rpmTag name="Sha1header">
-       <string>a42c611d67870c1937623f0da2631eabdf33e948</string>
+       <string>8036a9b66aa7781e4000a441e695bb076acfc450</string>
   </rpmTag>
   <rpmTag name="Sha256header">
-       <string>eb512d3d8c282d0249701032591c53ffb5904c54c95de04783028387b224d8fe</string>
+       <string>94d13620f7058c14f24605c1461a9ef89b5b50b80c421a0a0eb7f0c62fe0f638</string>
   </rpmTag>
   <rpmTag name="Name">
        <string>attrtest</string>
@@ -502,12 +502,12 @@
        <string>utf-8</string>
   </rpmTag>
   <rpmTag name="Payloaddigest">
-       <string>116ce41ebb72f1877cda3d7dedaf5b78770e202d6389ade4e415d78548d703a8</string>
+       <string>91438332ac8fe92e4d4fcd45edb64b659323b893d9496a339f8587d19d00531a</string>
   </rpmTag>
   <rpmTag name="Payloaddigestalgo">
        <integer>8</integer>
   </rpmTag>
   <rpmTag name="Payloaddigestalt">
-       <string>116ce41ebb72f1877cda3d7dedaf5b78770e202d6389ade4e415d78548d703a8</string>
+       <string>91438332ac8fe92e4d4fcd45edb64b659323b893d9496a339f8587d19d00531a</string>
   </rpmTag>
 </rpmHeader>

So apparently the checksums are affected by the running OS, although all the other tags are equivalent.

dmnks commented 8 months ago

> rpm: --xml: unknown option

This is because --xml is a popt alias and you'd have to set RPM_POPTEXEC_PATH to where the rpmpopt* file is installed, similarly to how atlocal.in does it :smile:

pmatilai commented 8 months ago

Okay, the headers are otherwise identical so it's the payload that has to differ then, and that payloaddigest changing throws everything else off too. And indeed, extracting the rpm2cpio output shows some differences (from diff -u -a output), e.g.:

-07070100000011000041e80000000300000000000000014e09198000000000000000000000000000000000000000000000000800000000./i/dir07070100000012000089ed0000000000000004000000014e0919800000000f000000000000000000000000000000000000000900000000./i/fileThis is file i
+07070100000011000041e80000000000000000000000014e09198000000000000000000000000000000000000000000000000800000000./i/dir07070100000012000089ed0000000000000004000000014e0919800000000f000000000000000000000000000000000000000900000000./i/fileThis is file i

What that difference actually is and why it happens, no clue. Rpm writes its own cpio so this isn't due to some cpio utility version differences.
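
For reference, the `070701` prefix marks cpio's "newc" format, whose header is the 6-character magic followed by thirteen 8-digit hex fields (inode, mode, uid, gid, nlink, mtime, and so on). A rough sketch that decodes the leading fields of the `./i/dir` header from each side of the diff above; the helper name `newc_field` is made up for illustration.

```shell
# Print field N (1-based) of a newc cpio header as a decimal number.
# Fields after the 6-char magic: 1=ino 2=mode 3=uid 4=gid 5=nlink 6=mtime
newc_field() {
    hdr=$1
    n=$2
    start=$(( 6 + (n - 1) * 8 + 1 ))   # cut -c positions are 1-based
    end=$(( start + 7 ))
    hex=$(printf '%s' "$hdr" | cut -c "${start}-${end}")
    printf '%d\n' "0x$hex"
}

# Leading fields of the ./i/dir header, copied from the "-" and "+" lines:
#        magic   ino       mode      uid       gid       nlink
old_hdr="070701""00000011""000041e8""00000003""00000000""00000001"
new_hdr="070701""00000011""000041e8""00000000""00000000""00000001"

for side in "$old_hdr" "$new_hdr"; do
    echo "uid=$(newc_field "$side" 3) gid=$(newc_field "$side" 4)"
done
```

Read this way, the two lines differ only in the uid field of `./i/dir` (3 on the `-` side, 0 on the `+` side).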

pmatilai commented 8 months ago

Looking at cpio -t output is proving more helpful :monocle_face:

This from the Ubuntu-created file:

[pmatilai🎩︎localhost tmp]$ cpio -tv < out |head -4
6 blocks
drwx------ 1 root root 0 Jun 28 2011 ./a/dir
-r-------- 1 root root 15 Jun 28 2011 ./a/file
drwx------ 1 bin adm 0 Jun 28 2011 ./b/dir
-r-------- 1 bin adm 15 Jun 28 2011 ./b/file

And this on Fedora:

[pmatilai🎩︎localhost noarch]$ cpio -tv < out |head -4
6 blocks
drwx------ 1 root root 0 Jun 28 2011 ./a/dir
-r-------- 1 root root 15 Jun 28 2011 ./a/file
drwx------ 1 daemon adm 0 Jun 28 2011 ./b/dir
-r-------- 1 daemon adm 15 Jun 28 2011 ./b/file

I bet uid 2 is "bin" instead on Ubuntu:

[pmatilai🎩︎localhost ~]$ grep ^daemon /etc/passwd
daemon:x:2:2:daemon:/sbin:/sbin/nologin

...yep (containers FTW):

root@bd27a9d083dc:/# grep ^bin /etc/passwd
bin:x:2:2:bin:/bin:/usr/sbin/nologin

In 4.18.x we still do some name->uid lookups in the build even though all that supposedly is unused. I guess not...
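
The mismatch can be illustrated with the two passwd entries quoted above: the same numeric id maps to a different name on each system, so anything that goes through a name/id lookup at build time can give different results per distro. A small sketch; `name_for_uid` is a hypothetical helper that reads passwd-format lines from stdin, not something rpm provides.

```shell
# Print the user name that a passwd-format stream assigns to a numeric uid.
name_for_uid() {
    awk -F: -v uid="$1" '$3 == uid { print $1; exit }'
}

# Entries quoted from the two systems in this thread
fedora_entry='daemon:x:2:2:daemon:/sbin:/sbin/nologin'
ubuntu_entry='bin:x:2:2:bin:/bin:/usr/sbin/nologin'

echo "Fedora uid 2: $(printf '%s\n' "$fedora_entry" | name_for_uid 2)"
echo "Ubuntu uid 2: $(printf '%s\n' "$ubuntu_entry" | name_for_uid 2)"
```

On a live system, `getent passwd 2` performs the same lookup against the real user database.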

pmatilai commented 8 months ago

So this is actually another bug fixed by a0553eb38a01772254cd48fef7ad116294cf801a

dmnks commented 8 months ago

I've just tried this with the latest RPM snapshot on master and this checksum test still fails on Ubuntu, even with commit a0553eb38a01772254cd48fef7ad116294cf801a in place. This time, though, the payload is identical (as confirmed with rpm2cpio and diff). Strange...
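
A sketch of that kind of payload check, assuming both builds of the package are available locally. `same_payload` is a hypothetical helper and the example paths are placeholders; it relies only on `rpm2cpio` and `cmp`.

```shell
# Return 0 if the two packages carry byte-identical cpio payloads.
same_payload() {
    tmpdir=$(mktemp -d) || return 2
    rpm2cpio "$1" > "$tmpdir/a.cpio"
    rpm2cpio "$2" > "$tmpdir/b.cpio"
    # cmp -s is silent and only reports through its exit status
    cmp -s "$tmpdir/a.cpio" "$tmpdir/b.cpio"
    status=$?
    rm -rf "$tmpdir"
    return "$status"
}

# Example call (paths are placeholders):
# same_payload fedora/attrtest-1.0-1.noarch.rpm ubuntu/attrtest-1.0-1.noarch.rpm && echo identical
```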

pmatilai commented 8 months ago

Mmm, but with current master the test would be running on Fedora because there's no native test-suite for Ubuntu? Those matryoshkas are really out to get us now.

dmnks commented 8 months ago

OK, I was a bit vague above, so to clarify:

What I did was:

  1. Ran an Ubuntu-based container (with toolbox)
  2. Installed all the RPM deps in it
  3. Built the latest RPM checkout in it
  4. Created an image from it (with podman commit)
  5. Ran the test-suite against that image (instead of the default Fedora one)

Note that this is currently not supported in our test-suite on master so I used my draft #2830 patch here.

TL;DR: I did run the test-suite on Ubuntu, it's just not supported out-of-the-box in our test-suite right now (pending in #2830) :smile: