teemtee / tmt

Test Management Tool
MIT License

Inconsistency in version 1.32.1 with ostree installations #2858

Closed sbertramrh closed 3 weeks ago

sbertramrh commented 1 month ago

When trying to execute ltp-lite tests on an ostree system, tmt consistently uses dnf to install the required packages in the prepare stage, and so it fails.

11:18:18 tmt version: 1.32.1
...
11:19:10             cmd: rpm -q --whatprovides rpm-build rng-tools rsyslog procmail bzip2 automake util-linux /usr/bin/flock libcap-devel ntpsec strace wget python3 bc autoconf gcc numactl libaio-devel bison numactl-devel flex || dnf install -y  rpm-build rng-tools rsyslog procmail bzip2 automake util-linux /usr/bin/flock libcap-devel ntpsec strace wget python3 bc autoconf gcc numactl libaio-devel bison numactl-devel flex
...
11:19:12             out: Extra Packages for Enterprise Linux 9 - aarch64 1.6 MB/s | 1.6 kB     00:00
11:19:12             err: Importing GPG key 0x3228467C:
11:19:12             err:  Userid     : "Fedora (epel9) <epel@fedoraproject.org>"
11:19:12             err:  Fingerprint: FF8A D134 4597 106E CE81 3B91 8A38 72BF 3228 467C
11:19:12             err:  From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-9
11:19:12             err: error: can't create transaction lock on /usr/share/rpm/.rpm.lock (Read-only file system)
11:19:12             err: Key import failed (code 2). Failing package is: libbsd-0.12.2-1.el9.aarch64
11:19:12             err:  GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-9
11:19:12             err: Public key for libmd-1.1.0-1.el9.aarch64.rpm is not installed. Failing package is: libmd-1.1.0-1.el9.aarch64
11:19:12             err:  GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-9
11:19:12             out: The downloaded packages were saved in cache until the next successful transaction.
11:19:12             err: Public key for ntpsec-1.2.2a-1.el9.aarch64.rpm is not installed. Failing package is: ntpsec-1.2.2a-1.el9.aarch64
11:19:12             out: You can remove cached packages by executing 'dnf clean packages'.
11:19:12             err:  GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-9
11:19:12             err: Error: GPG check FAILED

Using 1.31.0 with the same test against the same environment works without issue.
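
The failing command in the log has a "check, then install" shape: `rpm -q --whatprovides <packages> || dnf install -y <packages>`. A minimal sketch of how such a command string could be assembled is shown here. `build_install_command` is a hypothetical helper written for illustration; it only mirrors the command visible in the log and is not taken from tmt's source.

```python
from __future__ import annotations

import shlex


def build_install_command(installer: str, packages: list[str]) -> str:
    """Build a 'check presence, else install' shell command.

    Hypothetical helper: it reproduces the shape of the logged command,
    not tmt's actual implementation.
    """
    quoted = " ".join(shlex.quote(package) for package in packages)
    # `rpm -q --whatprovides` exits with 0 only when every name or path is
    # already provided; otherwise the installer after `||` is executed.
    return f"rpm -q --whatprovides {quoted} || {installer} install -y {quoted}"


command = build_install_command("dnf", ["gcc", "/usr/bin/flock"])
print(command)
```

On the reported guest the fallback branch is the part that breaks: the log shows dnf failing with a read-only file system error on `/usr/share/rpm/.rpm.lock`.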

lukaszachy commented 1 month ago

There is a difference in the package manager fact: on 1.28.0 we detect rpm-ostree, but 1.32.1 detects dnf.

Did we change the order in which rpm-ostree vs dnf is evaluated? That image has both installed, but tmt should pick rpm-ostree...

lukaszachy commented 1 month ago

parts of log.txt of 1.28

18:00:59         out: aarch64
18:00:59         Execute command 'export TMT_PLAN_DATA=/var/tmp/tmt/run-321/plans/default/data; export TMT_TREE=/var/tmp/tmt/run-321/plans/default/tree; export TMT_VERSION=1.28.0; uname -r' on guest 'FQDN'.
18:00:59         Run command: sshpass -p XXX ssh -oForwardX11=no -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oServerAliveInterval=60 -oServerAliveCountMax=5 -oIdentitiesOnly=yes -oPasswordAuthentication=yes -S/run/user/17062/tmt/tmp40hhfi4x root@FQDN 'export TMT_PLAN_DATA=/var/tmp/tmt/run-321/plans/default/data; export TMT_TREE=/var/tmp/tmt/run-321/plans/default/tree; export TMT_VERSION=1.28.0; uname -r'
18:00:59         environment: None
18:00:59         out: <uname-r>.aarch64
18:00:59         Execute command 'export TMT_PLAN_DATA=/var/tmp/tmt/run-321/plans/default/data; export TMT_TREE=/var/tmp/tmt/run-321/plans/default/tree; export TMT_VERSION=1.28.0; stat /run/ostree-booted' on guest 'FQDN'.
18:00:59         Run command: sshpass -p XXX ssh -oForwardX11=no -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oServerAliveInterval=60 -oServerAliveCountMax=5 -oIdentitiesOnly=yes -oPasswordAuthentication=yes -S/run/user/17062/tmt/tmp40hhfi4x root@FQDN 'export TMT_PLAN_DATA=/var/tmp/tmt/run-321/plans/default/data; export TMT_TREE=/var/tmp/tmt/run-321/plans/default/tree; export TMT_VERSION=1.28.0; stat /run/ostree-booted'
18:00:59         environment: None
18:01:00         out:   File: /run/ostree-booted
18:01:00         out:   Size: 201           Blocks: 8          IO Block: 4096   regular file
18:01:00         out: Device: 17h/23d   Inode: 1975        Links: 1
18:01:00         out: Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
18:01:00         out: Context: system_u:object_r:var_run_t:s0
--SNIP--
18:01:00         Execute command 'export TMT_PLAN_DATA=/var/tmp/tmt/run-321/plans/default/data; export TMT_TREE=/var/tmp/tmt/run-321/plans/default/tree; export TMT_VERSION=1.28.0; cat /proc/filesystems' on guest 'FQDN'.
18:01:00         Run command: sshpass -p XXX ssh -oForwardX11=no -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oServerAliveInterval=60 -oServerAliveCountMax=5 -oIdentitiesOnly=yes -oPasswordAuthentication=yes -S/run/user/17062/tmt/tmp40hhfi4x root@FQDN 'export TMT_PLAN_DATA=/var/tmp/tmt/run-321/plans/default/data; export TMT_TREE=/var/tmp/tmt/run-321/plans/default/tree; export TMT_VERSION=1.28.0; cat /proc/filesystems'
18:01:00         environment: None
18:01:00         out: nodev sysfs
--SNIP--
18:01:00         out: nodev rpc_pipefs
18:01:00         Execute command 'export TMT_PLAN_DATA=/var/tmp/tmt/run-321/plans/default/data; export TMT_TREE=/var/tmp/tmt/run-321/plans/default/tree; export TMT_VERSION=1.28.0; whoami' on guest 'FQDN'.
18:01:00         Run command: sshpass -p XXX ssh -oForwardX11=no -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oServerAliveInterval=60 -oServerAliveCountMax=5 -oIdentitiesOnly=yes -oPasswordAuthentication=yes -S/run/user/17062/tmt/tmp40hhfi4x root@FQDN 'export TMT_PLAN_DATA=/var/tmp/tmt/run-321/plans/default/data; export TMT_TREE=/var/tmp/tmt/run-321/plans/default/tree; export TMT_VERSION=1.28.0; whoami'
18:01:00         environment: None
18:01:01         out: root
18:01:01         arch: aarch64
18:01:01         distro: Red Hat .....
18:01:01         kernel: <uname -r>
18:01:01         package manager: rpm-ostree

lukaszachy commented 1 month ago

part of 1.32.1 log.txt

17:59:58         out: aarch64
17:59:58         Execute command 'export TMT_PLAN_DATA=/var/tmp/tmt/run-320/default/plan/data; export TMT_PLAN_ENVIRONMENT_FILE=/var/tmp/tmt/run-320/default/plan/data/variables.env; export TMT_TREE=/var/tmp/tmt/run-320/default/plan/tree; export TMT_VERSION=1.32.1; uname -r' on guest 'FQDN'.
17:59:58         Run command: sshpass -p XXX ssh -oForwardX11=no -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oServerAliveInterval=60 -oServerAliveCountMax=5 -oIdentitiesOnly=yes -oPasswordAuthentication=yes -S/run/user/17062/tmt/tmp5qagkg9f root@FQDN 'export TMT_PLAN_DATA=/var/tmp/tmt/run-320/default/plan/data; export TMT_PLAN_ENVIRONMENT_FILE=/var/tmp/tmt/run-320/default/plan/data/variables.env; export TMT_TREE=/var/tmp/tmt/run-320/default/plan/tree; export TMT_VERSION=1.32.1; uname -r'
17:59:58         environment
17:59:58         out: <uname-r>
17:59:58         Execute command 'export TMT_PLAN_DATA=/var/tmp/tmt/run-320/default/plan/data; export TMT_PLAN_ENVIRONMENT_FILE=/var/tmp/tmt/run-320/default/plan/data/variables.env; export TMT_TREE=/var/tmp/tmt/run-320/default/plan/tree; export TMT_VERSION=1.32.1; apk --version' on guest 'FQDN'.
17:59:58         Run command: sshpass -p XXX ssh -oForwardX11=no -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oServerAliveInterval=60 -oServerAliveCountMax=5 -oIdentitiesOnly=yes -oPasswordAuthentication=yes -S/run/user/17062/tmt/tmp5qagkg9f root@FQDN 'export TMT_PLAN_DATA=/var/tmp/tmt/run-320/default/plan/data; export TMT_PLAN_ENVIRONMENT_FILE=/var/tmp/tmt/run-320/default/plan/data/variables.env; export TMT_TREE=/var/tmp/tmt/run-320/default/plan/tree; export TMT_VERSION=1.32.1; apk --version'
17:59:58         environment
17:59:58         err: bash: line 1: apk: command not found
17:59:58         Command returned '127'.
17:59:58         Execute command 'export TMT_PLAN_DATA=/var/tmp/tmt/run-320/default/plan/data; export TMT_PLAN_ENVIRONMENT_FILE=/var/tmp/tmt/run-320/default/plan/data/variables.env; export TMT_TREE=/var/tmp/tmt/run-320/default/plan/tree; export TMT_VERSION=1.32.1; apt --version' on guest 'FQDN'.
17:59:58         Run command: sshpass -p XXX ssh -oForwardX11=no -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oServerAliveInterval=60 -oServerAliveCountMax=5 -oIdentitiesOnly=yes -oPasswordAuthentication=yes -S/run/user/17062/tmt/tmp5qagkg9f root@FQDN 'export TMT_PLAN_DATA=/var/tmp/tmt/run-320/default/plan/data; export TMT_PLAN_ENVIRONMENT_FILE=/var/tmp/tmt/run-320/default/plan/data/variables.env; export TMT_TREE=/var/tmp/tmt/run-320/default/plan/tree; export TMT_VERSION=1.32.1; apt --version'
17:59:58         environment
17:59:58         err: bash: line 1: apt: command not found
17:59:58         Command returned '127'.
17:59:58         Execute command 'export TMT_PLAN_DATA=/var/tmp/tmt/run-320/default/plan/data; export TMT_PLAN_ENVIRONMENT_FILE=/var/tmp/tmt/run-320/default/plan/data/variables.env; export TMT_TREE=/var/tmp/tmt/run-320/default/plan/tree; export TMT_VERSION=1.32.1; dnf --version' on guest 'FQDN'.
17:59:58         Run command: sshpass -p XXX ssh -oForwardX11=no -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oServerAliveInterval=60 -oServerAliveCountMax=5 -oIdentitiesOnly=yes -oPasswordAuthentication=yes -S/run/user/17062/tmt/tmp5qagkg9f root@FQDN 'export TMT_PLAN_DATA=/var/tmp/tmt/run-320/default/plan/data; export TMT_PLAN_ENVIRONMENT_FILE=/var/tmp/tmt/run-320/default/plan/data/variables.env; export TMT_TREE=/var/tmp/tmt/run-320/default/plan/tree; export TMT_VERSION=1.32.1; dnf --version'
17:59:58         environment
18:00:00         out: 4.14.0
18:00:00         out:   Installed: dnf-0:4.14.0-9.el9.noarch at Thu Apr 11 22:32:32 2024
18:00:00         out:   Built    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla> at Thu Oct 26 05:20:14 2023
18:00:00         out: 
18:00:00         out:   Installed: rpm-....
18:00:00         out:   Built    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla> at Wed Dec 13 18:40:07 2023
18:00:00         Execute command 'export TMT_PLAN_DATA=/var/tmp/tmt/run-320/default/plan/data; export TMT_PLAN_ENVIRONMENT_FILE=/var/tmp/tmt/run-320/default/plan/data/variables.env; export TMT_TREE=/var/tmp/tmt/run-320/default/plan/tree; export TMT_VERSION=1.32.1; cat /proc/filesystems' on guest 'FQDN'.
18:00:00         Run command: sshpass -p XXX ssh -oForwardX11=no -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oServerAliveInterval=60 -oServerAliveCountMax=5 -oIdentitiesOnly=yes -oPasswordAuthentication=yes -S/run/user/17062/tmt/tmp5qagkg9f root@FQDN 'export TMT_PLAN_DATA=/var/tmp/tmt/run-320/default/plan/data; export TMT_PLAN_ENVIRONMENT_FILE=/var/tmp/tmt/run-320/default/plan/data/variables.env; export TMT_TREE=/var/tmp/tmt/run-320/default/plan/tree; export TMT_VERSION=1.32.1; cat /proc/filesystems'
18:00:00         environment
--SNIP--
18:00:00         package manager: dnf

lukaszachy commented 1 month ago

@happz if you could take a look

happz commented 1 month ago

Hmm, that does indeed look like a problem with the changed order. It shouldn't matter, but it does :(

happz commented 1 month ago

What provision plugin and image did you use? We do have a test that runs prepare/install against a fedora-coreos VM, https://github.com/teemtee/tmt/blob/main/tests/prepare/install/test.sh#L32. I will dig into it, because I would expect it to fail from time to time should the problem be even remotely reproducible. Maybe the image used by @sbertramrh and @lukaszachy is slightly different, more prone to the failure.

sbertramrh commented 1 month ago

This was a rhel/rhivos image on the target board, not VM.

lukaszachy commented 1 month ago

From what I found out, Fedora CoreOS doesn't have dnf installed. That machine does, even though it is an rpm-ostree type system.

happz commented 1 month ago

Yeah, that will be the problem: the images I had for testing did not contain dnf, hence no problem.

I'm going to extend https://github.com/teemtee/tmt/blob/main/tests/prepare/install/test.sh with a few more cases, including this one, if possible. This should be reproducible.
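
To illustrate why the test images hid the bug, here is a small sketch. It assumes detection stops at the first probe that succeeds, which is what the 1.32.1 log suggests (apk, apt, then dnf, after which dnf is reported) but is not confirmed against tmt's source. `detect_first_match` and the guest sets are hypothetical names used only for this illustration.

```python
from __future__ import annotations


def detect_first_match(probe_order: list[str], installed: set[str]) -> str:
    """Return the first package manager from probe_order present on the guest."""
    for manager in probe_order:
        if manager in installed:
            return manager
    raise RuntimeError("no supported package manager found")


# Hypothetical guests: one without dnf (like the Fedora CoreOS test image),
# one with both tools present (like the image from this report).
coreos_like = {"rpm-ostree"}
reported_image = {"dnf", "rpm-ostree"}

ostree_first = ["rpm-ostree", "dnf"]
dnf_first = ["dnf", "rpm-ostree"]

# Without dnf installed, the probe order makes no difference.
assert detect_first_match(ostree_first, coreos_like) == "rpm-ostree"
assert detect_first_match(dnf_first, coreos_like) == "rpm-ostree"

# With both installed, the probe order alone decides the result.
print(detect_first_match(ostree_first, reported_image))  # rpm-ostree
print(detect_first_match(dnf_first, reported_image))     # dnf
```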

thrix commented 1 month ago

@happz @psss @lukaszachy could we consider cherry-picking that to 1.32? This will break RHIVOS AFAIU :( and we cannot break them.

lukaszachy commented 1 month ago

@thrix IMO it should be doable. And a packit release might be easier to do than an rpm-only patch. We'll see once we merge the fix.

thrix commented 1 month ago

@lukaszachy another benefit of the copr :)

lukaszachy commented 1 month ago

Should we allow the user to provide the package manager fact? It would work around similar problems in the future (e.g. if a machine has manager X installed for some reason but Y has to be used).
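
One way such an override could look is sketched here. `GuestFacts`, its `forced` mapping, and the fact name `package_manager` as a dictionary key are hypothetical and chosen for illustration only; nothing in this thread confirms that tmt exposes facts this way.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GuestFacts:
    """Detected facts plus optional user-forced overrides (hypothetical)."""

    detected: dict[str, str]
    forced: dict[str, str] = field(default_factory=dict)

    def get(self, name: str) -> str:
        # A forced value always wins over what detection found.
        return self.forced.get(name, self.detected[name])


facts = GuestFacts(detected={"package_manager": "dnf"})
print(facts.get("package_manager"))  # dnf

# The user knows better than detection and forces the value.
facts.forced["package_manager"] = "rpm-ostree"
print(facts.get("package_manager"))  # rpm-ostree
```

Keeping the forced values in a separate mapping leaves the detected value available for logging, and the same mechanism would work for any other fact.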

happz commented 1 month ago

This is indeed a bug and should be fixed, I should have a patch ready today, /me hopes.

That said, yep, opening (some) facts up and letting users force their values might be useful. Are there already other facts we might open together with the package manager? We should probably do it in a way that would allow us to open up additional facts, like "selinux" or "superuser".

BTW should we have some kind of priorities? If dnf5, dnf and yum are all installed on a guest, what should be the package manager used?
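
A minimal sketch of priority-based selection follows. Only "rpm-ostree should win over dnf" comes from this thread; the remaining numbers, and the names `PRIORITIES` and `pick_package_manager`, are assumptions made up for illustration and do not answer the dnf5/dnf/yum question.

```python
from __future__ import annotations

# Hypothetical priorities: the higher value wins. Only the preference for
# rpm-ostree over dnf is taken from the thread; the rest is illustrative.
PRIORITIES = {
    "rpm-ostree": 100,
    "dnf5": 50,
    "dnf": 40,
    "yum": 30,
}


def pick_package_manager(discovered: list[str]) -> str:
    """Pick the highest-priority manager among all that were discovered."""
    known = [name for name in discovered if name in PRIORITIES]
    if not known:
        raise RuntimeError("no supported package manager discovered")
    return max(known, key=PRIORITIES.__getitem__)


print(pick_package_manager(["dnf5", "rpm-ostree"]))  # rpm-ostree
print(pick_package_manager(["yum", "dnf", "dnf5"]))  # dnf5
```

The point of the design is that every candidate is probed first and the choice is made afterwards, so the result no longer depends on the order in which the probes run.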

happz commented 1 month ago

Attempted a fix in https://github.com/teemtee/tmt/pull/2861. The bug should be reported by the new test which uses a custom CoreOS image with pre-installed dnf5 (https://github.com/teemtee/tmt/pull/2861/files#diff-b934e59dfcc3fe535c15739bb6afa87ed8600bd913005bf866960229edbfd115 + https://github.com/teemtee/tmt/pull/2861/files#diff-73ef46a5e318f561d83a865ab25a3cefe3fce01585c4e49f4b53ae194f84c608R182)

happz commented 1 month ago

Without priorities, the test fails as expected:

DEBUG    tmt:log.py:723 Run command: podman exec 710f6de30e6fada86ca77b5632266d20dee46c78a239bdf1764574ab0940dfc1 /bin/bash -c 'dnf5 --version'
DEBUG    tmt:log.py:723 out: dnf5 version 5.1.17
DEBUG    tmt:log.py:723 out: dnf5 plugin API version 1.0
DEBUG    tmt:log.py:723 out: libdnf5 version 5.1.17
DEBUG    tmt:log.py:723 out: libdnf5 plugin API version 1.0
DEBUG    tmt:log.py:723 Run command: podman exec 710f6de30e6fada86ca77b5632266d20dee46c78a239bdf1764574ab0940dfc1 /bin/bash -c 'yum --version'
DEBUG    tmt:log.py:723 err: /bin/bash: line 1: yum: command not found
DEBUG    tmt:log.py:723 Run command: podman exec 710f6de30e6fada86ca77b5632266d20dee46c78a239bdf1764574ab0940dfc1 /bin/bash -c '/bin/bash -c '"'"'stat /etc/ostree-booted || type rpm-ostree'"'"''
DEBUG    tmt:log.py:723 err: stat: cannot statx '/etc/ostree-booted': No such file or directory
DEBUG    tmt:log.py:723 out: rpm-ostree is /usr/bin/rpm-ostree
DEBUG    tmt:log.py:723 Discovered package managers: dnf5 and rpm-ostree
    def _test_discovery(expected: str) -> None:
        guest.facts.sync(guest)

>       assert guest.facts.package_manager == expected
E       AssertionError: assert 'dnf5' == 'rpm-ostree'
E         - rpm-ostree
E         + dnf5