Are we not getting the kola logs with gangplank?
Yep. They're on the Artifacts tab in the Blue Ocean view.
The most recent rawhide aarch64 run succeeded (`36.20210901.91.0`). Maybe this was a weird issue with the `5.14.0-0.rc6.20210820gitd992fe5318d8.50.fc36` kernel that's now fixed in the non-rc `5.14.0-61.fc36`. Will monitor.
we just got lucky.. issue is back in today's build
@dustymabe I'm not able to reproduce it locally, on both the devel-next and rawhide config branches. Could it be that there is some IO bottleneck on the infra side, the builder/runner? Although I'm on a machine that I would call really IO-bound, as it is running from an old 2.5" laptop spinning HDD. Maybe the builder/runner is "too fast"? Could the test be a bit racy in itself?
@jcajka in my testing I wasn't able to reproduce locally (rpi4) either. I was thinking it had to do with the hardware being faster, but I've executed tests directly against the hardware and it seemed to pass there too.
One thing that is really interesting is that it's consistently failing on rawhide, but consistently not failing on any other stream, so there's some clue in there.
More or less, I think we really need to dig down into the exact command that's failing and add more debugging output in COSA so we can get more information than just `Process exited with status 1`.
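For example (a hypothetical manual check, assuming we can get a shell on the failing machine), capturing the exit status and the daemon's recent journal would tell us more than the bare exit code:

```bash
# Hypothetical debugging commands to run inside the failing VM;
# these only use standard rpm-ostree/systemd tooling.
rpm-ostree status --json
echo "exit status: $?"
# Recent daemon-side context around the failure:
journalctl -b -u rpm-ostreed.service --no-pager | tail -n 30
```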
Hmm, not seeing kola artifacts in the Artifacts tab (https://jenkins-fedora-coreos.apps.ocp.ci.centos.org/blue/organizations/jenkins/multi-arch-pipeline/detail/multi-arch-pipeline/48/artifacts). Do you have another dump available?
Here's one that just happened today: https://jenkins-fedora-coreos.apps.ocp.ci.centos.org/blue/organizations/jenkins/multi-arch-pipeline/detail/multi-arch-pipeline/90/artifacts
Hmm, hard to tell what's going on there. I see the `ostree admin unlock --hotfix` applying cleanly, but there are no obvious errors after that in the journal from rpm-ostree. It looks like there was an SSH attempt, but I'm not seeing any attempt to even run `rpm-ostree status --json`.
Maybe keep running it on the same hardware in a loop for a while with `--ssh-on-test-failure`?
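For instance, a minimal reproduction loop (assuming a cosa workdir with a finished build on the aarch64 box) might look like:

```bash
# Keep re-running the single test until it fails, then stop so the
# failed machine is left around for live inspection.
for i in $(seq 1 50); do
    echo "=== attempt ${i} ==="
    cosa kola run --ssh-on-test-failure ostree.hotfix || break
done
```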
Tried that a week ago, just running cosa directly on the hardware - no success reproducing it.
This has been passing the last few rounds of runs in the rawhide multi-arch-pipeline for aarch64.
Closing.. Will re-open if it pops back up.
Seeing this again in the latest aarch64 rawhide build.
Again not seeing anything from the rpm-ostree daemon. I think the failure here is happening earlier on. Something with SSH and/or PAM on overlayfs perhaps?
Seen in rawhide today:
```
[2021-10-15T15:09:45.905Z] --- FAIL: ostree.hotfix (182.71s)
[2021-10-15T15:09:45.905Z] --- FAIL: ostree.hotfix/unlock (64.71s)
[2021-10-15T15:09:45.905Z] cluster.go:144: error: Timeout was reached
[2021-10-15T15:09:45.905Z] cluster.go:167: "rpm-ostree status --json" failed: output , status Process exited with status 1
```
and.. it's consistent again
The snooze ended on 10/25/2021 and the most recent run of rawhide (`36.20211027.91.0`) succeeded. Will continue to monitor.
This is back in `36.20211030.91.0`:
```
[2021-10-30T06:12:25.059Z] --- FAIL: ostree.hotfix (172.66s)
[2021-10-30T06:12:25.059Z] --- FAIL: ostree.hotfix/unlock (64.93s)
[2021-10-30T06:12:25.059Z] cluster.go:144: error: Timeout was reached
[2021-10-30T06:12:25.059Z] cluster.go:167: "rpm-ostree status --json" failed: output , status Process exited with status 1
```
Happening pretty consistently again.
This is back in `36.20211203.91.0`:
```
[2021-12-03T18:27:36.233Z] --- FAIL: ostree.hotfix (181.40s)
[2021-12-03T18:27:36.233Z] --- FAIL: ostree.hotfix/unlock (64.81s)
[2021-12-03T18:27:36.233Z] cluster.go:154: error: Timeout was reached
[2021-12-03T18:27:36.233Z] cluster.go:177: "rpm-ostree status --json" failed: output , status Process exited with status 1
```
Both the run and the re-run failed.
Now seeing this in `testing-devel`. It happened for `35.20211210.20.0` (link).
The re-run succeeded. Full logs: ostree-hotfix.zip
Now that https://github.com/coreos/coreos-assembler/pull/2652 and https://github.com/coreos/coreos-assembler/pull/2655 are in COSA, we have some new information for this ticket. multi-arch-pipeline#416 failed last night.
In both the original failure and the rerun, retrying the `rpm-ostree status --json` command succeeded:
```
2022-01-21T04:35:22Z kola/tests/util/rpmostree: rpm-ostree status succeeded after retries.
```
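The retry itself is roughly equivalent to a loop like this (a sketch of the behavior only, not the actual COSA code, which lives in Go):

```bash
# Illustrative sketch of the retry added by the COSA PRs above:
# try `rpm-ostree status --json` a few times before declaring failure.
for attempt in 1 2 3; do
    if rpm-ostree status --json > /dev/null; then
        echo "rpm-ostree status succeeded after ${attempt} attempt(s)"
        break
    fi
    sleep 5
done
```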
Here are the logs from the original test (filtered through grep -v audit
to remove audit noise):
```
Jan 21 04:33:41.708447 kola[1891]: === RUN: ostree.hotfix/unlock ===
Jan 21 04:33:41.765636 sshd[1885]: pam_unix(sshd:session): session closed for user core
Jan 21 04:33:41.823378 systemd-logind[974]: Session 7 logged out. Waiting for processes to exit.
Jan 21 04:33:41.849349 systemd[1]: session-7.scope: Deactivated successfully.
Jan 21 04:33:41.869491 systemd-logind[974]: Removed session 7.
Jan 21 04:33:42.305742 sshd[1904]: Accepted publickey for core from 10.0.2.2 port 40158 ssh2: ECDSA SHA256:03DhHHTOAQ1wL9ztWi2usB5fjOHp8iJyPYOFtmlbEBE
Jan 21 04:33:42.391975 systemd-logind[974]: New session 8 of user core.
Jan 21 04:33:42.409660 systemd[1]: Started session-8.scope - Session 8 of User core.
Jan 21 04:33:42.445684 sshd[1904]: pam_unix(sshd:session): session opened for user core(uid=1000) by (uid=0)
Jan 21 04:33:42.693463 kola[1910]: + sudo ostree admin unlock --hotfix
Jan 21 04:33:42.737075 sshd[1904]: pam_unix(sshd:session): session closed for user core
Jan 21 04:33:42.814418 systemd-logind[974]: Session 8 logged out. Waiting for processes to exit.
Jan 21 04:33:42.822801 systemd[1]: session-8.scope: Deactivated successfully.
Jan 21 04:33:42.843894 systemd-logind[974]: Removed session 8.
Jan 21 04:33:43.330889 sshd[1923]: Accepted publickey for core from 10.0.2.2 port 40160 ssh2: ECDSA SHA256:03DhHHTOAQ1wL9ztWi2usB5fjOHp8iJyPYOFtmlbEBE
Jan 21 04:33:43.443514 systemd-logind[974]: New session 9 of user core.
Jan 21 04:33:43.463165 systemd[1]: Started session-9.scope - Session 9 of User core.
Jan 21 04:33:43.500492 sshd[1923]: pam_unix(sshd:session): session opened for user core(uid=1000) by (uid=0)
Jan 21 04:33:43.830061 sudo[1942]: core : PWD=/var/home/core ; USER=root ; COMMAND=/usr/bin/ostree admin unlock --hotfix
Jan 21 04:33:43.855541 sudo[1942]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=1000)
Jan 21 04:33:44.061150 kernel: xfs filesystem being remounted at /sysroot supports timestamps until 2038 (0x7fffffff)
Jan 21 04:33:44.062112 kernel: EXT4-fs (vda3): re-mounted. Quota mode: none.
Jan 21 04:33:44.062465 kernel: ext4 filesystem being remounted at /boot supports timestamps until 2038 (0x7fffffff)
Jan 21 04:34:06.255134 kola[1943]: note: Deploying commit 2b91a3d1d5957eb7d73d5ff81aaf2c461259f7abfe0dff752df75be332041a2e which contains content in /var/lib that will be ignored.
Jan 21 04:34:12.224748 ostree[1944]: Copying /etc changes: 8 modified, 0 removed, 46 added
Jan 21 04:34:12.229208 kola[1943]: Copying /etc changes: 8 modified, 0 removed, 46 added
Jan 21 04:34:17.569918 ostree[1944]: Bootloader updated; bootconfig swap: yes; bootversion: boot.0.1, deployment count change: 1
Jan 21 04:34:17.574203 kola[1943]: Bootloader updated; bootconfig swap: yes; bootversion: boot.0.1, deployment count change: 1
Jan 21 04:34:22.164323 kola[1943]: Hotfix mode enabled. A writable overlayfs is now mounted on /usr
Jan 21 04:34:22.168335 kola[1943]: for this booted deployment. A non-hotfixed clone has been created
Jan 21 04:34:22.168397 kola[1943]: as the non-default rollback target.
Jan 21 04:34:22.266297 sudo[1942]: pam_unix(sudo:session): session closed for user root
Jan 21 04:34:22.349074 kernel: bash (1929) used greatest stack depth: 9552 bytes left
Jan 21 04:34:22.405344 sshd[1923]: pam_unix(sshd:session): session closed for user core
Jan 21 04:34:22.574611 systemd-logind[974]: Session 9 logged out. Waiting for processes to exit.
Jan 21 04:34:22.586670 systemd[1]: session-9.scope: Deactivated successfully.
Jan 21 04:34:22.605056 systemd[1]: session-9.scope: Consumed 33.264s CPU time.
Jan 21 04:34:22.632579 systemd-logind[974]: Removed session 9.
Jan 21 04:34:23.573170 sshd[2044]: Accepted publickey for core from 10.0.2.2 port 40162 ssh2: ECDSA SHA256:03DhHHTOAQ1wL9ztWi2usB5fjOHp8iJyPYOFtmlbEBE
Jan 21 04:34:23.722658 systemd-logind[974]: New session 10 of user core.
Jan 21 04:34:23.752327 systemd[1]: Started session-10.scope - Session 10 of User core.
Jan 21 04:34:23.811491 sshd[2044]: pam_unix(sshd:session): session opened for user core(uid=1000) by (uid=0)
Jan 21 04:34:49.763017 sshd[2044]: pam_unix(sshd:session): session closed for user core
Jan 21 04:34:49.885878 systemd-logind[974]: Session 10 logged out. Waiting for processes to exit.
Jan 21 04:34:49.910386 systemd[1]: session-10.scope: Deactivated successfully.
Jan 21 04:34:49.942332 systemd-logind[974]: Removed session 10.
Jan 21 04:34:50.853785 sshd[3075]: Accepted publickey for core from 10.0.2.2 port 40164 ssh2: ECDSA SHA256:03DhHHTOAQ1wL9ztWi2usB5fjOHp8iJyPYOFtmlbEBE
Jan 21 04:34:50.992853 systemd-logind[974]: New session 11 of user core.
Jan 21 04:34:51.022327 systemd[1]: Started session-11.scope - Session 11 of User core.
Jan 21 04:34:51.081112 sshd[3075]: pam_unix(sshd:session): session opened for user core(uid=1000) by (uid=0)
Jan 21 04:34:56.233541 rpm-ostree[1015]: In idle state; will auto-exit in 60 seconds
Jan 21 04:34:56.307739 systemd[1]: rpm-ostreed.service: Deactivated successfully.
Jan 21 04:34:56.326319 systemd[1]: rpm-ostreed.service: Consumed 47.898s CPU time.
Jan 21 04:34:56.373074 systemd[1]: Starting rpm-ostreed.service - rpm-ostree System Management Daemon...
Jan 21 04:34:56.558207 rpm-ostree[3303]: Reading config file '/etc/rpm-ostreed.conf'
Jan 21 04:35:13.650887 systemd[1]: Started rpm-ostreed.service - rpm-ostree System Management Daemon.
Jan 21 04:35:13.673257 rpm-ostree[3303]: In idle state; will auto-exit in 60 seconds
Jan 21 04:35:13.684670 rpm-ostree[3303]: Allowing active client :1.38 (uid 1000)
Jan 21 04:35:13.696529 rpm-ostree[3303]: client(id:cli dbus:1.38 unit:session-11.scope uid:1000) added; new total=1
Jan 21 04:35:20.597806 chronyd[961]: Selected source 108.61.73.244 (2.fedora.pool.ntp.org)
Jan 21 04:35:22.295608 rpm-ostree[3303]: client(id:cli dbus:1.38 unit:session-11.scope uid:1000) vanished; remaining=0
Jan 21 04:35:22.298274 rpm-ostree[3303]: In idle state; will auto-exit in 61 seconds
Jan 21 04:35:22.361549 sshd[3075]: pam_unix(sshd:session): session closed for user core
Jan 21 04:35:22.473488 systemd-logind[974]: Session 11 logged out. Waiting for processes to exit.
Jan 21 04:35:22.502364 systemd[1]: session-11.scope: Deactivated successfully.
Jan 21 04:35:22.528332 systemd-logind[974]: Removed session 11.
Jan 21 04:35:23.001647 sshd[4297]: Accepted publickey for core from 10.0.2.2 port 40166 ssh2: ECDSA SHA256:03DhHHTOAQ1wL9ztWi2usB5fjOHp8iJyPYOFtmlbEBE
Jan 21 04:35:23.087857 systemd-logind[974]: New session 12 of user core.
Jan 21 04:35:23.106237 systemd[1]: Started session-12.scope - Session 12 of User core.
Jan 21 04:35:23.141178 sshd[4297]: pam_unix(sshd:session): session opened for user core(uid=1000) by (uid=0)
Jan 21 04:35:23.400454 kola[4303]: === DONE: ostree.hotfix/unlock ===
```
It looks as if the failed `rpm-ostree status --json` call was associated with:
```
Jan 21 04:34:23.752327 systemd[1]: Started session-10.scope - Session 10 of User core.
Jan 21 04:34:23.811491 sshd[2044]: pam_unix(sshd:session): session opened for user core(uid=1000) by (uid=0)
Jan 21 04:34:49.763017 sshd[2044]: pam_unix(sshd:session): session closed for user core
Jan 21 04:34:49.885878 systemd-logind[974]: Session 10 logged out. Waiting for processes to exit.
Jan 21 04:34:49.910386 systemd[1]: session-10.scope: Deactivated successfully.
Jan 21 04:34:49.942332 systemd-logind[974]: Removed session 10.
```
So it seems to have timed out after 26 seconds.. Why?
In the successful call we see journal output from rpm-ostree:
```
Jan 21 04:34:50.992853 systemd-logind[974]: New session 11 of user core.
Jan 21 04:34:51.022327 systemd[1]: Started session-11.scope - Session 11 of User core.
Jan 21 04:34:51.081112 sshd[3075]: pam_unix(sshd:session): session opened for user core(uid=1000) by (uid=0)
Jan 21 04:34:56.233541 rpm-ostree[1015]: In idle state; will auto-exit in 60 seconds
Jan 21 04:34:56.307739 systemd[1]: rpm-ostreed.service: Deactivated successfully.
Jan 21 04:34:56.326319 systemd[1]: rpm-ostreed.service: Consumed 47.898s CPU time.
Jan 21 04:34:56.373074 systemd[1]: Starting rpm-ostreed.service - rpm-ostree System Management Daemon...
Jan 21 04:34:56.558207 rpm-ostree[3303]: Reading config file '/etc/rpm-ostreed.conf'
Jan 21 04:35:13.650887 systemd[1]: Started rpm-ostreed.service - rpm-ostree System Management Daemon.
Jan 21 04:35:13.673257 rpm-ostree[3303]: In idle state; will auto-exit in 60 seconds
Jan 21 04:35:13.684670 rpm-ostree[3303]: Allowing active client :1.38 (uid 1000)
Jan 21 04:35:13.696529 rpm-ostree[3303]: client(id:cli dbus:1.38 unit:session-11.scope uid:1000) added; new total=1
Jan 21 04:35:20.597806 chronyd[961]: Selected source 108.61.73.244 (2.fedora.pool.ntp.org)
Jan 21 04:35:22.295608 rpm-ostree[3303]: client(id:cli dbus:1.38 unit:session-11.scope uid:1000) vanished; remaining=0
Jan 21 04:35:22.298274 rpm-ostree[3303]: In idle state; will auto-exit in 61 seconds
Jan 21 04:35:22.361549 sshd[3075]: pam_unix(sshd:session): session closed for user core
Jan 21 04:35:22.473488 systemd-logind[974]: Session 11 logged out. Waiting for processes to exit.
Jan 21 04:35:22.502364 systemd[1]: session-11.scope: Deactivated successfully.
Jan 21 04:35:22.528332 systemd-logind[974]: Removed session 11.
```
What state is rpm-ostree in that it doesn't respond to that first call?
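If we ever catch this live (e.g. with `--ssh-on-test-failure`), checking the daemon and its D-Bus name ownership might answer that; a hypothetical inspection:

```bash
# Hypothetical live inspection of the daemon during the hang:
systemctl status rpm-ostreed.service --no-pager
# Show who (if anyone) currently owns the rpm-ostree D-Bus name:
busctl status org.projectatomic.rpmostree1
# A pending activation would also show up in the name list:
busctl list | grep rpmostree
```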
Full logs: ostree-hotfix.zip
We don't have https://github.com/coreos/coreos-assembler/pull/2661 merged yet, but a run just failed that has a little more output I can look at:
```
Jan 21 22:36:27.605168 kola[1890]: === RUN: ostree.hotfix/unlock ===
Jan 21 22:36:27.655124 sshd[1884]: pam_unix(sshd:session): session closed for user core
Jan 21 22:36:27.731160 systemd-logind[974]: Session 7 logged out. Waiting for processes to exit.
Jan 21 22:36:27.756510 systemd[1]: session-7.scope: Deactivated successfully.
Jan 21 22:36:27.781563 systemd-logind[974]: Removed session 7.
Jan 21 22:36:28.247114 sshd[1903]: Accepted publickey for core from 10.0.2.2 port 41506 ssh2: ECDSA SHA256:xaTNcVEQ+eK0I9x68hXvZF8887KE5ZeGslExDxKBboU
Jan 21 22:36:28.334723 systemd-logind[974]: New session 8 of user core.
Jan 21 22:36:28.353964 systemd[1]: Started session-8.scope - Session 8 of User core.
Jan 21 22:36:28.391656 sshd[1903]: pam_unix(sshd:session): session opened for user core(uid=1000) by (uid=0)
Jan 21 22:36:28.650733 kola[1909]: + sudo ostree admin unlock --hotfix
Jan 21 22:36:28.712932 sshd[1903]: pam_unix(sshd:session): session closed for user core
Jan 21 22:36:28.781244 systemd-logind[974]: Session 8 logged out. Waiting for processes to exit.
Jan 21 22:36:28.791096 systemd[1]: session-8.scope: Deactivated successfully.
Jan 21 22:36:28.824004 systemd-logind[974]: Removed session 8.
Jan 21 22:36:28.945760 kernel: ssh-key-dir (1924) used greatest stack depth: 9184 bytes left
Jan 21 22:36:29.251114 sshd[1922]: Accepted publickey for core from 10.0.2.2 port 41508 ssh2: ECDSA SHA256:xaTNcVEQ+eK0I9x68hXvZF8887KE5ZeGslExDxKBboU
Jan 21 22:36:29.338077 systemd-logind[974]: New session 9 of user core.
Jan 21 22:36:29.355451 systemd[1]: Started session-9.scope - Session 9 of User core.
Jan 21 22:36:29.392239 sshd[1922]: pam_unix(sshd:session): session opened for user core(uid=1000) by (uid=0)
Jan 21 22:36:29.734106 sudo[1941]: core : PWD=/var/home/core ; USER=root ; COMMAND=/usr/bin/ostree admin unlock --hotfix
Jan 21 22:36:29.761387 sudo[1941]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=1000)
Jan 21 22:36:29.975664 kernel: xfs filesystem being remounted at /sysroot supports timestamps until 2038 (0x7fffffff)
Jan 21 22:36:29.976622 kernel: EXT4-fs (vda3): re-mounted. Quota mode: none.
Jan 21 22:36:29.976992 kernel: ext4 filesystem being remounted at /boot supports timestamps until 2038 (0x7fffffff)
Jan 21 22:36:56.457838 kola[1942]: note: Deploying commit a7bb16afc553e34c881ad353407f29a4546123b12bc3fcf778383b315002650e which contains content in /var/lib that will be ignored.
Jan 21 22:37:03.940746 ostree[1943]: Copying /etc changes: 8 modified, 0 removed, 46 added
Jan 21 22:37:03.945964 kola[1942]: Copying /etc changes: 8 modified, 0 removed, 46 added
Jan 21 22:37:09.712044 ostree[1943]: Bootloader updated; bootconfig swap: yes; bootversion: boot.0.1, deployment count change: 1
Jan 21 22:37:09.716226 kola[1942]: Bootloader updated; bootconfig swap: yes; bootversion: boot.0.1, deployment count change: 1
Jan 21 22:37:14.375743 kola[1942]: Hotfix mode enabled. A writable overlayfs is now mounted on /usr
Jan 21 22:37:14.378736 kola[1942]: for this booted deployment. A non-hotfixed clone has been created
Jan 21 22:37:14.378799 kola[1942]: as the non-default rollback target.
Jan 21 22:37:14.473295 sudo[1941]: pam_unix(sudo:session): session closed for user root
Jan 21 22:37:14.639880 sshd[1922]: pam_unix(sshd:session): session closed for user core
Jan 21 22:37:14.791279 systemd-logind[974]: Session 9 logged out. Waiting for processes to exit.
Jan 21 22:37:14.798954 systemd[1]: session-9.scope: Deactivated successfully.
Jan 21 22:37:14.818153 systemd[1]: session-9.scope: Consumed 36.988s CPU time.
Jan 21 22:37:14.842131 systemd-logind[974]: Removed session 9.
Jan 21 22:37:15.841926 sshd[2043]: Accepted publickey for core from 10.0.2.2 port 41510 ssh2: ECDSA SHA256:xaTNcVEQ+eK0I9x68hXvZF8887KE5ZeGslExDxKBboU
Jan 21 22:37:16.010194 systemd-logind[974]: New session 10 of user core.
Jan 21 22:37:16.042049 systemd[1]: Started session-10.scope - Session 10 of User core.
Jan 21 22:37:16.102028 sshd[2043]: pam_unix(sshd:session): session opened for user core(uid=1000) by (uid=0)
Jan 21 22:37:42.094550 sshd[2043]: pam_unix(sshd:session): session closed for user core
Jan 21 22:37:42.254468 systemd-logind[974]: Session 10 logged out. Waiting for processes to exit.
Jan 21 22:37:42.270454 systemd[1]: session-10.scope: Deactivated successfully.
Jan 21 22:37:42.299171 systemd-logind[974]: Removed session 10.
Jan 21 22:37:43.307868 sshd[3044]: Accepted publickey for core from 10.0.2.2 port 41514 ssh2: ECDSA SHA256:xaTNcVEQ+eK0I9x68hXvZF8887KE5ZeGslExDxKBboU
Jan 21 22:37:43.472868 systemd-logind[974]: New session 11 of user core.
Jan 21 22:37:43.500979 systemd[1]: Started session-11.scope - Session 11 of User core.
Jan 21 22:37:43.568478 sshd[3044]: pam_unix(sshd:session): session opened for user core(uid=1000) by (uid=0)
Jan 21 22:37:49.539853 rpm-ostree[1014]: In idle state; will auto-exit in 63 seconds
Jan 21 22:37:49.572680 rpm-ostree[1014]: Failed to GetConnectionUnixUser for client :1.35: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: The connection does not exist
Jan 21 22:37:49.623733 systemd[1]: rpm-ostreed.service: Deactivated successfully.
Jan 21 22:37:49.643976 systemd[1]: rpm-ostreed.service: Consumed 49.296s CPU time.
Jan 21 22:37:49.698138 systemd[1]: Starting rpm-ostreed.service - rpm-ostree System Management Daemon...
Jan 21 22:37:49.893018 rpm-ostree[3302]: Reading config file '/etc/rpm-ostreed.conf'
Jan 21 22:38:07.427577 systemd[1]: Started rpm-ostreed.service - rpm-ostree System Management Daemon.
Jan 21 22:38:07.448470 rpm-ostree[3302]: In idle state; will auto-exit in 62 seconds
Jan 21 22:38:07.459717 rpm-ostree[3302]: Allowing active client :1.38 (uid 1000)
Jan 21 22:38:07.470568 rpm-ostree[3302]: client(id:cli dbus:1.38 unit:session-11.scope uid:1000) added; new total=1
Jan 21 22:38:16.544469 rpm-ostree[3302]: client(id:cli dbus:1.38 unit:session-11.scope uid:1000) vanished; remaining=0
Jan 21 22:38:16.550622 rpm-ostree[3302]: In idle state; will auto-exit in 64 seconds
Jan 21 22:38:16.671433 sshd[3044]: pam_unix(sshd:session): session closed for user core
Jan 21 22:38:16.750756 systemd-logind[974]: Session 11 logged out. Waiting for processes to exit.
Jan 21 22:38:16.814048 systemd[1]: session-11.scope: Deactivated successfully.
Jan 21 22:38:16.835026 systemd-logind[974]: Removed session 11.
Jan 21 22:38:17.364170 sshd[4296]: Accepted publickey for core from 10.0.2.2 port 41516 ssh2: ECDSA SHA256:xaTNcVEQ+eK0I9x68hXvZF8887KE5ZeGslExDxKBboU
Jan 21 22:38:17.461526 systemd-logind[974]: New session 12 of user core.
Jan 21 22:38:17.482461 systemd[1]: Started session-12.scope - Session 12 of User core.
Jan 21 22:38:17.524275 sshd[4296]: pam_unix(sshd:session): session opened for user core(uid=1000) by (uid=0)
Jan 21 22:38:17.806742 kola[4302]: === DONE: ostree.hotfix/unlock ===
```
This is the new interesting part:
```
Jan 21 22:37:49.572680 rpm-ostree[1014]: Failed to GetConnectionUnixUser for client :1.35: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: The connection does not exist
```
But IIUC that is actually the successful ssh connection. So maybe not relevant?
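If we wanted to confirm that, watching the daemon's bus traffic during a run would show which caller connection names like `:1.35` actually belong to (hypothetical, needs a live machine):

```bash
# Hypothetical: capture D-Bus traffic to/from rpm-ostreed while the
# test runs, to match connection names like :1.35 to their callers.
sudo busctl monitor org.projectatomic.rpmostree1
```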
Now with more debugging (from https://github.com/coreos/coreos-assembler/pull/2661), multi-arch-pipeline#13 failed.
Here are the bits between the blocks, with `grep -v audit`:
Full log: journal.txt
I think https://github.com/coreos/rpm-ostree/pull/3406 will fix this.
The snooze for `ostree.hotfix` was extended in https://github.com/coreos/fedora-coreos-config/pull/1543.
The fix for this went into `testing` stream release `35.20220313.2.0`. Please try out the new release and report issues.
The fix for this went into `stable` stream release `35.20220313.3.1`.
The `ostree.hotfix` test is fairly consistently failing in our rawhide (20210827 content set) aarch64 pipeline. Example run: (link); output:
After some investigation, and comparing the journal from runs that succeed and ones that don't, it seems it may be related to rpm-ostree auto-exiting.
In the success case:

- `sudo ostree admin unlock --hotfix`
- `rpm-ostree status` -> success

In the failure case:

- `sudo ostree admin unlock --hotfix`
- `rpm-ostree status` -> failure

IOW, in the failure case rpm-ostreed is still running when `rpm-ostree status` is called.
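If the auto-exit timing is the trigger, a hypothetical way to widen the race window locally (assuming a disposable test VM) would be to aim the status call right at the daemon's idle auto-exit boundary:

```bash
# Hypothetical race-window probe on a throwaway VM: the daemon
# auto-exits after ~60s idle, so land the status call near that edge.
sudo ostree admin unlock --hotfix
sleep 58
rpm-ostree status --json
```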