coreos / fedora-coreos-tracker

Issue tracker for Fedora CoreOS
https://fedoraproject.org/coreos/

No serial console login after update to 40.20240616.3.0 #1758

Closed mhymny closed 2 months ago

mhymny commented 3 months ago

Describe the bug

Since the latest update to 40.20240616.3.0, nothing is written to the serial console once the getty systemd service starts, making it impossible to log in. The boot process and initial kernel output, however, are displayed correctly.

Rolling back to 40.20240602.3.0 fixed the issue; I have turned off automatic updates for now.
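
For readers who want to do the same, here is a minimal sketch of one way to roll back and hold updates. It is not taken from the report: it assumes the Zincati drop-in location described in the Fedora CoreOS auto-update documentation, and the function name `disable_auto_updates` is made up for illustration.

```shell
#!/bin/sh
# Sketch (assumption): write a Zincati drop-in that disables automatic updates.
# The target directory is a parameter so the function can be tried outside /etc.
disable_auto_updates() {
    conf_dir=${1:-/etc/zincati/config.d}
    mkdir -p "$conf_dir" &&
    printf '[updates]\nenabled = false\n' > "$conf_dir/90-disable-auto-updates.toml"
}

# Possible sequence on an affected machine (as root, over SSH since serial login is broken):
#   rpm-ostree rollback -r            # reboot into the previous deployment
#   disable_auto_updates              # run after the reboot; /etc is per deployment
#   systemctl restart zincati.service # make Zincati pick up the drop-in
```

Writing the drop-in after the rollback matters because each deployment has its own copy of `/etc`.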

Reproduction steps

I haven't had time yet to test this behavior, but will do as soon as possible:

I don't know whether the problem persists with a fresh installation of 40.20240616.3.0.

Expected behavior

Login prompt is printed onto serial console

Actual behavior

System stops printing onto serial console

System details

Butane or Ignition config

No response

Additional information

This happened to all of my virtual machines using fcos.

jlebon commented 3 months ago

I can reproduce this. It doesn't even need to be an updating system. Just booting 40.20240616.3.0 directly exhibits the bug.

I actually hit this locally recently with a rawhide build that I had, but thought it was related to something I was testing.

We clearly need a test to verify that serial login works. Serial login normally gets exercised by developers hacking on FCOS, but we often use SSH instead.
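
As a rough illustration of what such a check could look like (not the test that was actually added, if any), the sketch below scans a captured serial console log for a getty prompt of the form `<hostname> login:`. The function name and the idea of saving the console output to a file first are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a captured serial console log contains a
# getty login prompt such as "cosa-devsh login:".
# Assumes the console output was already saved to a file, e.g. by the hypervisor.
serial_log_has_login_prompt() {
    log_file=$1
    # The pattern is anchored at the start of the line only, so trailing text
    # (autologin notes, carriage returns from the serial capture) is tolerated.
    grep -Eq '^[[:alnum:]_.-]+ login:' "$log_file"
}
```

A test harness could boot the image, wait for `multi-user.target`, and fail if `serial_log_has_login_prompt console.log` returns non-zero.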

jlebon commented 3 months ago

Filed https://bugzilla.redhat.com/show_bug.cgi?id=2296652 (edit: closed it as dupe of https://bugzilla.redhat.com/show_bug.cgi?id=2290482).

Workarounds mentioned there:

This can be worked around with enforcing=0 or reverting to selinux-policy-40.20-1.fc40.noarch using e.g.:

rpm-ostree override replace https://bodhi.fedoraproject.org/updates/FEDORA-2024-8c0636295a

Let's pin to the older version for now: https://github.com/coreos/fedora-coreos-config/pull/3056

Should also discuss if we want to fast-track this pin so it's part of the next stable. Added meeting label.

travier commented 3 months ago

~Maybe we can also do https://bugzilla.redhat.com/show_bug.cgi?id=2290482#c39~ Probably not as this hardcodes the name of the serial/console device.

travier commented 3 months ago

Another potential workaround (untested), if you know the tty device name, is to pass the following kernel argument to force the unit to start:

systemd.wants=serial-getty@ttyS0.service
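
A small sketch of how that argument could be built for a given device and made persistent. This is as untested as the workaround itself; the helper name `serial_getty_karg` is invented here, and using `rpm-ostree kargs --append` to persist the argument is an assumption about how one would apply it.

```shell
#!/bin/sh
# Hypothetical helper: build the kernel argument for a given serial device name.
serial_getty_karg() {
    tty_name=${1:?usage: serial_getty_karg <tty-device-name>}
    printf 'systemd.wants=serial-getty@%s.service\n' "$tty_name"
}

# Possible use over SSH on an affected machine:
#   sudo rpm-ostree kargs --append="$(serial_getty_karg ttyS0)"
#   sudo systemctl reboot
```

On aarch64 guests the device is often `ttyAMA0` rather than `ttyS0`, so the name has to match the console actually in use.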

c4rt0 commented 3 months ago

This was discussed at the Fedora meeting yesterday. The summary:

We will fast-track the selinux-policy rollback to stable and write up documentation for affected users about the missing serial console login.

jbtrystram commented 3 months ago

Not sure, but this may be related: while investigating a related problem, I saw the following in the journal logs:

Jul 12 21:03:31.093196 serial-getty@ttyAMA0.service[1712]: failed to open credentials directory
Jul 12 21:03:31.094510 getty@tty1.service[1711]: failed to open credentials directory

mhymny commented 3 months ago

Stable, Testing and Next have been released with the fix.

marmijo commented 3 months ago

The fix for this went into the following releases:

Please try out the releases and report any issues.

jlebon commented 2 months ago

We "fixed" it by pinning to an older selinux-policy, but we still need to unpin eventually.

Based on https://bugzilla.redhat.com/show_bug.cgi?id=2290482#c72, https://bodhi.fedoraproject.org/updates/FEDORA-2024-995d585c91 claims to fix this issue. Can someone do a build with that package and verify that it's indeed fixed? And if so, open a PR to revert https://github.com/coreos/fedora-coreos-config/pull/3056.

c4rt0 commented 2 months ago

I can confirm that with the latest selinux-policy this issue is resolved:

[  OK  ] Finished systemd-user-sessions.service - Permit User Sessions.
[  OK  ] Started getty@tty1.service - Getty on tty1.
[  OK  ] Started serial-getty@ttyS0.service - Serial Getty on ttyS0.
[  OK  ] Reached target getty.target - Login Prompts.
[  OK  ] Reached target multi-user.target - Multi-User System.
         Starting systemd-update-utmp-runle…- Record Runlevel Change in UTMP...
         Starting zincati.service - Zincati Update Agent...
[  OK  ] Finished systemd-update-utmp-runle…e - Record Runlevel Change in UTMP.
[  OK  ] Started zincati.service - Zincati Update Agent.

Fedora CoreOS 40.20240810.dev.0
Kernel 6.9.12-200.fc40.x86_64 on an x86_64 (ttyS0)

SSH host key: SHA256:6gXf5O5OaxAmc0mTQeYnupRXND3eMXnfCaicjR+4YzM (ED25519)
SSH host key: SHA256:COEp2E9u5l9qfVhw+iRmsus20AjLYFAu3NjYyu2QvAc (ECDSA)
SSH host key: SHA256:fvUq3d1RzV3XHrleEXIy0KjYwSdgQDoxGq8OK34LrOg (RSA)
ens4: 10.0.2.15 fe80::91e7:e094:96e7:dd5
Ignition: ran on 2024/08/10 13:49:52 UTC (this boot)
Ignition: user-provided config was applied
No SSH authorized keys provided by Ignition or Afterburn
cosa-devsh login: core (automatic login)

Fedora CoreOS 40.20240810.dev.0
[core@cosa-devsh ~]$ rpm -qi selinux-policy
Name        : selinux-policy
Version     : 40.27
Release     : 1.fc40
Architecture: noarch
Install Date: Sat Aug 10 13:41:06 2024
Group       : Unspecified
Size        : 29316
License     : GPL-2.0-or-later
Signature   : (none)
Source RPM  : selinux-policy-40.27-1.fc40.src.rpm
Build Date  : Wed Aug  7 10:17:39 2024
Build Host  : buildvm-s390x-06.s390.fedoraproject.org
Packager    : Fedora Project
Vendor      : Fedora Project
URL         : https://github.com/fedora-selinux/selinux-policy
Bug URL     : https://bugz.fedoraproject.org/selinux-policy
Summary     : SELinux policy configuration
Description :
SELinux core policy package.
Originally based off of reference policy,
the policy has been adjusted to provide support for Fedora.
[core@cosa-devsh ~]$

jlebon commented 2 months ago

Unpin and fast-track in https://github.com/coreos/fedora-coreos-config/pull/3080 and https://github.com/coreos/fedora-coreos-config/pull/3082.