Open NeonDaniel opened 1 year ago
Observed permissions of /dev/dri and /dev/render* as a possible cause of GUI errors. Reminded of this on Matrix.
In a recent failure, the devices look normal:
(venv) neon@neon:~$ ll /dev/dr*
total 0
drwxr-xr-x 3 root root 120 Jan 26 13:48 ./
drwxr-xr-x 17 root root 3980 Jan 26 13:48 ../
drwxr-xr-x 2 root root 100 Jan 26 13:48 by-path/
crw-rw---- 1 root video 226, 0 Jan 26 13:48 card0
crw-rw---- 1 root video 226, 1 Jan 26 13:48 card1
crw-rw---- 1 root render 226, 128 Jan 26 13:48 renderD128
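The listing shows the nodes owned by root with group video or render, so whether the GUI can open them depends on the service user's group membership. A minimal sketch for checking that from the affected account follows; `check_dri_access` is a hypothetical helper written for this note, not something shipped with Neon or OVOS.

```shell
# Hypothetical helper: report whether the current user can open each
# DRM device node for reading and writing.
check_dri_access() {
    for node in "$@"; do
        if [ ! -e "$node" ]; then
            echo "missing $node"
        elif [ -r "$node" ] && [ -w "$node" ]; then
            echo "ok $node"
        else
            echo "denied $node"
        fi
    done
}

# Usage on the device (node names taken from the listing above):
#   check_dri_access /dev/dri/card0 /dev/dri/card1 /dev/dri/renderD128
#   id -nG    # the groups video and render should appear for the GUI user
```

Running this as the same user that runs gui-shell would be the meaningful test; root passes these checks regardless of the node's mode.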
Example of gui-shell logs in a failure case. The leading and ending errors are seen every time the GUI fails to launch when the screen is properly initialized with /dev/dri populated.
Jan 26 13:48:44 neon ovos-shell[554]: Failed to move cursor on screen DSI1: -13
Jan 26 13:48:44 neon ovos-shell[554]: Failed to move cursor on screen DSI1: -13
Jan 26 13:48:46 neon ovos-shell[554]: kf.kirigami: The style does not provide a C++ Units implementation. QML Units implementations are no longer suppor>
Jan 26 13:48:46 neon ovos-shell[554]: qrc:/main.qml:334:9: QML FastBlur: Cannot anchor to an item that isn't a parent or sibling.
Jan 26 13:48:46 neon ovos-shell[554]: QMetaProperty::notifySignal: cannot find the NOTIFY signal usePTTClient in class GlobalSettings for property 'useP>
Jan 26 13:48:46 neon ovos-shell[554]: mycroft connection not open!
Jan 26 13:48:46 neon ovos-shell[554]: mycroft connection not open!
Jan 26 13:48:46 neon ovos-shell[554]: qrc:/panel/quicksettings/MuteDelegate.qml:55:5: QML Connections: Implicitly defined onFoo properties in Connection>
Jan 26 13:48:46 neon ovos-shell[554]: qrc:/panel/quicksettings/VolumeSlider.qml:61:5: QML Connections: Implicitly defined onFoo properties in Connection>
Jan 26 13:48:46 neon ovos-shell[554]: qrc:/panel/quicksettings/BrightnessSlider.qml:37:5: QML Connections: Implicitly defined onFoo properties in Connec>
Jan 26 13:48:46 neon ovos-shell[554]: qrc:/panel/SlidingPanel.qml:47:9: QML Connections: Implicitly defined onFoo properties in Connections are deprecat>
Jan 26 13:48:46 neon ovos-shell[554]: qrc:/NotificationsSystem.qml:45:5: QML Connections: Implicitly defined onFoo properties in Connections are depreca>
Jan 26 13:48:46 neon ovos-shell[554]: qrc:/NotificationsSystem.qml:22:5: QML Connections: Implicitly defined onFoo properties in Connections are depreca>
Jan 26 13:48:46 neon ovos-shell[554]: qrc:/ListenerAnimation.qml:18:5: QML Connections: Implicitly defined onFoo properties in Connections are deprecate>
Jan 26 13:48:46 neon ovos-shell[554]: qrc:/qml/SkillView.qml:63:5: QML Connections: Implicitly defined onFoo properties in Connections are deprecated. U>
Jan 26 13:48:46 neon ovos-shell[554]: qrc:/osd/VolumeOSD.qml:40:5: QML Connections: Implicitly defined onFoo properties in Connections are deprecated. U>
Jan 26 13:48:47 neon ovos-shell[554]: qrc:/StatusIndicator.qml:165:5: QML Connections: Implicitly defined onFoo properties in Connections are deprecated>
Jan 26 13:48:47 neon ovos-shell[554]: qrc:/ServiceWatcher.qml:35:5: QML Connections: Implicitly defined onFoo properties in Connections are deprecated. >
Jan 26 13:48:47 neon ovos-shell[554]: error activating kdeconnectd: QDBusError("org.freedesktop.DBus.Error.Disconnected", "Not connected to D-Bus server>
Jan 26 13:48:47 neon ovos-shell[554]: error activating kdeconnectd: QDBusError("org.freedesktop.DBus.Error.Disconnected", "Not connected to D-Bus server>
Jan 26 13:48:47 neon ovos-shell[554]: kdeconnect.interfaces: dbus interface not valid
Jan 26 13:48:47 neon ovos-shell[554]: file:///usr/lib/aarch64-linux-gnu/qt5/qml/QMLTermWidget/QMLTermScrollbar.qml:29:5: QML Connections: Implicitly def>
Jan 26 13:48:47 neon ovos-shell[554]: qrc:/main.qml:88:5: QML Connections: Implicitly defined onFoo properties in Connections are deprecated. Use this s>
Jan 26 13:48:47 neon ovos-shell[554]: qrc:/main.qml:72:5: QML Connections: Implicitly defined onFoo properties in Connections are deprecated. Use this s>
Jan 26 13:48:47 neon ovos-shell[554]: qrc:/main.qml:62:5: QML Connections: Implicitly defined onFoo properties in Connections are deprecated. Use this s>
Jan 26 13:48:47 neon ovos-shell[554]: Failed to commit atomic request (code=-13)
Another different error:
Mar 20 11:45:40 neon systemd[1]: Started gui-shell.service - Neon GUI.
Mar 20 11:45:41 neon ovos-shell[4137]: drmModeGetResources failed (Operation not supported)
Mar 20 11:45:41 neon ovos-shell[4137]: no screens available, assuming 24-bit color
Mar 20 11:45:41 neon ovos-shell[4137]: Cannot create window: no screens available
Mar 20 11:45:41 neon systemd[1]: gui-shell.service: Main process exited, code=killed, status=6/ABRT
Mar 20 11:45:41 neon systemd[1]: gui-shell.service: Failed with result 'signal'.
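The two logs above look like distinct failure modes: code -13 corresponds to -EACCES (permission denied) on Linux, while the second log fails earlier in drmModeGetResources. A rough sketch for tagging a boot's journal with whichever signature it contains is below; `classify_gui_failure` and its output labels are made up for illustration.

```shell
# Hypothetical helper: read journal text on stdin and print which of the
# two failure signatures from this issue appear in it.
classify_gui_failure() {
    awk '
        /Failed to commit atomic request \(code=-13\)/ { perm = 1 }
        /drmModeGetResources failed/                   { noscreen = 1 }
        END {
            if (perm)     print "atomic-commit-denied"
            if (noscreen) print "no-drm-resources"
            if (!perm && !noscreen) print "no-known-signature"
        }
    '
}

# Usage on the device, for the current boot:
#   journalctl -u gui-shell.service -b --no-pager | classify_gui_failure
```

Tagging each failed boot this way may make it easier to tell whether a given change affects one failure mode, both, or neither.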
When the GUI service fails to launch, it appears that tty sessions also fail. Ctrl+Alt+F2 appears to change to a new session but there is no prompt (completely black screen). Ctrl+Alt+F1 resumes an active static screen.
Looking at systemd logs, typed input is being handled, just not rendered on-screen.
initramfs.log
in either case dmesg or raspinfo outputs
With debug set in cmdline.txt, the following are present in a WORKING boot dmesg output but not a broken one:
[ 19.166018] (udev-worker)[282]: drm: Processing device (SEQNUM=1759, ACTION=add)
[ 19.173827] (udev-worker)[281]: 8250: Processing device (SEQNUM=1760, ACTION=add)
This was also present in a subsequent broken boot
This may be related to scripts/init-bottom/udev in the initramfs.
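Comparing a working and a broken dmesg by eye is error-prone because timestamps, worker PIDs and SEQNUM values differ on every boot. The sketch below normalizes those fields before diffing; `normalize_dmesg` and `only_in_working` are hypothetical helper names, and the patterns are based only on the two sample lines quoted above.

```shell
# Hypothetical helper: strip the leading "[ 19.166018] " timestamp, the
# "[282]" worker PID and the SEQNUM value so two boots compare as text.
normalize_dmesg() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] *//; s/\[[0-9]+\]:/:/; s/SEQNUM=[0-9]+/SEQNUM=N/'
}

# Print normalized lines that appear only in the first (working) log.
only_in_working() {
    working=$(mktemp)
    broken=$(mktemp)
    normalize_dmesg < "$1" | sort -u > "$working"
    normalize_dmesg < "$2" | sort -u > "$broken"
    comm -23 "$working" "$broken"
    rm -f "$working" "$broken"
}

# Usage, assuming both logs were saved to files beforehand:
#   only_in_working dmesg-working.log dmesg-broken.log
```

Because the lines are sorted and de-duplicated, this shows which messages are missing but not where in the boot sequence they would have occurred.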
Working udev has additional:
S: disk/by-path/platform-fd500000.pcie-pci-0000:01:00.0-usb-0:1:1.0-scsi-0:0:0:0-part1
and working initramfs has additional:
brcm-pcie fd500000.pcie: clkreq control enabled
Broken udev has additional:
│ │ ├─gpio/gpio22
│ │ │ ┆ P: /devices/platform/soc/fe200000.gpio/gpiochip0/gpio/gpio22
│ │ │ ┆ M: gpio22
│ │ │ ┆ R: 22
│ │ │ ┆ U: gpio
│ │ │ ┆ E: DEVPATH=/devices/platform/soc/fe200000.gpio/gpiochip0/gpio/gpio22
│ │ │ ┆ E: SUBSYSTEM=gpio
│ │ ├─gpio/gpio23
│ │ │ ┆ P: /devices/platform/soc/fe200000.gpio/gpiochip0/gpio/gpio23
│ │ │ ┆ M: gpio23
│ │ │ ┆ R: 23
│ │ │ ┆ U: gpio
│ │ │ ┆ E: DEVPATH=/devices/platform/soc/fe200000.gpio/gpiochip0/gpio/gpio23
│ │ │ ┆ E: SUBSYSTEM=gpio
│ │ ├─gpio/gpio24
│ │ │ ┆ P: /devices/platform/soc/fe200000.gpio/gpiochip0/gpio/gpio24
│ │ │ ┆ M: gpio24
│ │ │ ┆ R: 24
│ │ │ ┆ U: gpio
│ │ │ ┆ E: DEVPATH=/devices/platform/soc/fe200000.gpio/gpiochip0/gpio/gpio24
│ │ │ ┆ E: SUBSYSTEM=gpio
│ │ └─gpio/gpio25
│ │ ┆ P: /devices/platform/soc/fe200000.gpio/gpiochip0/gpio/gpio25
│ │ ┆ M: gpio25
│ │ ┆ R: 25
│ │ ┆ U: gpio
│ │ ┆ E: DEVPATH=/devices/platform/soc/fe200000.gpio/gpiochip0/gpio/gpio25
│ │ ┆ E: SUBSYSTEM=gpio
Another different error:
Mar 20 11:45:40 neon systemd[1]: Started gui-shell.service - Neon GUI.
Mar 20 11:45:41 neon ovos-shell[4137]: drmModeGetResources failed (Operation not supported)
Mar 20 11:45:41 neon ovos-shell[4137]: no screens available, assuming 24-bit color
Mar 20 11:45:41 neon ovos-shell[4137]: Cannot create window: no screens available
Mar 20 11:45:41 neon systemd[1]: gui-shell.service: Main process exited, code=killed, status=6/ABRT
Mar 20 11:45:41 neon systemd[1]: gui-shell.service: Failed with result 'signal'.
This one appears to be because /dev/dri/card0 is not deterministic and will sometimes link platform-gpu-card instead of platform-fec00000.v3d.card. Will remove QT_QPA_EGLFS_KMS_CONFIG to allow auto-detection.
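To confirm on a given boot which card node each by-path link resolved to, something like the following could be run before the GUI starts. This is a sketch only; `list_dri_by_path` is a hypothetical helper, and it assumes `readlink -f` is available (as in GNU coreutils).

```shell
# Hypothetical helper: print "link-name -> resolved node" for every
# symlink in a by-path directory (defaults to /dev/dri/by-path).
list_dri_by_path() {
    dir=${1:-/dev/dri/by-path}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "$(basename "$link")" "$(readlink -f "$link")"
    done
}

# Usage on the device:
#   list_dri_by_path
```

Saving this output for both working and failed boots would show whether the card0/card1 assignment actually correlates with the drmModeGetResources failure.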
Reported as also affecting HDMI displays on the forum
As suggested by Claude, I refactored service dependencies to start gui-shell after the modprobe@drm service completed, but the issue persists. Comparing outputs from systemd-analyze plot, there does not appear to be any pattern common to failed boots.
Anecdotally, systemd-binfmt.service appears to consistently take longer (~1s vs ~94ms) in the failure cases but does exit successfully in both cases.
Description
Occasionally, the GUI fails to load and even local tty sessions are not rendered on screen. fbi calls work normally and all fb and dri devices exist as expected.
Steps to Reproduce
Relevant Code
The shutdown service does explicitly blank the screen; perhaps power on needs to explicitly power on the screen?
Other Notes