guysoft / FullPageOS

A raspberrypi distro to display a full page browser on boot
GNU General Public License v3.0

Failed to load lightdm at boot sporadically #388

Open OliverBissig opened 2 years ago

OliverBissig commented 2 years ago

Sometimes FullPageOS does not boot correctly. About 1 in 10 times, the screen goes black after the splash screen. I can still connect to the Pi over SSH and restart lightdm from there: sudo service lightdm restart

One suggested fix, setting logind-check-graphical=true in /etc/lightdm/lightdm.conf, did not work.

Version of FullPageOS?

2021-03-04-fullpageos-buster-armhf-lite-0.12.0

guysoft commented 2 years ago

  1. What version of Raspberry Pi?
  2. Can you confirm you are using a good power supply and cable for it?
  3. Does this happen with the nightly build?

OliverBissig commented 2 years ago
  1. Raspberry Pi 3+
  2. Yes. 3A.
  3. It also happens with build 2021-06-08_2021-05-07-fullpageos-buster-armhf-lite-0.13.0.zip

Log from lightdm:

[+0.00s] DEBUG: Logging to /var/log/lightdm/lightdm.log
[+0.00s] DEBUG: Starting Light Display Manager 1.26.0, UID=0 PID=943
[+0.00s] DEBUG: Loading configuration dirs from /usr/share/lightdm/lightdm.conf.d
[+0.00s] DEBUG: Loading configuration from /usr/share/lightdm/lightdm.conf.d/01_debian.conf
[+0.00s] DEBUG: Loading configuration dirs from /usr/local/share/lightdm/lightdm.conf.d
[+0.00s] DEBUG: Loading configuration dirs from /etc/xdg/lightdm/lightdm.conf.d
[+0.00s] DEBUG: Loading configuration from /etc/lightdm/lightdm.conf
[+0.00s] DEBUG: Registered seat module local
[+0.00s] DEBUG: Registered seat module xremote
[+0.00s] DEBUG: Registered seat module unity
[+0.00s] DEBUG: Using D-Bus name org.freedesktop.DisplayManager
[+0.11s] DEBUG: Monitoring logind for seats
[+0.11s] DEBUG: New seat added from logind: seat0
[+0.11s] DEBUG: Seat seat0: Loading properties from config section Seat:*
[+0.11s] DEBUG: Seat seat0: Starting
[+0.11s] DEBUG: Seat seat0: Creating user session
[+0.12s] WARNING: Error getting user list from org.freedesktop.Accounts: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.Accounts was not provided by any .service files
[+0.12s] DEBUG: Loading user config from /etc/lightdm/users.conf
[+0.12s] DEBUG: User pi added
[+0.13s] DEBUG: Seat seat0: Creating display server of type x
[+0.13s] DEBUG: posix_spawn avoided (fd close requested)
[+0.14s] DEBUG: posix_spawn avoided (fd close requested)
[+0.16s] DEBUG: Seat seat0: Plymouth is running on VT 1, but this is less than the configured minimum of 7 so not replacing it
[+0.16s] DEBUG: Quitting Plymouth
[+0.16s] DEBUG: posix_spawn avoided (fd close requested)
[+0.24s] DEBUG: Using VT 7
[+0.24s] DEBUG: Seat seat0: Starting local X display on VT 7
[+0.24s] DEBUG: XServer 0: Logging to /var/log/lightdm/x-0.log
[+0.24s] DEBUG: XServer 0: Writing X server authority to /var/run/lightdm/root/:0
[+0.24s] DEBUG: XServer 0: Launching X Server
[+0.24s] DEBUG: Launching process 970: /usr/bin/X :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
[+0.24s] DEBUG: XServer 0: Waiting for ready signal from X server :0
[+0.24s] DEBUG: Acquired bus name org.freedesktop.DisplayManager
[+0.24s] DEBUG: Registering seat with bus path /org/freedesktop/DisplayManager/Seat0
[+3.32s] DEBUG: Got signal 10 from process 970
[+3.32s] DEBUG: XServer 0: Got signal from X server :0
[+3.32s] DEBUG: XServer 0: Connecting to XServer :0
[+3.47s] DEBUG: XServer 0: Error connecting to XServer :0
[+4.32s] DEBUG: Got signal 10 from process 970

I am using a Waveshare 7-inch DSI display.

guysoft commented 2 years ago

This seems like the error:

GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.Accounts was not provided by any .service files

Someone here says it's worth looking in journalctl.

https://bbs.archlinux.org/viewtopic.php?pid=1453505#p1453505

It's unlikely that this is an OS issue, because it's being used by thousands of people. Could it be that the SD card is corrupted?

mascheihei commented 2 years ago

OliverBissig, did you find a solution? I see the same behaviour, but I doubt it is caused by the GDBus error: the same error shows up whether the system starts correctly or stops after the splash screen. lightdm is running but Chromium is not. And yes, it happens about 1 in 10 times.

prawnhead commented 2 years ago

I'm having the same issue. I'm using FullPageOS to display a web dashboard at a fire station. The system will run for up to a week but Chromium crashes leaving a mouse pointer on a black desktop. When I connect over SSH I can see Chromium is not running. When I try to take a screen shot using scrot I get

$ DISPLAY=:0 scrot
Invalid MIT-MAGIC-COOKIE-1 keygiblib error: Can't open X display. It *is* running, yeah?

LightDM is running

$ ps -ax | grep light
  503 ?        Ss     0:08 /usr/sbin/lighttpd -D -f /etc/lighttpd/lighttpd.conf
 1305 ?        Ssl    0:00 /usr/sbin/lightdm
 1337 tty7     Ssl+   0:17 /usr/lib/xorg/Xorg :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
15644 pts/0    S+     0:00 grep --color=auto light

There are no Chromium processes running, but I can use VNC to connect. Again, it's just a mouse pointer on a black screen. I am running this on two Pi 3+'s with 2021-04-14_2021-03-04-fullpageos-buster-armhf-lite-0.12.0 and they behave the same. I'm thinking of writing a script that checks for running Chromium processes and restarts the Pi when none are found. Any help appreciated.
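A minimal sketch of the kind of watchdog described above, assuming the browser's process name contains "chromium" and that a reboot is an acceptable recovery. The names PROCESS_PATTERN, RECOVERY_CMD, browser_is_running and check_browser are made up for illustration and are not part of FullPageOS.

```shell
#!/bin/bash
# Hypothetical watchdog sketch: run a recovery command when no Chromium process is found.
# PROCESS_PATTERN and RECOVERY_CMD are assumptions; adjust them for your image.
PROCESS_PATTERN="${PROCESS_PATTERN:-chromium}"
RECOVERY_CMD="${RECOVERY_CMD:-sudo reboot}"

# Succeeds (exit status 0) if at least one process name matches the pattern.
browser_is_running() {
    pgrep "$PROCESS_PATTERN" > /dev/null
}

# Runs the recovery command only when the browser is missing.
check_browser() {
    if browser_is_running; then
        echo "Browser is running"
    else
        echo "Browser not found, running: $RECOVERY_CMD"
        $RECOVERY_CMD
    fi
}
```

The block only defines the functions. To use it, add a call to check_browser at the end of the file and run the script periodically, for example from cron. Setting RECOVERY_CMD to "sudo service lightdm restart" instead would try the lighter recovery mentioned at the top of this thread before resorting to a reboot.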

samuelgregorovic commented 2 years ago

Any news on this issue?

guysoft commented 2 years ago

AFAIK I can't reproduce it, and no one has attached the journalctl output I asked for here: https://github.com/guysoft/FullPageOS/issues/388#issuecomment-968099864 There has also been no reply about testing the SD cards.
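For anyone who wants to attach the requested logs, here is a rough sketch of how they could be gathered in one go. The function name collect_boot_logs and the output file names are assumptions, and reading dmesg and the lightdm logs may require running the script as root.

```shell
#!/bin/bash
# Sketch: gather logs from the current boot into one directory for attaching to the issue.
# collect_boot_logs and the output file names are illustrative assumptions.
collect_boot_logs() {
    local out_dir="${1:-$HOME/boot-logs}"
    mkdir -p "$out_dir"
    # -b limits output to the current boot; --no-pager writes plain text.
    journalctl -b --no-pager > "$out_dir/journalctl.txt"
    dmesg > "$out_dir/dmesg.txt"
    # lightdm's own logs may be readable only by root, so skip them quietly if unreadable.
    cp /var/log/lightdm/*.log "$out_dir/" 2> /dev/null || true
    echo "Logs written to $out_dir"
}
```

Calling collect_boot_logs /home/pi/black-screen-logs right after a failed boot, and again after a good boot with a different directory, gives two sets of files that can be compared.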

jflkadsjfslkj commented 2 years ago

This problem is very reproducible. I tested multiple SD cards with 2022-03-17_2022-01-28-fullpageos-bullseye-armhf-lite-0.13.0 on an RPi4 4GB Rev 1.5. With this combination, 25% of hard boots result in the black screen and "Error connecting to XServer" in lightdm.log.

Strangely, the same SD cards in an RPi3B+ have never (20 hard boots so far) produced a black screen with "Error connecting to XServer" in lightdm.log.

A soft boot (sudo reboot) during the black screen situation is slow: it takes over 60 seconds before the boot messages start scrolling up the screen.

Here are the logs. The journalctl output is filtered to start at the boot date and time.

dmesg-black-desktop.txt dmesg-good-desktop.txt journalctl-black-desktop.txt journalctl-good-desktop.txt ps-list-black-desktop.txt ps-list-good-desktop.txt

OliverBissig commented 2 years ago

I have made a small workaround for this problem. At startup I check lightdm's error log; if the desktop could not start, the script restarts lightdm: check_display.zip Code:

#!/bin/bash
# Restart lightdm if its log shows that it could not connect to the X server.
if sudo grep -q "Error connecting to XServer" /var/log/lightdm/lightdm.log
then
    echo "Restart display after error"
    sudo service lightdm restart
else
    echo "Display is already running"
fi

Make the script executable: sudo chmod +x /path/to/check_display

Edit /etc/rc.local by adding: (/bin/sleep 15 && /path/to/check_display)&

jflkadsjfslkj commented 2 years ago

I added some logging (for Display OK and Display not OK) to the script from Oliver and tested manually. Logging works fine.

I added the sleep 15... line to the bottom of rc.local but the script never executes. No idea why.

jflkadsjfslkj commented 1 year ago

rc.local is hobbled in bullseye thanks to systemd. Instructions for enabling it are here: https://blog.wijman.net/enable-rc-local-in-debian-bullseye/

The script above now executes, but the screen still stays black. I think my quick and dirty solution will be to just put a reboot in the script.

EDIT: rc.local contains a line that fails with an error (the --dport option does not exist), so the rest of the script won't execute. Comment that line out:

#/sbin/iptables -t mangle -I POSTROUTING 1 -o wlan0 -p udp --dport 123 -j TOS --set-tos 0x00

guysoft commented 1 year ago

You can add it to a systemd service similar to how vnc is started: https://github.com/guysoft/FullPageOS/blob/devel/src/modules/fullpageos/filesystem/root_init/etc/systemd/system/x11vnc.service
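As a rough illustration of that suggestion, the sketch below writes a oneshot unit that runs the check_display script after lightdm has started. It is not copied from x11vnc.service: the unit name check-display.service and the function write_unit are assumptions, and /path/to/check_display is the same placeholder used earlier in the thread.

```shell
#!/bin/bash
# Sketch: write a oneshot unit that runs the check_display script after lightdm starts.
# The unit name check-display.service and the function write_unit are assumptions;
# /path/to/check_display is the placeholder path used earlier in this thread.
write_unit() {
    local dest="${1:-/etc/systemd/system/check-display.service}"
    cat > "$dest" <<'EOF'
[Unit]
Description=Restart lightdm if the X server failed to start
After=lightdm.service

[Service]
Type=oneshot
# Give lightdm time to write its log, like the sleep in the rc.local line above.
ExecStartPre=/bin/sleep 15
ExecStart=/path/to/check_display

[Install]
WantedBy=multi-user.target
EOF
}
```

After calling write_unit as root, running sudo systemctl daemon-reload followed by sudo systemctl enable check-display.service should make it run on every boot. This avoids depending on rc.local at all.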

pesimeao commented 1 year ago

Hey guys, have you found out why the X server fails to start? This doesn't seem to happen every time. I'm building the image from the latest code. When it happens, I just have to start the service and everything works great again.

OmgItsBkid commented 1 year ago

This is happening to me as well with 2023-05-03-fullpageos-bullseye-armhf-lite-0.13.0. As with prawnhead above, I can VNC in but only have a black screen. At first I thought it was happening once I changed the password from raspberry to something else, but it seems to happen even with a fresh copy. Here is what I get when attempting to run start_gui before and after:

during a good start (chromium showing on screen):

pi@fullpageos:~/scripts $ ./start_gui
Another composite manager is already running
root window unavailable (maybe another WM is running?)
dpkg-query: no packages found matching bluealsa
Opening in existing browser session.
WARNING: v3dv is neither a complete nor a conformant Vulkan implementation. Testing use only.
dpkg-query: no packages found matching bluealsa
Opening in existing browser session.
WARNING: v3dv is neither a complete nor a conformant Vulkan implementation. Testing use only.

(despite the messages, you will see chromium re-launch)

during a bad start (no picture):

pi@fullpageos:~/scripts $ ./start_gui
Invalid MIT-MAGIC-COOKIE-1 keyxset:  unable to open display ":0"
Invalid MIT-MAGIC-COOKIE-1 keyxset:  unable to open display ":0"
Invalid MIT-MAGIC-COOKIE-1 keyxset:  unable to open display ":0"
Invalid MIT-MAGIC-COOKIE-1 keysession_init(): Can't open display.
Invalid MIT-MAGIC-COOKIE-1 keymatchbox: can't open display! check your DISPLAY variable.
Invalid MIT-MAGIC-COOKIE-1 keyError: Can't open display: (null)
Failed creating new xdo instance
dpkg-query: no packages found matching bluealsa
Invalid MIT-MAGIC-COOKIE-1 key[919:919:0711/192935.737772:ERROR:ozone_platform_x11.cc(239)] Missing X server or $DISPLAY
[919:919:0711/192935.737933:ERROR:env.cc(255)] The platform failed to initialize.  Exiting.

I've also attached lightdm logs for both situations, successful and unsuccessful.

Good lightdm.log Bad lightdm.log

guysoft commented 1 year ago

@OmgItsBkid this might be a different issue. It sounds like:

  1. the browser is not starting because it is waiting for an internet connection that does not come up
  2. you are running GUI commands without DISPLAY set to DISPLAY=:0

OmgItsBkid commented 12 months ago

I believe this is the same issue. I do have DISPLAY set to DISPLAY=:0 before running those commands, it just was not included in my comment.

However, those were just tests run after the main issue occurs, and they still don't explain why it randomly fails to start on boot and produces what you see in the lightdm logs.

DFoxinator commented 8 months ago

Are there any updates on this issue or any new workarounds?

guysoft commented 5 months ago

@DFoxinator Perhaps adding some loop in the start_gui script if it does not manage to open display might do it. I really don't know because it might even be a lightdm bug.
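One way such a loop could look is sketched below. This is an untested assumption rather than a confirmed fix: wait_for_display, MAX_TRIES and RETRY_DELAY are illustrative names that do not exist in start_gui, and it is not known whether waiting resolves the "Invalid MIT-MAGIC-COOKIE-1 key" errors reported above. It relies on xset q returning a non-zero exit status when the display cannot be opened.

```shell
#!/bin/bash
# Hypothetical helper: wait until the X display accepts connections.
# wait_for_display, MAX_TRIES and RETRY_DELAY are illustrative names, not part of start_gui.
export DISPLAY="${DISPLAY:-:0}"
MAX_TRIES="${MAX_TRIES:-10}"
RETRY_DELAY="${RETRY_DELAY:-3}"

wait_for_display() {
    local try
    for ((try = 1; try <= MAX_TRIES; try++)); do
        # xset q exits non-zero when it cannot open the display.
        if xset q > /dev/null 2>&1; then
            echo "Display $DISPLAY is ready after $try attempt(s)"
            return 0
        fi
        echo "Display $DISPLAY not ready (attempt $try/$MAX_TRIES)"
        sleep "$RETRY_DELAY"
    done
    return 1
}
```

A line such as wait_for_display || sudo service lightdm restart near the top of start_gui would then fall back to the lightdm restart from the first comment if the display never comes up.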