tuxedocomputers / tuxedo-tomte

Magic housekeeping package for TUXEDO books
https://www.tuxedocomputers.com/en/What-is-TUXEDO-Tomte.tuxedo

Black screen after Tomte installed fixes #33

Closed ykarrde closed 3 months ago

ykarrde commented 5 months ago

Today, after the "Tomte installed Fixes..." and "Reboot necessary..." notifications, the reboot ended with a black screen and nothing else working. I tried CTRL + ALT + F-keys to see whether I could at least get to a command line, but that did not work either. Thankfully, I managed to boot into a live system and use Timeshift to restore the state from yesterday.

After the network connection was reestablished, the Tomte fixes ran again with exactly the same unfortunate result. I restored again, temporarily disabled my WLAN router and, as an emergency measure, disabled Tomte completely in the TUXEDO Control Center for now.

Operating System: TUXEDO OS 2
KDE Plasma Version: 5.27.10
KDE Frameworks Version: 5.114.0
Qt Version: 5.15.12
Kernel Version: 6.5.0-10031-tuxedo (64-bit)
Graphics Platform: X11
Processors: 32 × 13th Gen Intel® Core™ i9-13900HX
Memory: 62.5 GiB of RAM
Graphics Processor: NVIDIA GeForce RTX 4070 Laptop GPU/PCIe/SSE2
Manufacturer: TUXEDO
Product Name: TUXEDO Gemini Gen2
System Version: Not Applicable

Emohr-Tuxedo commented 5 months ago

Thank you for reporting this, and sorry for the problems you are having. I will look into it right now.

Emohr-Tuxedo commented 5 months ago

I tried to reproduce your problem on the same model with the parameters you gave me, but I was not able to. Tomte manages to update the system without problems.

This newest Tomte version updates your NVIDIA drivers from 535 to 550. I think your problem might lie there. Just an idea: you could try to update those drivers manually:

sudo apt install tuxedo-nvidia-driver-550

It is possible that you will see the error right away. If this does not work, please contact TUXEDO support at https://www.tuxedocomputers.com/en/Contact.tuxedo and tell them about this issue on GitHub.

THX

ykarrde commented 5 months ago

I narrowed it down a little bit. Running sudo apt install tuxedo-nvidia-driver-550 worked as far as I could see, but 550 is (probably) what causes the error.

And now it gets really weird:

In my "docked" home setup, which I use most of the time, I have two external monitors, one connected via DisplayPort and the other via HDMI. As long as only one external monitor is connected, no matter which one, I can boot. I can even connect the second external monitor after booting, and everything works as if nothing had happened, except that a shutdown now hangs and never finishes.
The end of the log for that shutdown, up to the point where I forced the machine off (unplugging one monitor in that state didn't help), looked like this:

Apr 19 16:13:53 PCTUX2345 kernel: [drm:nv_drm_master_set [nvidia_drm]] *ERROR* [nvidia-drm] [GPU ID 0x00000100] Failed to grab modeset ownership
Apr 19 16:13:59 PCTUX2345 systemd-logind[1062]: Power key pressed.
Apr 19 16:14:02 PCTUX2345 systemd[1]: sddm.service: State 'stop-sigterm' timed out. Killing.
Apr 19 16:14:02 PCTUX2345 systemd[1]: sddm.service: Killing process 1371 (sddm) with signal SIGKILL.
Apr 19 16:14:02 PCTUX2345 systemd[1]: sddm.service: Killing process 1395 (QDBusConnection) with signal SIGKILL.
Apr 19 16:14:02 PCTUX2345 systemd[1]: sddm.service: Killing process 1606 (Xorg:traceq0) with signal SIGKILL.
Apr 19 16:14:02 PCTUX2345 systemd[1]: sddm.service: Main process exited, code=killed, status=9/KILL
Apr 19 16:14:02 PCTUX2345 systemd[1]: sddm.service: Killing process 1606 (Xorg:traceq0) with signal SIGKILL.
Apr 19 16:14:17 PCTUX2345 kernel: workqueue: nv_drm_handle_hotplug_event [nvidia_drm] hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
Apr 19 16:14:32 PCTUX2345 systemd[1]: sddm.service: Processes still around after final SIGKILL. Entering failed mode.
Apr 19 16:14:32 PCTUX2345 systemd[1]: sddm.service: Failed with result 'timeout'.
...
Apr 19 16:14:33 PCTUX2345 kernel: nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c77d:0:0:1164
Apr 19 16:14:57 PCTUX2345 kernel: nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c67e:4:0:1173
Apr 19 16:15:02 PCTUX2345 systemd[1]: plymouth-poweroff.service: start-post operation timed out. Terminating.
Apr 19 16:15:02 PCTUX2345 systemd[1]: plymouth-poweroff.service: Failed with result 'timeout'.
Apr 19 16:15:02 PCTUX2345 systemd[1]: Failed to start Show Plymouth Power Off Screen.
Apr 19 16:15:02 PCTUX2345 systemd[1]: Starting Tell Plymouth To Jump To initramfs...
Apr 19 16:15:05 PCTUX2345 kernel: nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c67e:5:0:1173
Apr 19 16:15:13 PCTUX2345 kernel: nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c67e:6:0:1173
Apr 19 16:15:21 PCTUX2345 kernel: nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c67e:7:0:1173
Apr 19 16:15:29 PCTUX2345 kernel: nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000c77d:0:0:1164

Edit: I rechecked, because one of the first things I thought I had tried yesterday was starting with the external monitors turned off, and sure enough, just turning one (or both) off does not help. Even cutting the power to the (HDMI) monitor did not work. I have to physically unplug the display cable to get this to work. Totally weird. I will probably revert to 535 for now.
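
For anyone who wants to pull the same kind of excerpt from their own machine, here is a minimal sketch. The helper name filter_shutdown_errors is made up for illustration, and the pattern only covers the message types visible in the log above; it assumes a persistent journal so that the previous boot is still available.

```shell
#!/bin/sh
# Hypothetical helper: keep only NVIDIA kernel messages and systemd timeout
# lines from a journal dump read on stdin. The pattern is derived from the
# log excerpt in this comment and is not an exhaustive list.
filter_shutdown_errors() {
    grep -E 'nvidia-(drm|modeset)|nv_drm|timed out|Failed with result'
}

# Usage (previous boot, i.e. the one that hung on shutdown):
#   journalctl -b -1 --no-pager | filter_shutdown_errors
```

If journalctl reports that no previous boot exists, the journal is probably not stored persistently and the hung shutdown cannot be inspected this way.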

Emohr-Tuxedo commented 5 months ago

Yeah, it looks like you found a bug in the Nvidia package. One last piece of advice I can give you is to uninstall all the Nvidia packages and then install the tuxedo-nvidia-driver-550 package.

Find out which packages are installed:

dpkg -l 'nvidia*'

Please use purge to uninstall the packages, like:

sudo apt-get remove --purge packagename

But even better, contact our support: you have a TUXEDO notebook, and you bought the support with it!
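
The list-then-purge steps above could be combined roughly as follows. This is only a sketch: installed_nvidia_packages is a hypothetical helper, it matches every installed package whose name contains "nvidia" (which may be broader than intended), and the purge is shown with --simulate so nothing is removed until the list has been reviewed.

```shell
#!/bin/sh
# Hypothetical helper: read `dpkg -l` output on stdin and print the names of
# fully installed packages (status "ii") whose name contains "nvidia".
installed_nvidia_packages() {
    awk '$1 == "ii" && $2 ~ /nvidia/ { print $2 }'
}

# Usage:
#   dpkg -l | installed_nvidia_packages
# Dry run of the purge (apt-get only prints what it would do):
#   sudo apt-get remove --purge --simulate $(dpkg -l | installed_nvidia_packages)
# Afterwards, reinstall the driver package named earlier in this thread:
#   sudo apt install tuxedo-nvidia-driver-550
```

Dropping --simulate performs the actual removal, so it is worth checking the dry-run output first.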

ykarrde commented 3 months ago

On a final note, before I click the "Close with comment" button below.

After the Plasma 6 update also failed, and support could not really help, I decided to reinstall everything from scratch, and it looks like Nvidia 550.67, KDE Plasma 6 and Wayland are working now.

But I have not yet installed everything I usually use. Maybe, though hopefully not, I will run into the same (or another) issue along the way.

On a side note, the TUXEDO mirrors are awfully slow today. I did not think much about the somewhat slow WebFAI install, but then the "Tomte installs fixes" step took ages and the final Plasma 6 update was virtually impossible (it claimed third-party repos as the reason for the failure, on a fresh install, are you serious?). My connection is good, and (other) websites showed no visible slowdown. I had to raise the HTTPS timeout for apt drastically to get this working.
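
For reference, one way to raise that timeout is apt's Acquire::https::Timeout option. The sketch below is an illustration only: the function name and the default of 120 seconds are assumptions, not the values actually used here.

```shell
#!/bin/sh
# Hypothetical wrapper: refresh the package lists with a longer HTTPS timeout.
# The default of 120 seconds is an assumed value for illustration.
apt_update_with_timeout() {
    timeout_s="${1:-120}"
    sudo apt-get -o "Acquire::https::Timeout=${timeout_s}" update
}

# Usage:
#   apt_update_with_timeout 300
# To make the setting permanent, the same option can go into a file under
# /etc/apt/apt.conf.d/ (the file name is up to you) as:
#   Acquire::https::Timeout "300";
```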

with regards

ykarrde commented 3 months ago

PS: Scratch the Wayland part. That started to act weirdly today.

And all I did yesterday was reinstall my browser and mail profiles and arrange the "monitors"/desktop to my liking, so nothing major, I thought.
After starting the notebook this morning, the external monitors stayed black despite showing up as activated in the KDE settings. Then I removed the /etc/sddm.conf.d/10-wayland.conf file, because the Plasma 6 install guide mentioned that this file may cause problems in some cases. That re-enabled the external monitors, but only in mirrored mode. After rearranging them and rebooting, they stayed black again.
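
For anyone repeating this step, a slightly safer variant is to move the file out of the way rather than delete it, so it can be restored later. A minimal sketch, assuming a hypothetical helper named park_config and an arbitrary backup directory:

```shell
#!/bin/sh
# Hypothetical helper: move a config file into a backup directory instead of
# deleting it, so the change can be undone later.
park_config() {
    src="$1"
    backup_dir="$2"
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    mkdir -p "$backup_dir" || return 1
    mv -- "$src" "$backup_dir/$(basename "$src").bak"
}

# Usage (run as root; the backup directory is an assumption):
#   park_config /etc/sddm.conf.d/10-wayland.conf /root/sddm-backup
```

The backup is placed outside /etc/sddm.conf.d/ on purpose, so that SDDM cannot pick the file up again under its new name.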

I'm back on X11 now and hope that X11 at least stays stable; otherwise I will probably reopen this issue.

It's a shame, because I was so excited to finally use per-monitor scaling and HDR on my bigger main monitor. That will not happen with X11.