ewagner12 / all-ways-egpu

Configure eGPU as primary under Linux Wayland desktops
MIT License

eGPU isn't initialized properly when using GNOME's "Automatic Login" option #23

Closed: oschl-git closed this issue 4 months ago

oschl-git commented 4 months ago

I recently installed a new Arch system with disk encryption. In order to not have to input two passwords on boot, I enabled GNOME's "Automatic Login" option. This option essentially skips GDM altogether and boots me right into the desktop, which is ideal as I'm the only person using this laptop.

With this option, however, the eGPU isn't properly initialized: my desktop loads before it gets detected, so GNOME seems to hotplug it. My monitors load up, but performance is drastically worse. Logging out to GDM and logging back in fixes it and I can use the eGPU like I'm used to.

I don't know if this is something that can be fixed by all-ways-egpu, or if there's any solution I can do myself (like loading some kernel module) but it would be fantastic if it had a solution.

My setup:
Laptop: ThinkPad T14s Gen 4 (AMD Ryzen 7, 32GB RAM, Radeon 780M)
eGPU: Radeon RX 6900XT
Enclosure: TH3P4G3

Let me know if there's anything else (logs, etc) I can provide.

ewagner12 commented 4 months ago

@oschl-git Yes, there is a race condition between when the desktop loads and when the eGPU is initialized by the drivers. There are two ways I'd try to deal with it.

  1. Does Arch load the thunderbolt kernel module in the initramfs? See the Arch Wiki page on early module loading (https://wiki.archlinux.org/title/Kernel_module#Early_module_loading). If the thunderbolt and amdgpu modules are loaded during early init, I believe you should get the full disk encryption password screen on the eGPU monitors, so the eGPU should be detected earlier.
  2. If the above doesn't help, you can add a delay in the script. I haven't made this the default, as delaying bootup isn't ideal, but a standard install of the latest version of the script includes a file, /usr/share/all-ways-egpu/max-retry, and you can increase the number in this file to give the script more time to find the eGPU.
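
For option 1, a quick way to see whether the modules are already configured for early loading could look like the sketch below. It assumes a stock Arch setup where the initramfs is built by mkinitcpio from /etc/mkinitcpio.conf, and that the MODULES array sits on one line with unquoted names. The function name modules_listed is made up for this example and is not part of all-ways-egpu.

```shell
#!/bin/bash
# Sketch only: the function name is made up for this example.
# Reports whether each named module is listed in the MODULES=(...) array of a
# mkinitcpio config file (on Arch this is normally /etc/mkinitcpio.conf).
# Assumes the array sits on a single line with unquoted module names.
modules_listed() {
    local conf="$1"
    shift
    local listed mod missing=0
    # Pull out the text between the parentheses of the last uncommented
    # MODULES=(...) line and squeeze all whitespace down to single spaces.
    listed=$(sed -nE 's/^[[:space:]]*MODULES=\(([^)]*)\).*/\1/p' "$conf" \
        | tail -n 1 | tr -s '[:space:]' ' ')
    for mod in "$@"; do
        case " $listed " in
            *" $mod "*) echo "$mod: listed" ;;
            *) echo "$mod: missing"; missing=1 ;;
        esac
    done
    # Non-zero exit status if at least one module was not found.
    return "$missing"
}

# Example: modules_listed /etc/mkinitcpio.conf thunderbolt amdgpu
```

If a module is reported missing, it would need to be added to the MODULES array and the initramfs regenerated (for example with mkinitcpio -P) before the change takes effect.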

oschl-git commented 4 months ago

Thanks for the quick response. Unfortunately, neither of these solutions is working for me. :/

Adding the kernel modules to mkinitcpio seems to trigger the eGPU during the boot process, but along the way it gets disabled again, and it's only re-enabled after I see the GNOME desktop. I can't see the decryption prompt on my external monitors either, despite the eGPU being active during this time. On the other hand, loading the thunderbolt module makes it possible to use an external keyboard connected to a Thunderbolt dock to input the decryption password, so that's a major improvement for me, thanks.

Increasing the max-retry number doesn't appear to have any effect. I increased it all the way to 30 and even reran the install script, but it didn't appear to extend the boot time at all or change the behaviour in any way.

oschl-git commented 4 months ago

Okay, this is actually weirder than I thought.

The boot process (with amdgpu and thunderbolt modules in initramfs) goes like this:

  1. The laptop POSTs and loads GRUB. GRUB triggers the eGPU and lights up one of my monitors, but doesn't actually display the boot menu on it; the menu is displayed only on the laptop screen.
  2. GRUB starts booting the selected kernel, gets to the decryption prompt. On this prompt, the eGPU restarts periodically. It seems to be active most of the time, but every 10 seconds or so, it clicks and reactivates. None of my monitors are active during this time, but I can input the decrypt password on my external keyboard plugged into the Thunderbolt dock (which wasn't possible before I added the thunderbolt module to mkinitcpio).
  3. I input the password, the kernel starts booting. Along the way, the eGPU gets deactivated once more, until the boot process is almost finished, when it clicks and starts activating. The GNOME desktop shows up on my laptop screen for about 2 seconds before the eGPU gets detected and presumably hotplugged by GNOME, after which my laptop screen goes dark and the desktop appears on my two monitors.

So odd. It's not a big deal, I can always just relog, but it's not ideal.
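
To pin down when the drivers bring the eGPU up and down during a boot like this, the kernel log can be narrowed to the relevant messages. This is a minimal sketch: the function name egpu_log_lines is made up for this example, and it assumes the thunderbolt and amdgpu drivers name themselves in their log lines.

```shell
#!/bin/bash
# Sketch only: the function name is made up for this example.
# Reads log lines on stdin and keeps those that mention the thunderbolt or
# amdgpu drivers (case-insensitive), so their ordering is easier to follow.
egpu_log_lines() {
    grep -iE 'thunderbolt|amdgpu'
}

# Example, kernel messages from the current boot with monotonic timestamps:
#   journalctl -k -b -o short-monotonic | egpu_log_lines
```

Comparing those timestamps with the point where the display manager starts should show how large the gap in the race actually is.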

oschl-git commented 4 months ago

Okay, I figured out a super simple hacky solution, but a solution nonetheless. I made a bash script:

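#!/bin/bash

# Restart the display manager only if the eGPU is visible on the PCI bus.
# grep -q suppresses the matching line; only the exit status is used.
if lspci | grep -q 'Radeon RX 6800/6800 XT / 6900 XT'; then
    systemctl restart display-manager
fi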

I then made a simple systemd service which runs the script with a 5-second delay after boot:

[Unit]
Description=Restart the display manager with a delay if an eGPU is connected
After=network.target

[Service]
Type=simple
ExecStartPre=/bin/sleep 5
ExecStart=/usr/local/bin/restartdisplayifegpu.sh

[Install]
WantedBy=default.target

And then I enabled it. Boot takes a few seconds longer because of the delay, but everything seems to be working well now.
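
A possible variation on this workaround is to poll for the eGPU instead of sleeping for a fixed 5 seconds, so the restart happens as soon as the card appears and the wait can be longer on slow boots. The sketch below is untested on this hardware. The function name wait_for_pci_device and the 30-second timeout are made up for this example, and a device showing up in lspci does not guarantee the driver has finished initializing it.

```shell
#!/bin/bash
# Sketch only: the function name and arguments are made up for this example.
# wait_for_pci_device PATTERN TIMEOUT_SECONDS [LIST_COMMAND...]
# Polls once per second until a line of the listing matches PATTERN.
# Returns 0 when a match is found, 1 if TIMEOUT_SECONDS pass without one.
wait_for_pci_device() {
    local pattern="$1" timeout="$2"
    shift 2
    # Use lspci unless another listing command is given (handy for testing).
    local -a lister=("$@")
    if [ "${#lister[@]}" -eq 0 ]; then
        lister=(lspci)
    fi
    local waited=0
    until "${lister[@]}" | grep -q -- "$pattern"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example (run as root, e.g. from the service above without ExecStartPre):
#   wait_for_pci_device 'Radeon RX 6800/6800 XT / 6900 XT' 30 \
#       && systemctl restart display-manager
```

The listing command is passed in as an argument only so the loop can be tried out without real hardware; in normal use the default lspci is enough.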

ewagner12 commented 4 months ago

Good to hear you found a workaround. I might consider adding entry and exit points so it's easier to run a custom command before or after the script runs, if that seems like a useful idea.

oschl-git commented 4 months ago

Definitely sounds useful, thanks for your help and all your work on this project.