Open KaleidonKep99 opened 1 year ago
Looking at the choose_primary_gpu_unchecked function in the mutter codebase, it seems that it will use the boot VGA device by default, or an arbitrary device if none of them has that attribute.
However, it also looks like you can add a "mutter-device-preferred-primary" udev tag to force it to use a particular device. See https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1562
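To make that selection order concrete, here is a minimal Python sketch of the behaviour as described above. It is not mutter's actual code: the Gpu class and the choose_primary function are hypothetical names used only for illustration, and the real function may consider more cases than these three.

```python
from dataclasses import dataclass

# Tag name taken from the mutter merge request linked above.
PREFERRED_TAG = "mutter-device-preferred-primary"


@dataclass
class Gpu:
    """Hypothetical stand-in for a DRM device as the compositor sees it."""
    node: str                      # e.g. "/dev/dri/card1"
    boot_vga: bool = False         # kernel's boot_vga attribute
    tags: frozenset = frozenset()  # udev tags attached to the device


def choose_primary(gpus):
    """Pick a primary GPU: tagged device, then boot VGA, then any device."""
    if not gpus:
        raise ValueError("no GPUs found")
    # 1. A device carrying the udev tag wins outright.
    for gpu in gpus:
        if PREFERRED_TAG in gpu.tags:
            return gpu
    # 2. Otherwise fall back to the boot VGA device.
    for gpu in gpus:
        if gpu.boot_vga:
            return gpu
    # 3. Otherwise the choice is arbitrary; this sketch takes the first one.
    return gpus[0]
```

Under this model, a tagged card should beat the boot VGA card regardless of enumeration order, so if the tag is applied and the wrong GPU is still used, the tag is probably not reaching the device mutter looks at.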
Hi. Thank you for your response. I tried adding a udev tag, but it does not seem to make a difference; the system still tries to do everything on the 750 Ti first. The primary boot VGA device is indeed the 1660 Super, since no displays are attached to the 750 Ti, and I do see the boot screen on the former. However, I did notice that the UEFI firmware reports the GOP from the 750 Ti and not from the 1660 Super.
Looking at the lspci output, the 750 Ti seems to be on bus 23:00.0, while the 1660 Super is on bus 2d:00.0. This means that the 750 Ti gets priority when loading the firmware, since it is connected to the chipset, which is the first thing that gets initialized on boot. Could that be the issue?
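One way to check which card the kernel itself marks as the boot VGA device is to read the boot_vga attribute from sysfs. The Python sketch below assumes the usual /sys/class/drm layout, where each cardN/device entry links to the PCI device directory. The list_boot_vga name is made up for this example.

```python
from pathlib import Path


def list_boot_vga(sys_drm="/sys/class/drm"):
    """Map each DRM card to (PCI address, is_boot_vga) using sysfs."""
    result = {}
    for card in sorted(Path(sys_drm).glob("card[0-9]*")):
        # Skip connector entries such as card0-HDMI-A-1.
        if "-" in card.name:
            continue
        # cardN/device is expected to be a symlink to the PCI device directory.
        device = (card / "device").resolve()
        flag = device / "boot_vga"
        # Non-VGA or non-PCI devices may not expose boot_vga at all.
        if flag.is_file():
            result[card.name] = (device.name, flag.read_text().strip() == "1")
    return result


if __name__ == "__main__":
    for name, (pci_addr, is_boot) in list_boot_vga().items():
        print(f"{name}: {pci_addr} boot_vga={is_boot}")
```

Comparing the result before and after swapping the cards would show whether the boot VGA flag follows the slot or the card that drives the displays.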
Could we perhaps test that theory by simply swapping the two cards?
That indeed fixes the issue.
Now the issue is GNOME ignoring the mutter primary setting…
I’ll try some stuff in the meantime. Maybe I missed a crucial step while making the udev rule.
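For reference, this is roughly the shape of rule being attempted, written as a small Python helper so the match keys are explicit. It is only a sketch under assumptions: matching on ID_PATH rather than on the device name is a choice made here, not something the merge request requires, and it assumes the rules file is ordered after the stock rule that sets ID_PATH for DRM devices. The preferred_primary_rule name and the example ID_PATH value are illustrative.

```python
# Tag name taken from the mutter merge request mentioned earlier in the thread.
PREFERRED_TAG = "mutter-device-preferred-primary"


def preferred_primary_rule(id_path):
    """Build a udev rule line that tags one GPU's primary node (cardN)."""
    if not id_path.startswith("pci-"):
        raise ValueError("expected an ID_PATH such as 'pci-0000:2d:00.0'")
    # KERNEL=="card[0-9]*" plus DEVTYPE=="drm_minor" limits the match to the
    # primary node, excluding render nodes and connector entries.
    return (
        'SUBSYSTEM=="drm", ENV{DEVTYPE}=="drm_minor", KERNEL=="card[0-9]*", '
        f'ENV{{ID_PATH}}=="{id_path}", TAG+="{PREFERRED_TAG}"'
    )


if __name__ == "__main__":
    # Example value only; take the real one from `udevadm info` on the card node.
    print(preferred_primary_rule("pci-0000:2d:00.0"))
```

The printed line would go into a file under /etc/udev/rules.d/, followed by udevadm control --reload and udevadm trigger (or a reboot), and then a fresh login so the compositor re-reads the device list.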
I don't know what's wrong. It seems like I'm doing everything properly, yet my setting is ignored. I am now trying to force the rendering to be on the 750 Ti, and I moved my screens to it as well, but it still renders on the 1660 Super, which is connected to the PCIe x16 slot of the chipset.
Here's the udev rule being applied at boot, I checked with udevadm and it reports the right values:
P: /devices/pci0000:00/0000:00:03.1/0000:2d:00.0/drm/card1
M: card1
R: 1
U: drm
T: drm_minor
D: c 226:1
N: dri/card1
L: 0
S: dri/by-path/pci-0000:2d:00.0-card
E: DEVPATH=/devices/pci0000:00/0000:00:03.1/0000:2d:00.0/drm/card1
E: DEVNAME=/dev/dri/card1
E: DEVTYPE=drm_minor
E: MAJOR=226
E: MINOR=1
E: SUBSYSTEM=drm
E: USEC_INITIALIZED=8723126
E: ID_PATH=pci-0000:2d:00.0
E: ID_PATH_TAG=pci-0000_2d_00_0
E: NVME_HOST_IFACE=none
E: ID_FOR_SEAT=drm-pci-0000_2d_00_0
E: DEVLINKS=/dev/dri/by-path/pci-0000:2d:00.0-card
E: TAGS=:mutter-device-preferred-primary:uaccess:seat:master-of-seat:
E: CURRENT_TAGS=:mutter-device-preferred-primary:uaccess:seat:master-of-seat:
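A check like the one above can also be scripted. The Python sketch below parses the text printed by udevadm info for a device node and reports whether the tag is present. The parse_udev_tags helper is a hypothetical name, and it only reads the E: property lines in the format shown above.

```python
# Tag name taken from the mutter merge request mentioned earlier in the thread.
PREFERRED_TAG = "mutter-device-preferred-primary"


def parse_udev_tags(info_text, key="CURRENT_TAGS"):
    """Return the set of tags from an 'E: CURRENT_TAGS=:a:b:' style line."""
    prefix = f"E: {key}="
    for line in info_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            # Tags are colon-separated with leading and trailing colons.
            return {tag for tag in line[len(prefix):].split(":") if tag}
    return set()


if __name__ == "__main__":
    import subprocess

    # Assumes the node of interest is /dev/dri/card1, as in the output above.
    text = subprocess.run(
        ["udevadm", "info", "/dev/dri/card1"],
        capture_output=True, text=True, check=True,
    ).stdout
    print(PREFERRED_TAG in parse_udev_tags(text))
```

This only confirms that udev applied the tag. It says nothing about whether mutter honours it.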
Yet inxi -Fzx still reports the 1660 Super as the main renderer, even with no displays attached to it.
Graphics:
Device-1: NVIDIA TU116 [GeForce GTX 1660 SUPER] vendor: Micro-Star MSI
driver: nvidia v: 530.41.03 arch: Turing bus-ID: 23:00.0
Device-2: NVIDIA GM107 [GeForce GTX 750 Ti] vendor: Gigabyte
driver: nvidia v: 530.41.03 arch: Maxwell bus-ID: 2d:00.0
...
Display: wayland server: X.Org v: 22.1.9 with: Xwayland v: 22.1.9
compositor: gnome-shell driver: gpu: nvidia,nvidia-nvswitch
resolution: 1920x1080~60Hz
API: OpenGL v: 4.6.0 NVIDIA 530.41.03 renderer: NVIDIA GeForce GTX 1660
SUPER/PCIe/SSE2 direct-render: Yes
The only other thing I can think of would be to apply the tag to the render node (/dev/dri/renderDXXX) instead of or in addition to the primary node (/dev/dri/card1).
If that doesn't work, it might be worth bringing this up with the GNOME devs. They would probably be able to provide more informed guidance.
Oh yeah, I should also mention that the "Failed to allocate fence signaling event" error message is safe to ignore. Also, it should be gone with the latest 535 driver.
I'll try applying it to renderD129 instead then. I'll get back with the results ASAP.
Nothing, same error:
Jun 01 16:37:08 [redacted] systemd[2148]: Started dbus-:1.2-org.gnome.Nautilus@1.service.
Jun 01 16:37:08 [redacted] nautilus[4838]: Connecting to org.freedesktop.Tracker3.Miner.Files
Jun 01 16:37:09 [redacted] gnome-shell[2329]: meta_window_set_stack_position_no_sync: assertion 'window->stack_position >= 0' failed
Jun 01 16:37:09 [redacted] gnome-shell[2329]: WL: error in client communication (pid 4838)
Jun 01 16:37:09 [redacted] nautilus[4838]: Error flushing display: Protocol error
Jun 01 16:37:09 [redacted] systemd[2148]: Started dbus-:1.2-org.gnome.DiskUtility@1.service.
Jun 01 16:37:09 [redacted] systemd[2148]: dbus-:1.2-org.gnome.Nautilus@1.service: Main process exited, code=exited, status=1/FAILURE
Jun 01 16:37:09 [redacted] systemd[2148]: dbus-:1.2-org.gnome.Nautilus@1.service: Failed with result 'exit-code'.
I'll forward the issue to the GNOME devs.
EDIT: https://gitlab.gnome.org/GNOME/gnome-shell/-/issues/6734
Hello. I am having an issue that is closely related to issue #78.
I am using GNOME 44 under Fedora 38, with the latest NVIDIA drivers from RPMFusion. My computer has two GPUs; the first one is a 1660 Super, which is connected to the first PCIe slot and handles all of my screens, while the second one is a 750 Ti, which I mainly use for small CUDA workloads and for encoding on OBS on Windows.
Since I was thinking about moving from Windows to Linux, I decided to give Fedora a try. I installed it, got the NVIDIA drivers installed through RPMFusion, and it restarted fine. I noticed though that most of the apps wouldn't start up, instead showing the edges of the windows for a split second before disappearing. I tried switching to X11 and that did fix the issue, but since my main screen runs at a high refresh rate, switching to it would mean having the UI locked at 60Hz. I switched back to Wayland, and following the log output from journalctl -f while running one of the applications that crash, I see this error:
Firefox seems to give out more info, claiming that more than one GPU from the same vendor was detected via PCI.
Checking inxi -Fzx, I see that Wayland is running on the GPU with no screens connected to it.
I then proceeded to disable the 750 Ti manually, by doing sudo nvidia-smi drain -p 0000:23:00.0 -m 1, and the output from inxi changed to this:
Weirdly enough though, all the applications that kept crashing earlier now work fine. Checking with nvidia-smi, they also seem to be rendering on the right GPU with all the screens connected to it:
My question is, is there a way to force Wayland to use a specific GPU as the main one? Having to disable the 750 Ti means losing my secondary device for CUDA/encoding, which I need for specific workloads.
Full specs of my computer:
AMD Ryzen 5900X @ 5GHz
MSI MPG X570 Gaming Edge WiFi
NVIDIA GeForce GTX 1660 Super
NVIDIA GeForce GTX 750 Ti
NVIDIA driver 3:530.41.03-1.fc38
Fedora 38 Workstation w/ GNOME 44
Installed Wayland packages: