openSUSE / SUSEPrime

Provide nvidia-prime like package for openSUSE

prime-select.service fails to start #83

Closed bakuserid closed 2 years ago

bakuserid commented 2 years ago

My laptop is taking an extremely long time to boot. The cause seems to be that prime-select.service times out and fails to start. Here are the journal lines in which prime appears:

░░ Subject: A start job for unit prime-select.service has begun execution
░░ A start job for unit prime-select.service has begun execution.
Aug 06 14:12:25 16ITH6-openSUSE suse-prime[1518]: Boot: setting-up nvidia card
Aug 06 14:12:55 16ITH6-openSUSE systemd[1]: prime-select.service: start operation timed out. Terminating.
Aug 06 14:12:55 16ITH6-openSUSE systemd[1]: prime-select.service: Main process exited, code=killed, status=15/TERM
░░ An ExecStart= process belonging to unit prime-select.service has exited.
Aug 06 14:12:56 16ITH6-openSUSE systemd[1]: prime-select.service: Failed with result 'timeout'.
░░ The unit prime-select.service has entered the 'failed' state with result 'timeout'.
░░ Subject: A start job for unit prime-select.service has failed
░░ A start job for unit prime-select.service has finished with a failure.
Aug 06 14:12:56 16ITH6-openSUSE systemd[1]: prime-select.service: Consumed 2.182s CPU time.
░░ The unit prime-select.service completed and consumed the indicated resources.

Some background:

I am using a Lenovo Legion 5I Pro [16ITH6] with an Nvidia GeForce RTX-3050. I recently upgraded from the G05 series of the Nvidia driver and related packages to the G06 series. Prior to this change, SUSEPrime was working acceptably (except when using bbswitch, which caused the entire laptop to power off).

When initially setting up SUSEPrime, I followed the instructions regarding the files that should be in /etc/modprobe.d/, /etc/dracut.conf.d/ and /etc/udev/rules.d/. Even though these have not been updated for the G06 series, they should still be valid.

In /etc/modprobe.d/09-nvidia-modprobe-pm-G05.conf, I have

options nvidia NVreg_DynamicPowerManagement=0x02

-- modified from the original for fine-grained Runtime D3 power control.

In /etc/dracut.conf.d/90-nvidia-dracut-G05.conf, I have

omit_drivers+=" nvidia nvidia-drm nvidia-modeset nvidia-uvm bbswitch "

and in /etc/udev/rules.d/90-nvidia-udev-pm-G05.rules, I have

# Remove NVIDIA USB xHCI Host Controller devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c0330", ATTR{remove}="1"

# Remove NVIDIA USB Type-C UCSI devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x0c8000", ATTR{remove}="1"

# Remove NVIDIA Audio devices, if present
ACTION=="add", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x040300", ATTR{remove}="1"

# Enable runtime PM for NVIDIA VGA/3D controller devices on driver bind
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="auto"
ACTION=="bind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="auto"

# Disable runtime PM for NVIDIA VGA/3D controller devices on driver unbind
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", TEST=="power/control", ATTR{power/control}="on"
ACTION=="unbind", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", TEST=="power/control", ATTR{power/control}="on"

Here are the contents of /etc/systemd/system/multi-user.target.wants/prime-select.service:

[Unit]
Description=SUSEPrime systemd service
Before=display-manager.service 

[Service]
Type=oneshot
ExecStart=/usr/sbin/prime-select systemd_call
TimeoutSec=30

[Install]
WantedBy=multi-user.target

and the service's status:

× prime-select.service - SUSEPrime systemd service
     Loaded: loaded (/usr/lib/systemd/system/prime-select.service; enabled; vendor preset: disabled)
     Active: failed (Result: timeout) since Sat 2022-08-06 14:12:56 EDT; 46min ago
    Process: 1352 ExecStart=/usr/sbin/prime-select systemd_call (code=killed, signal=TERM)
   Main PID: 1352 (code=killed, signal=TERM)
        CPU: 2.182s

Aug 06 14:12:25 16ITH6-openSUSE systemd[1]: Starting SUSEPrime systemd service...
Aug 06 14:12:55 16ITH6-openSUSE systemd[1]: prime-select.service: start operation timed out. Terminating.
Aug 06 14:12:55 16ITH6-openSUSE systemd[1]: prime-select.service: Main process exited, code=killed, status=15/TERM
Aug 06 14:12:56 16ITH6-openSUSE systemd[1]: prime-select.service: Failed with result 'timeout'.
Aug 06 14:12:56 16ITH6-openSUSE systemd[1]: Failed to start SUSEPrime systemd service.
Aug 06 14:12:56 16ITH6-openSUSE systemd[1]: prime-select.service: Consumed 2.182s CPU time.
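
The timestamps in the status output line up with the unit's TimeoutSec=30: the start job begins at 14:12:25 and systemd terminates it at 14:12:55. The following is a minimal Python sketch that checks this arithmetic. It assumes the short journal timestamp format shown above and takes the year 2022 from the "Active:" line; the helper name journal_time is made up for illustration.

```python
from datetime import datetime

# Two journal lines copied from the status output above.
START = "Aug 06 14:12:25 16ITH6-openSUSE systemd[1]: Starting SUSEPrime systemd service..."
TIMEOUT = ("Aug 06 14:12:55 16ITH6-openSUSE systemd[1]: "
           "prime-select.service: start operation timed out. Terminating.")


def journal_time(line: str, year: int = 2022) -> datetime:
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a short-format journal line.

    The short format carries no year, so one has to be supplied (assumption:
    2022, as shown in the 'Active:' line of the status output).
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


elapsed = (journal_time(TIMEOUT) - journal_time(START)).total_seconds()
print(elapsed)  # 30.0, the same as TimeoutSec=30 in the unit file
```

In other words, the script was still running when the 30 second limit expired, so the interesting question is what prime-select systemd_call was waiting on during that window, not why systemd killed it.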

The loaded nvidia modules:

nvidia_drm             73728  0
nvidia_modeset       1146880  1 nvidia_drm
nvidia_uvm           2715648  0
nvidia              40849408  2 nvidia_uvm,nvidia_modeset
nvidia_wmi_ec_backlight    16384  0
wmi                    45056  3 nvidia_wmi_ec_backlight,wmi_bmof,ideapad_laptop

The issue seems to be worse when using the kernel parameter nvidia.prime=offload at boot (SUSEPrime is configured for default nvidia mode).

Could you please look into this issue? I would be happy to provide any additional information that would help diagnose it.

Thanks,

BAK

sndirsch commented 2 years ago

Hmm. Could it be that you updated directly from the G05 to the G06 driver packages? Unfortunately this does not work. The packages also conflict, and you would have had to ignore those conflicts to do it nevertheless. So if you have done so, please remove the G06 driver packages and reinstall them completely.

I'm not aware of a prime kernel driver option for nvidia module. Not sure where you've found this.

sndirsch commented 2 years ago

> I'm not aware of a prime kernel driver option for nvidia module. Not sure where you've found this.

Such an option does not exist.

> options nvidia NVreg_DynamicPowerManagement=0x02

It's not our default, since I've seen issues with it on some systems and therefore changed the default to 0x01.

sndirsch commented 2 years ago

Could you add here the output when running prime-select log-view?

bakuserid commented 2 years ago

Thanks for your responses.

I think you are right. I think I installed the G06 packages without first removing the G05 packages. I haven't uninstalled and then reinstalled the G06 packages, but the problem seems to be gone now after a few updates. Maybe it was a Plymouth-related problem.

I am using -- or trying to use -- the NVreg_DynamicPowerManagement driver module setting because it is documented in the "Driver Settings" section of the Nvidia README for Linux: Chapter 22. PCI-Express Runtime D3 (RTD3) Power Management. It seems to me that the value of this parameter will set the granularity of the driver's power management.

The SUSEPrime package itself does set this parameter in the file /etc/modprobe.d/09-nvidia-modprobe-pm-G05.conf, but it sets it to 0x01 by default. I simply changed it to 0x02, per the Nvidia documentation, to get better power management. I know this parameter value would not be valid on older (pre-Turing) Nvidia cards on Linux, but I have a more recent GPU.

Besides the Nvidia documentation and SUSEPrime's own use of this parameter, I also believe it is valid because Optimus Manager, which is similar to SUSEPrime, manipulates it as well: it places options nvidia NVreg_DynamicPowerManagement=0x02 in one of the files in /etc/modprobe.d/, with the value depending on one of its own configuration file parameters. This nvidia kernel module option is set in /etc/modprobe.d/nvidia.conf (I think; I am not on Arch now), and there are no problems there.
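
To double-check which value a modprobe.d file actually sets, the options line can be parsed mechanically. The sketch below is only an illustration: parse_modprobe_options is a hypothetical helper, not part of SUSEPrime or Optimus Manager, and it handles just the simple "options <module> key=value" form quoted in this thread.

```python
def parse_modprobe_options(text: str) -> dict[str, dict[str, str]]:
    """Collect 'options <module> key=value ...' lines from modprobe.d-style text.

    Hypothetical helper: comments, blank lines and other directives
    (alias, blacklist, install, ...) are ignored.
    """
    result: dict[str, dict[str, str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if fields[0] != "options" or len(fields) < 3:
            continue
        params = result.setdefault(fields[1], {})
        for item in fields[2:]:
            key, sep, value = item.partition("=")
            if sep:  # keep only key=value pairs
                params[key] = value
    return result


# The line quoted earlier in this thread.
SAMPLE = "options nvidia NVreg_DynamicPowerManagement=0x02\n"

opts = parse_modprobe_options(SAMPLE)
# Base 0 lets int() honour the 0x prefix.
mode = int(opts["nvidia"]["NVreg_DynamicPowerManagement"], 0)
print(mode)  # 2
```

To check a real system, the same function could be fed the contents of the file under /etc/modprobe.d/ instead of SAMPLE. Note that this only shows what the file requests, not what the loaded driver is currently using.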

Here is the output of prime-select log-view:


##SUSEPrime logfile##
[ 22:24:10 ] user_logout_waiter: started
[ 22:42:23 ] service restored by user
[ 22:46:32 ] Boot: forcing booting with nvidia, boot preference ignored
[ 22:46:34 ] updated /home/brook/.config/kdeglobals
[ 22:46:34 ] NVIDIA card correctly set
[ 22:46:34 ] HotSwitch: completed!
[ 00:51:53 ] Boot: setting-up nvidia card
[ 17:27:29 ] Boot: setting-up nvidia card
[ 17:35:27 ] Boot: setting-up nvidia card
[ 17:39:01 ] user_logout_waiter: started
[ 17:39:13 ] user_logout_waiter: X restart detected, preparing switch to offload
[ 17:39:16 ] Adding support for NVIDIA Prime Render Offload
[ 17:39:16 ] Intel card correctly set
[ 17:39:16 ] HotSwitch: starting Display Manager
[ 17:39:16 ] HotSwitch: completed!
[ 23:05:57 ] Boot: setting-up offload card
[ 01:04:21 ] user_logout_waiter: started
[ 22:38:29 ] Boot: setting-up offload card
[ 16:20:57 ] Boot: setting-up offload card
[ 15:27:10 ] Boot: setting-up offload card
[ 16:18:46 ] Boot: setting-up offload card
[ 20:35:02 ] Boot: nvidia.prime=offload kernel parameter detected!
[ 20:35:02 ] Boot: setting-up offload card
[ 21:25:55 ] Boot: nvidia.prime=offload kernel parameter detected!
[ 21:25:55 ] Boot: setting-up offload card
[ 16:36:50 ] Boot: setting-up offload card
[ 14:16:13 ] Boot: setting-up offload card
[ 22:28:39 ] Boot: setting-up offload card
[ 18:42:13 ] Boot: setting-up offload card
[ 18:46:20 ] Boot: setting-up offload card
[ 20:07:13 ] Boot: setting-up offload card
[ 20:11:47 ] Boot: nvidia.prime=offload kernel parameter detected!
[ 20:11:47 ] Boot: setting-up offload card
[ 19:07:14 ] Boot: setting-up offload card
[ 19:14:38 ] Boot: setting-up offload card
[ 19:50:46 ] Boot: setting-up nvidia card
[ 21:28:51 ] Boot: nvidia.prime=offload kernel parameter detected!
[ 21:28:51 ] Boot: setting-up offload card
[ 21:33:06 ] Boot: setting-up nvidia card
[ 21:37:10 ] user_logout_waiter: started
[ 21:37:24 ] user_logout_waiter: X restart detected, preparing switch to offload
[ 21:37:26 ] Adding support for NVIDIA Prime Render Offload
[ 21:37:26 ] Intel card correctly set
[ 21:37:26 ] HotSwitch: starting Display Manager
[ 21:37:27 ] HotSwitch: completed!
[ 11:30:46 ] Boot: setting-up nvidia card
[ 13:34:49 ] Boot: setting-up nvidia card
[ 13:44:55 ] user_logout_waiter: started
[ 13:48:31 ] Boot: setting-up nvidia card
[ 16:13:44 ] Boot: setting-up nvidia card
[ 16:13:46 ] updated /home/brook/.config/kdeglobals
[ 16:13:46 ] NVIDIA card correctly set
[ 16:13:46 ] HotSwitch: completed!
[ 00:32:04 ] Boot: nvidia.prime=offload kernel parameter detected!
[ 00:32:04 ] Boot: setting-up offload card
[ 16:40:46 ] Boot: setting-up nvidia card
[ 20:30:41 ] Boot: setting-up nvidia card
[ 14:05:57 ] Boot: nvidia.prime=offload kernel parameter detected!
[ 14:05:57 ] Boot: setting-up offload card
[ 14:12:25 ] Boot: setting-up nvidia card
[ 20:11:58 ] user_logout_waiter: started
[ 15:34:48 ] Boot: setting-up nvidia card
[ 15:34:51 ] updated /home/brook/.config/kdeglobals
[ 15:34:51 ] NVIDIA card correctly set
[ 15:34:51 ] HotSwitch: completed!
[ 14:38:30 ] Boot: setting-up nvidia card
[ 14:38:32 ] updated /home/brook/.config/kdeglobals
[ 14:38:32 ] NVIDIA card correctly set
[ 14:38:32 ] HotSwitch: completed!
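
One way to read a log like this is to pair each "Boot: setting-up nvidia card" entry with a following "NVIDIA card correctly set" line. The Python sketch below does that for the last entries of the log above. It assumes that a confirmation, when present, is logged before the next "Boot:" entry. The thread does not say what a missing confirmation means, so treat it as a hint and not as proof of a failed start. The function name nvidia_boots is made up for illustration.

```python
import re

# Matches lines such as "[ 14:12:25 ] Boot: setting-up nvidia card".
ENTRY = re.compile(r"^\[ (\d{2}:\d{2}:\d{2}) \] (.*)$")

# The last entries of the log posted above.
EXCERPT = """\
[ 14:05:57 ] Boot: nvidia.prime=offload kernel parameter detected!
[ 14:05:57 ] Boot: setting-up offload card
[ 14:12:25 ] Boot: setting-up nvidia card
[ 20:11:58 ] user_logout_waiter: started
[ 15:34:48 ] Boot: setting-up nvidia card
[ 15:34:51 ] updated /home/brook/.config/kdeglobals
[ 15:34:51 ] NVIDIA card correctly set
[ 15:34:51 ] HotSwitch: completed!
[ 14:38:30 ] Boot: setting-up nvidia card
[ 14:38:32 ] updated /home/brook/.config/kdeglobals
[ 14:38:32 ] NVIDIA card correctly set
[ 14:38:32 ] HotSwitch: completed!
"""


def nvidia_boots(log_text: str) -> list[tuple[str, bool]]:
    """Return (time, confirmed) for each 'Boot: setting-up nvidia card' entry.

    confirmed is True when 'NVIDIA card correctly set' appears before the
    next 'Boot:' entry (an assumption about how the log is laid out).
    """
    boots: list[tuple[str, bool]] = []
    pending = None  # time of an nvidia boot entry still awaiting confirmation
    for line in log_text.splitlines():
        match = ENTRY.match(line.strip())
        if not match:
            continue
        stamp, message = match.groups()
        if message.startswith("Boot:"):
            if pending is not None:
                boots.append((pending, False))
                pending = None
            if message == "Boot: setting-up nvidia card":
                pending = stamp
        elif message == "NVIDIA card correctly set" and pending is not None:
            boots.append((pending, True))
            pending = None
    if pending is not None:
        boots.append((pending, False))
    return boots


for stamp, confirmed in nvidia_boots(EXCERPT):
    print(stamp, "confirmed" if confirmed else "no confirmation")
```

On this excerpt, the 14:12:25 entry, which matches the timed-out start in the journal, has no confirmation, while the two most recent boots do.
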
sndirsch commented 2 years ago

Ok. Looks good so far now. Of course you can also set NVreg_DynamicPowerManagement=0x02 if it's stable for you. You were totally right to set nvidia.prime=offload as a kernel boot option. This option is evaluated by the startup script, and it's also documented. I think we can close this now as WORKSFORME with the fresh installation of G06 driver packages.