Open Thaodan opened 8 years ago
Ok. Yeah. So you know your custom version is getting loaded. I guess you can try
systemctl restart bumblebeed.service
to see if it made anything better.
That at least turns the dGPU back off (and enables me to manually switch it on and off again), but I still can't use primus/optirun.
Latest kernel (4.8.7-1 on Manjaro) fixed this issue in my case. Edit: it did not; now I'm using the workaround provided previously.
Latest kernel (4.8.7-1 on Manjaro) fixed this issue in my case.
I can confirm that for me, the problem persists with 4.8.7 on Gentoo Linux.
There's also nothing in the (vanilla) kernel logs about any power management related changes, so I don't see how a kernel update could change the issue (the only "fix" would be if the newer kernel reverted the change, or for some reason power management got broken in the update).
pcie_port_pm=off again works around the issue.
Nvidia just released a new set of drivers (375.20). Can anyone report back if it solved the issue? http://www.nvidia.com/download/driverResults.aspx/111596/en-us
I think not, because the issue is part of how Optimus is handled. For the bumblebee way of using the GPU, bbswitch needs to be extended (see the pm rewrite for a good start).
For the bumblebee way of using the GPU, bbswitch needs to be extended (see the pm rewrite for a good start).
Agreed - but I still think nvidia needs to fix something, too, since even if bumblebee is disabled and bbswitch is blacklisted, if the nvidia card is not actively used on boot and thus the kernel disables the PCI-port, the nvidia driver itself will not reactivate the port if modprobing it (as I have described in the nvidia forums). Depending on the timing during boot (activating nvidia-persistenced of course might help...), I believe this also prevents prime etc.
I haven't tested 375.20 yet, though.
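To see whether the kernel has actually suspended the PCIe port or the GPU, the runtime PM attributes in sysfs can be read directly. This is a minimal sketch, not part of bumblebee or the nvidia driver: the helper name `pci_pm_state` is made up for illustration, and the addresses 0000:00:01.0 (root port) and 0000:01:00.0 (dGPU) are assumptions that need to be checked against your own lspci output.

```shell
#!/bin/sh
# Print the runtime power-management state of one PCI device.
# $1: PCI address, e.g. 0000:01:00.0
# $2: sysfs root (optional, defaults to /sys; overridable for dry runs)
pci_pm_state() {
    dev_dir="${2:-/sys}/bus/pci/devices/$1"
    if [ ! -d "$dev_dir" ]; then
        echo "$1: not present"
        return 0
    fi
    # power/control is "on" (PM disabled) or "auto" (runtime PM allowed)
    control=$(cat "$dev_dir/power/control" 2>/dev/null || echo unknown)
    # power/runtime_status is e.g. "active" or "suspended"
    status=$(cat "$dev_dir/power/runtime_status" 2>/dev/null || echo unknown)
    echo "$1: control=$control runtime_status=$status"
}

# Assumed addresses of the root port and the dGPU; adjust to your machine.
pci_pm_state 0000:00:01.0
pci_pm_state 0000:01:00.0
```

Running it once right after boot and once after modprobing nvidia should show whether the port stays suspended.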
I have just tested with nvidia 375.20 and the issue is certainly fixed for me. I am running arch with kernel 4.8.8-2-ARCH on a Dell XPS 9550. I have bbswitch loaded and I am not passing 'pcie_port_pm=off' to my kernel (which I used before to work around this issue.)
I'm not sure if the fix is with kernel 4.8.8 or with the nvidia drivers but if anybody would like me to test anything I'd be happy to oblige.
I can confirm that this issue is fixed for me as well with the 375.20 drivers. I switched from the "managed" to the "unmanaged" Fedora repo and downloaded the 375.20 drivers. They installed perfectly on kernel 4.8.8 with the default bbswitch version from the repo (not the pm_rework version). Everything works normally now, including optirun and primusrun.
Edit: I was mistaken, I did still have pcie_port_pm=off in my cmdline. I tried removing it, and primus/optirun stopped working, so you do still need the workaround, but at least bumblebee works again.
For fun, I tried using the pm_rework version of bbswitch to see if it would work without the pcie_port_pm=off workaround, but it did not work.
For me, the bottom line is that the 375.20 drivers, the default repo bbswitch, and pcie_port_pm=off now work together in full.
Due to this, I think this may have been two separate problems that occurred concurrently. First, kernel 4.8 requires the pcie pm workaround. Second, for primus/optirun to work it requires the 375.20 drivers (possibly because 375.20 adds support for xorg 1.19, and Fedora updated us to xorg 1.19 around the same time as kernel 4.8).
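Since it is easy to forget whether the workaround is still on the kernel command line, a quick check before reporting results may help. The sketch below only reads /proc/cmdline; the function name `cmdline_has` is hypothetical.

```shell
#!/bin/sh
# Succeeds if the kernel command line in $1 contains the exact parameter $2.
cmdline_has() {
    # Pad with spaces so only whole, space-separated parameters match
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

if [ -r /proc/cmdline ]; then
    if cmdline_has "$(cat /proc/cmdline)" "pcie_port_pm=off"; then
        echo "pcie_port_pm=off is active for this boot"
    else
        echo "pcie_port_pm=off is NOT active for this boot"
    fi
fi
```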
@seekermoc Thanks a lot for the info. I will try to update the managed version tomorrow. Sorry for the delay. Sometimes I miss these nvidia updates. I'm still on fedora 23.
@gsgatlin thank you for your work! do you think you'll have the repo updated for fedora 25 when it comes out in a few days?
@gsgatlin No problem, thanks for all your help.
@kmare Yes. I will update everything (centos 6,7 and fedora 23,24,25,26) at the same time. I still need to test fedora 25 though.
Just to confirm the general picture: With 375.20, I can still reproduce the original problem (unless I add pcie_port_pm=off). After all, 375.20 claims only to have fixed the (independent) issue of incompatibility with Xorg 1.19 (which I don't use yet in any case).
While it's not explicitly listed in the changelog, could the new driver update 375.26 have fixed the problem mentioned here? Has anyone tried it?
Just to confirm that this also happens with the latest stable nvidia 375.26 on kernel 4.9. Once I start the PC I can use the dedicated card without any problem. The way to reproduce this bug on my laptop is just to close the lid (sending it to sleep) and then start using the PC again. After that, bumblebee can't use the nvidia card.
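When reproducing the suspend case, it can help to record what bbswitch reports before closing the lid and again after resume. A rough sketch, assuming bbswitch is loaded and exposes its usual /proc/acpi/bbswitch file (one line with the PCI address followed by ON or OFF); the helper name `bbswitch_state` is made up here.

```shell
#!/bin/sh
# Print ON, OFF, or "unavailable" for the card managed by bbswitch.
# $1: state file (optional, defaults to /proc/acpi/bbswitch)
bbswitch_state() {
    state_file="${1:-/proc/acpi/bbswitch}"
    if [ ! -r "$state_file" ]; then
        echo "unavailable"
        return 0
    fi
    # The second whitespace-separated field is the power state
    awk '{ print $2; exit }' "$state_file"
}

echo "dGPU state: $(bbswitch_state)"
```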
Hi all,
I'm still having this problem with 4.9 and 375.26 on openSUSE Tumbleweed. If I force the kernel to switch back the PM method, the laptop starts up into rl3 and freezes before I can log in.
The Laptop is a DELL Inspiron 15 Series 7000 (7559) with a GTX960M.
Thanks Alex
On kernel 4.9, using pcie_port_pm=off allows proper usage of bumblebee, however my external display is not detected. I am powering that via a thunderbolt 3 --> thunderbolt 2 adaptor. It generally lists as DP1 when using pcie_port_pm=on, however it is not listed at all otherwise. I have tried using intel-virtual-output as stated here, to no avail.
Edit: realized this was due to another issue and not bumblebee; tb needs to be set to legacy mode to work properly in linux
I'm using a laptop with an Nvidia 940M (Optimus). I was using Bumblebee to switch, but after upgrading the kernel to 4.8 and later 4.9 I experienced crashes and poor startup and shutdown times, all of which stopped when I reverted to using the Intel card only with Nouveau. I am on openSUSE Tumbleweed. Someone suggested I use only the proprietary driver (375.26) with PRIME, but it did not really solve anything: the Plasma desktop wouldn't start at all.
I still get this error:
ERROR: Unable to find the kernel source tree for the currently running kernel. Please make sure you have installed the kernel source files for your kernel and that they are properly configured; on Red Hat Linux systems, for example, be sure you have the 'kernel-source' or 'kernel-devel' RPM installed. If you know the correct kernel source files are installed, you may specify the kernel source path with the '--kernel-source-path' command line option.
This happens when I run sudo bumblebee-nvidia --debug on Fedora 25. I installed the managed repo using this guide: https://fedoraproject.org/wiki/Bumblebee#Using_bumblebee_software
Kernel: 4.9.11-200.fc25.x86_64
Hardware: Lenovo y510p CPU: i5 4200m GPU: Nvidia GT 755M
@rupek1995 Do you have the kernel-devel package installed and is it the same version as your running kernel? (uname -r)
For some reason lately Fedora has been installing with kernel-debug-devel instead of kernel-devel, and when you try to install kernel-devel it will say it's already installed. You need to remove the debug one first, then install the normal one.
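One way to check the kernel-devel question without relying on package names is to look for the build tree of the running kernel. This is a sketch under the assumption that, as on most distributions, the headers are reachable through /lib/modules/<release>/build; `kernel_build_dir` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the conventional build-tree path for a given kernel release.
kernel_build_dir() {
    echo "/lib/modules/$1/build"
}

running=$(uname -r)
build_dir=$(kernel_build_dir "$running")

# A dangling symlink (headers not installed) fails the -e test
if [ -e "$build_dir/Makefile" ]; then
    echo "Build tree for $running found at $build_dir"
else
    echo "No build tree for $running; a kernel-devel matching uname -r is probably missing"
fi
```

If the second message appears right after a kernel update, rebooting into the new kernel before installing the driver is worth trying first.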
EDIT: After rebooting for the second time system froze for about 30 seconds, and after I logged in there were SELinux errors about systemd, bbswitch, nvidia.ko and gnomeshell. Will removing SELinux fix this?
There was a kernel update after my post, I updated it, managed to successfully download the kernel-devel for my kernel, deleted kernel-debug-devel just in case. Nvidia driver unpacks... but now installation prints out an error like this:
`> -> Kernel module compilation complete.
ERROR: Unable to load the kernel module 'nvidia.ko'. This happens most frequently when this kernel module was built against the wrong or improperly configured kernel sources, with a version of gcc that differs from the one used to build the target kernel, or if a driver such as rivafb, nvidiafb, or nouveau is present and prevents the NVIDIA kernel module from obtaining ownership of the NVIDIA graphics device(s), or no NVIDIA GPU installed in this system is supported by this NVIDIA Linux graphics driver release.
Please see the log entries 'Kernel module load error' and 'Kernel messages' at the end of the file '/var/log/nvidia-installer.log' for more information.
-> Kernel module load error: Permission denied
-> Kernel messages:
[ 68.542964] iwlwifi 0000:08:00.0: Radio type=0x2-0x0-0x0
[ 68.595688] IPv6: ADDRCONF(NETDEV_UP): wlp8s0: link is not ready
[ 69.121635] wlp8s0: authenticate with 18:a6:f7:65:30:b4
[ 69.125025] wlp8s0: send auth to 18:a6:f7:65:30:b4 (try 1/3)
[ 69.127213] wlp8s0: authenticated
[ 69.128834] wlp8s0: associate with 18:a6:f7:65:30:b4 (try 1/3)
[ 69.133463] wlp8s0: RX AssocResp from 18:a6:f7:65:30:b4 (capab=0x431 status=0 aid=2)
[ 69.158037] wlp8s0: associated
[ 69.158084] IPv6: ADDRCONF(NETDEV_CHANGE): wlp8s0: link becomes ready
[ 71.708158] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[ 75.815808] Netfilter messages via NETLINK v0.30.
[ 75.846240] ip_set: protocol 6
[ 95.459594] tun: Universal TUN/TAP device driver, 1.6
[ 95.459595] tun: (C) 1999-2004 Max Krasnyansky maxk@qualcomm.com
[ 95.499851] virbr0: port 1(virbr0-nic) entered blocking state
[ 95.499854] virbr0: port 1(virbr0-nic) entered disabled state
[ 95.499942] device virbr0-nic entered promiscuous mode
[ 96.510578] virbr0: port 1(virbr0-nic) entered blocking state
[ 96.510580] virbr0: port 1(virbr0-nic) entered listening state
[ 98.320456] virbr0: port 1(virbr0-nic) entered disabled state
[ 332.816405] mce: [Hardware Error]: Machine check events logged
[ 794.335093] fuse init (API version 7.26)
[ 796.061920] Bluetooth: RFCOMM TTY layer initialized
[ 796.061928] Bluetooth: RFCOMM socket layer initialized
[ 796.061981] Bluetooth: RFCOMM ver 1.11
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.`
pcie_port_pm=off works for me:
Dell 3542, Fedora 25, kernel 4.9, GeForce 840M
@xen0f0n Thanks for the tip! I already managed to get it to work by switching SELinux to permissive mode.
If anyone has this problem and pcie_port_pm=off doesn't work for you, you can try SELinux method:
- Update kernel to newest version
- Set SELinux to permissive mode (sudo dnf install /usr/bin/system-config-selinux* - using this tool)
- Reboot Fedora twice - on second reboot bumblebee should install during login (that's why it might freeze for about a minute or two)
- Check if it works - use bumblebee-nvidia --check
- ???
- PROFIT
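For those who prefer the command line over the GUI tool, the SELinux mode can also be inspected from a shell. This is only a sketch: `selinux_configured_mode` is a made-up helper that reads the SELINUX= line from /etc/selinux/config, and getenforce is used only if it is installed. Keep in mind that permissive mode only logs denials instead of blocking them, so it is a diagnostic step rather than a fix.

```shell
#!/bin/sh
# Print the mode configured for the next boot (enforcing/permissive/disabled).
# $1: config file (optional, defaults to /etc/selinux/config)
selinux_configured_mode() {
    config="${1:-/etc/selinux/config}"
    if [ ! -r "$config" ]; then
        echo "unknown"
        return 0
    fi
    # Match only the uncommented SELINUX= line, not SELINUXTYPE=
    sed -n 's/^SELINUX=\([a-z]*\).*$/\1/p' "$config" | tail -n 1
}

if command -v getenforce >/dev/null 2>&1; then
    echo "current mode: $(getenforce)"
fi
echo "configured mode: $(selinux_configured_mode)"

# To switch to permissive until the next reboot (needs root):
#   setenforce 0
```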
Hi, instead of disabling runtime PCI power management, would it be OK to selectively enable PCI runtime power management through udev? The following works fine so far:
First, get the device ids using lspci -k.
By trial and error, disabling power management for
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 05)
Subsystem: ASUSTeK Computer Inc. Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) (rev 05)
Kernel driver in use: pcieport
Kernel modules: shpchp
01:00.0 3D controller: NVIDIA Corporation GP107M [GeForce GTX 1050 Mobile] (rev a1)
Subsystem: ASUSTeK Computer Inc. GP107M [GeForce GTX 1050 Mobile]
Kernel driver in use: nvidia
Kernel modules: nouveau, nvidia_drm, nvidia
was sufficient. This was possible by the following workaround:
/etc/udev/rules.d/pci_pm.rules
# use lspci -k to get bus ids
#ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:00.0", ATTR{power/control}="auto"
#ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:01.0", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:02.0", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:08.0", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:14.0", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:14.2", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:15.0", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:16.0", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:17.0", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:1c.0", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:1c.3", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:1c.6", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:1d.0", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:1f.0", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:1f.2", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:1f.3", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:00:1f.4", ATTR{power/control}="auto"
#NVIDIA
#ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:01:00.0", ATTR{power/control}="off"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:02:00.0", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:03:00.0", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:04:00.0", ATTR{power/control}="auto"
ACTION=="add", SUBSYSTEM=="pci", KERNELS=="0000:05:00.0", ATTR{power/control}="auto"
Using powertop, I could verify that the other PCI devices appear to be power managed, and laptop battery usage is almost as good as if PM were fully enabled.
If there is an easier way to do this, without going through the bus ids etc, life would be easier. But now I can use the laptop for coding and gaming, without rebooting in between, and with power management enabled. Please advise if anything is missing or wrong. Thanks.
Actually I can tell you that this works for me.
Run sudo ~/nvidia_reset.sh, and the driver compiles and installs.
bash-4.4$ cat ~/nvidia_reset.sh
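#!/bin/bash -eEu
# Stop the services that hold the GPU
systemctl stop bumblebeed
systemctl stop bumblebee-nvidia
# Unload the driver and the power-switch module
rmmod nvidia
rmmod bbswitch
# Remove the GPU (01:00.0) from the PCI bus, then rescan from its parent bridge (00:01.0)
echo 1 > /sys/bus/pci/devices/0000:01:00.0/remove
echo 1 > /sys/bus/pci/devices/0000:00:01.0/rescan
# Reload bbswitch and restart the services
modprobe bbswitch
systemctl start bumblebeed
systemctl start bumblebee-nvidia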
If there is an easier way to do this, without going through the bus ids etc, life would be easier
SUBSYSTEM!="pci", GOTO="pci_end"
ACTION!="add", GOTO="pci_end"
# Disable PM for NVIDIA to overcome "issue" in the nvidia driver
KERNELS=="0000:01:00.0", GOTO="pci_end"
TEST=="power/control", ATTR{power/control}="auto"
LABEL="pci_end"
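To try the same policy on a running system before committing it to a udev rule, the power/control files can be written directly. A minimal sketch, assuming the dGPU sits at 0000:01:00.0 as in the rule above; `pci_pm_auto_except` is a hypothetical helper. Unlike KERNELS== in udev, it skips only the exact address given, not devices below it, and the settings are lost on reboot.

```shell
#!/bin/sh
# Set power/control to "auto" for every PCI device except one.
# $1: PCI address to leave untouched, e.g. 0000:01:00.0
# $2: sysfs root (optional, defaults to /sys; overridable for dry runs)
pci_pm_auto_except() {
    skip="$1"
    root="${2:-/sys}"
    for ctl in "$root"/bus/pci/devices/*/power/control; do
        # Guard against an unmatched glob
        [ -e "$ctl" ] || continue
        dev=${ctl%/power/control}
        dev=${dev##*/}
        if [ "$dev" = "$skip" ]; then
            echo "skip $dev"
            continue
        fi
        echo auto > "$ctl" && echo "auto $dev"
    done
}

# On a real system this needs root:
#   pci_pm_auto_except 0000:01:00.0
```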
After updating to linux 4.8 the nvidia driver says your gpu isn't supported when trying to access with primus:
uname:
uname -a Linux hellion 4.8.2-pf #1 SMP PREEMPT Tue Oct 18 10:19:55 CEST 2016 x86_64 GNU/Linux