Bumblebee-Project / bbswitch

Disable discrete graphics (currently nvidia only)
GNU General Public License v2.0

Unable to switch off NVIDIA card via bbswitch on Thinkpad T540p #96

Closed oliverwelter closed 8 years ago

oliverwelter commented 10 years ago

Hello, I'm trying to use bbswitch-0.8 for turning off the GeForce GT 730M on a Lenovo Thinkpad T540p. The state shown in /proc/acpi/bbswitch stays on 'ON' independent of load_state and manual setting. The power consumption supports the suspicion that the card is still running.

cat /etc/modprobe.d/bbswitch.conf:
options bbswitch load_state=0 unload_state=1

I also tried to switch off manually: 
cat /proc/acpi/bbswitch:
0000:01:00.0 ON

tee /proc/acpi/bbswitch <<<OFF:
OFF

cat /proc/acpi/bbswitch:
0000:01:00.0 ON
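
The write-then-read check above can be sketched in a few lines of Python. This is only an illustration, assuming the status always has the two-field form shown in this report (`<PCI address> <ON|OFF>`); the helper names `parse_bbswitch_state` and `switch_took_effect` are made up for this example and are not part of bbswitch.

```python
from typing import Tuple


def parse_bbswitch_state(text: str) -> Tuple[str, str]:
    """Split a status line like '0000:01:00.0 ON' into (pci_address, state)."""
    fields = text.split()
    if len(fields) != 2 or fields[1] not in ("ON", "OFF"):
        raise ValueError(f"unexpected bbswitch status: {text!r}")
    return fields[0], fields[1]


def switch_took_effect(requested: str, status_after: str) -> bool:
    """Return True if the state read back after a write matches the request."""
    _, state = parse_bbswitch_state(status_after)
    return state == requested.strip().upper()


# The situation reported here: OFF was written, but the card still reads ON.
print(switch_took_effect("OFF", "0000:01:00.0 ON"))  # False
```

On a real system the second argument would be the content of /proc/acpi/bbswitch read after the write.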

dmesg | grep -C 10 bbswitch:
[   80.096972] IPv6: ADDRCONF(NETDEV_UP): wlp4s0: link is not ready
[   80.102515] iwlwifi 0000:04:00.0: L1 Enabled; Disabling L0S
[   80.102856] iwlwifi 0000:04:00.0: L1 Enabled; Disabling L0S
[   80.114086] IPv6: ADDRCONF(NETDEV_UP): wlp4s0: link is not ready
[   80.220897] e1000e: enp0s25 NIC Link is Down
[   80.318107] e1000e 0000:00:19.0: irq 42 for MSI/MSI-X
[   80.418589] e1000e 0000:00:19.0: irq 42 for MSI/MSI-X
[   80.418666] IPv6: ADDRCONF(NETDEV_UP): enp0s25: link is not ready
[   83.141353] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[   83.141382] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s25: link becomes ready
[ 1696.021986] bbswitch: version 0.8
[ 1696.021990] bbswitch: Found integrated VGA device 0000:00:02.0: \_SB_.PCI0.VID_
[ 1696.021995] bbswitch: Found discrete VGA device 0000:01:00.0: \_SB_.PCI0.PEG_.VID_
[ 1696.022003] ACPI Warning: \_SB_.PCI0.PEG_.VID_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20140424/nsarguments-95)
[ 1696.022308] bbswitch: detected an Optimus _DSM function
[ 1696.022318] pci 0000:01:00.0: enabling device (0000 -> 0003)
[ 1696.022345] bbswitch: disabling discrete graphics
[ 1696.022348] ACPI Warning: \_SB_.PCI0.PEG_.VID_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20140424/nsarguments-95)
[ 1696.033095] bbswitch: Succesfully loaded. Discrete card 0000:01:00.0 is on
[ 1718.636370] bbswitch: disabling discrete graphics
[ 1718.636381] ACPI Warning: \_SB_.PCI0.PEG_.VID_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20140424/nsarguments-95)
[ 1718.636910] ------------[ cut here ]------------
[ 1718.636917] WARNING: CPU: 1 PID: 6333 at drivers/pci/pci.c:1535 pci_disable_device+0x9c/0xb0()
[ 1718.636920] pci 0000:01:00.0: disabling already-disabled device
[ 1718.636921] Modules linked in:
[ 1718.636922]  bbswitch(O) ctr ccm acpi_call(O) dm_crypt wacom snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core hid_logitech_dj snd_hda_codec_hdmi arc4 snd_hda_codec_generic x86_pkg_temp_thermal dm_mod kvm_intel kvm iwlmvm mac80211 crc32c_intel aesni_intel aes_x86_64 i915 joydev iwlwifi fbcon bitblit cfbfillrect softcursor font cfg80211 cfbimgblt i2c_algo_bit cfbcopyarea wmi drm_kms_helper drm fb thinkpad_acpi fbdev snd_hda_intel intel_gtt snd_hda_controller e1000e rfkill agpgart snd_hda_codec ehci_pci ptp ehci_hcd video pps_core snd_hwdep vboxnetadp(O) vboxnetflt(O) vboxdrv(O)
[ 1718.636958] CPU: 1 PID: 6333 Comm: tee Tainted: G           O  3.16.3-gentoo #2
[ 1718.636960] Hardware name: LENOVO 20BFS05L00/20BFS05L00, BIOS GMET66WW (2.14 ) 07/01/2014
[ 1718.636961]  0000000000000009 ffff880370f3bdd8 ffffffff8156f45b ffff880370f3be20
[ 1718.636963]  ffff880370f3be10 ffffffff81060a28 ffff8804283ba000 ffff88042836ca30
[ 1718.636965]  0000000000000004 ffff880370f3bf50 0000000000000008 ffff880370f3be70
[ 1718.636967] Call Trace:
[ 1718.636973]  [<ffffffff8156f45b>] dump_stack+0x4e/0x7a
[ 1718.636976]  [<ffffffff81060a28>] warn_slowpath_common+0x78/0xa0
[ 1718.636978]  [<ffffffff81060a97>] warn_slowpath_fmt+0x47/0x50
[ 1718.636980]  [<ffffffff812ca6ac>] pci_disable_device+0x9c/0xb0

uname -a:
Linux barret 3.16.3-gentoo #2 SMP PREEMPT Sat Sep 20 12:56:47 CEST 2014 x86_64 Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz GenuineIntel GNU/Linux

Linux: Gentoo XOrg: 1.16.0.901 NVIDIA-Driver: official 343.13-r1 bbswitch-0.8

At the moment it is not possible to add comments on https://bugs.launchpad.net/bugs/752542 because of timeouts. I'm going to upload and link to the comment as soon as launchpad works again.

Kind Regards, Oliver

Lekensteyn commented 10 years ago

Did it work on older kernels? Can you reproduce it on newer kernels?

oliverwelter commented 10 years ago

I tried 3.17.0 getting a similar error message:

[  241.014312] bbswitch: detected an Optimus _DSM function
[  241.014326] pci 0000:01:00.0: enabling device (0000 -> 0003)
[  241.014368] bbswitch: Succesfully loaded. Discrete card 0000:01:00.0 is on
[  286.792485] bbswitch: disabling discrete graphics
[  286.792508] ACPI Warning: \_SB_.PCI0.PEG_.VID_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20140724/nsarguments-95)
[  520.003960] bbswitch: disabling discrete graphics
[  520.003986] ACPI Warning: \_SB_.PCI0.PEG_.VID_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20140724/nsarguments-95)
[  520.005243] ------------[ cut here ]------------
[  520.005260] WARNING: CPU: 2 PID: 5391 at drivers/pci/pci.c:1535 pci_disable_device+0x9c/0xb0()
[  520.005265] pci 0000:01:00.0: disabling already-disabled device
[  520.005268] Modules linked in:
[  520.005273]  bbswitch(O) ctr ccm dm_crypt uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core hid_logitech_dj snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device snd_hda_codec_hdmi joydev arc4 iwlmvm mac80211 x86_pkg_temp_thermal kvm_intel dm_mod kvm iwlwifi crc32c_intel aesni_intel aes_x86_64 cfg80211 snd_hda_codec_generic i915 fbcon bitblit cfbfillrect softcursor cfbimgblt font i2c_algo_bit cfbcopyarea wmi drm_kms_helper drm thinkpad_acpi fb fbdev snd_hda_intel intel_gtt e1000e rfkill agpgart snd_hda_controller ehci_pci snd_hda_codec ptp ehci_hcd snd_hwdep pps_core video vboxnetadp(O) vboxnetflt(O) vboxdrv(O)
[  520.005370] CPU: 2 PID: 5391 Comm: tee Tainted: G           O   3.17.0-gentoo #1
[  520.005375] Hardware name: LENOVO 20BFS05L00/20BFS05L00, BIOS GMET66WW (2.14 ) 07/01/2014
[  520.005379]  0000000000000009 ffff88008b1efdd8 ffffffff8157a982 ffff88008b1efe20
[  520.005387]  ffff88008b1efe10 ffffffff81062078 ffff88042838a000 ffff88042833fb10
[  520.005394]  0000000000000004 ffff88008b1eff50 0000000000000008 ffff88008b1efe70
[  520.005401] Call Trace:
[  520.005413]  [<ffffffff8157a982>] dump_stack+0x4e/0x7a
[  520.005423]  [<ffffffff81062078>] warn_slowpath_common+0x78/0xa0
[  520.005430]  [<ffffffff810620e7>] warn_slowpath_fmt+0x47/0x50
[  520.005437]  [<ffffffff812cff2c>] pci_disable_device+0x9c/0xb0
[  520.005443]  [<ffffffffa00681de>] 0xffffffffa00681de
[  520.005449]  [<ffffffffa00684dd>] 0xffffffffa00684dd
[  520.005458]  [<ffffffff811cafd8>] proc_reg_write+0x38/0x70
[  520.005468]  [<ffffffff81170432>] vfs_write+0xb2/0x1f0
[  520.005475]  [<ffffffff81170f81>] SyS_write+0x41/0xb0
[  520.005481]  [<ffffffff8158178b>] tracesys+0xdd/0xe2
[  520.005487] ---[ end trace c7bfb1b70d523268 ]---

oliverwelter commented 10 years ago

I'm going to test 3.14.21, 3.12.30 and 3.10.57

Lekensteyn commented 10 years ago

Please put those dumps in backticks to avoid messing up formatting:

   code here

Could you upload your acpidump to https://gist.github.com?

oliverwelter commented 10 years ago

I uploaded acpidump.txt to https://gist.github.com/oliverwelter/eff05549b68780f52dc2

oliverwelter commented 10 years ago

I checked a couple of older kernels and I was able to switch off the card using kernels 3.10.57, 3.12.30 and 3.14.21. The ACPI warning is still present in the kernel log but reading /proc/acpi/bbswitch showed OFF after disabling.

On the other hand, the described issue was present in versions 3.16.6, 3.17.0 and 3.17.1. /proc/acpi/bbswitch shows ON all the time, and if I try to disable the card twice the stack trace shown above appears.
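
The version lists above narrow down where the regression was introduced. A small sketch of that reasoning, using only the kernel versions named in this comment (the function names are hypothetical):

```python
def version_key(version: str) -> tuple:
    """Turn '3.14.21' into (3, 14, 21) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def regression_window(good, bad):
    """Return (newest working version, oldest failing version)."""
    last_good = max(good, key=version_key)
    first_bad = min(bad, key=version_key)
    if version_key(last_good) >= version_key(first_bad):
        raise ValueError("working and failing versions overlap")
    return last_good, first_bad


good = ["3.10.57", "3.12.30", "3.14.21"]  # card could be switched off
bad = ["3.16.6", "3.17.0", "3.17.1"]      # card stayed ON
print(regression_window(good, bad))       # ('3.14.21', '3.16.6')
```

So, on this machine, the change in behaviour falls somewhere between 3.14.21 and 3.16.6. Other reports in this thread do not all fit that window, so treat it as specific to this setup.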

leoluk commented 10 years ago

Works fine on 3.16.3.

bluec0re commented 10 years ago

Same issue on Arch Linux. Kernel 3.17.2 bbswitch 0.8 NVIDIA Quadro K2100M

Also with the same ACPI warnings

oliverwelter commented 9 years ago

Same issue with kernel 3.17.3 .

oliverwelter commented 9 years ago

Same issue with kernel 3.17.4

oliverwelter commented 9 years ago

Same issue with kernel 3.18.0

TheAifam5 commented 9 years ago

The same with Linux-CK 3.17-6 nVidia GT 540M Samsung RC730 - d0dde

oliverwelter commented 9 years ago

As suggested in some other ongoing issues, I successfully applied the kernel command-line parameter acpi_osi="!Windows 2013". Afterwards I was able to switch off the NVIDIA card, and the power consumption decreased as well.

bluec0re commented 9 years ago

Works for me. Works like a charm now with Arch Linux vanilla Kernel 3.17.6-1.

Thx!

jlippi commented 9 years ago

acpi_osi="!Windows 2013" allowed me to make things work with my nvidia card on my Thinkpad W540 as well. Thanks very, very much oliverwelter - this was a huge frustration! Also had to turn off nomodeset.

slartibart70 commented 9 years ago

Hi all, I just tried out your idea with the acpi_osi kernel parameter. cat /proc/acpi/bbswitch indicated 'OFF'. All went fine until I tried to run 'optirun glxgears -info', which corrupted my ext4 root filesystem, apparently because of a system crash. I'm not sure: the KDE GUI was still active, but dmesg showed a crash, so I rebooted. The final result was filesystem corruption in / (root). While I was switching the card on and off by echoing to bbswitch, other kernel panics occurred. I think I should wait until those bugs are fixed? (All tests happened on a fresh install of Fedora 21 on a Lenovo T440p.)

andrewgdunn commented 9 years ago

If anyone can provide me input, specifically @jlippi because I'm trying this on a W541. I'm in Fedora 21, bumblebee, bbswitch, and bumblebee-nvidia are all installed. The bumblebeed service is running. I have the following in my /etc/modprobe.d/bumblebee.conf:

blacklist nvidia
blacklist nouveau
options bbswitch load_state=0
options skip_optimus_dsm=1 # from the readme related to T4xx laptops
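
One thing worth checking in the configuration above: the modprobe.d format is `options <module> <parameter>=<value> ...`, so an `options` line whose first word already contains `=` has no module name, and its parameter would probably not reach bbswitch. A rough sketch of such a check follows; the function name is hypothetical and the check is deliberately simple.

```python
def check_modprobe_options(conf_text: str) -> list:
    """Return (line_number, line) pairs for 'options' lines lacking a module name."""
    problems = []
    for number, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        fields = line.split()
        if len(fields) < 2 or fields[0] != "options":
            continue
        # The word after 'options' should be a module name, not 'key=value'.
        if "=" in fields[1]:
            problems.append((number, line))
    return problems


conf = """\
blacklist nvidia
blacklist nouveau
options bbswitch load_state=0
options skip_optimus_dsm=1 # from the readme related to T4xx laptops
"""
print(check_modprobe_options(conf))  # [(4, 'options skip_optimus_dsm=1')]
```

If the last line was really entered as shown, combining both parameters on one line (`options bbswitch load_state=0 skip_optimus_dsm=1`) would be the form to try.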

When I boot the laptop, it loads gnome with the intel Haswell mobile graphics, however the battery life is less than 3 hours at best. I suspected the discrete adapter is on, and when executing:

cat /proc/acpi/bbswitch
0000:01:00.0 ON

If I try to turn it off via the tee method:

tee /proc/acpi/bbswitch <<<OFF
OFF
cat /proc/acpi/bbswitch
0000:01:00.0 ON

Here is my dmesg | tail:

[ 49.891108] NET: Registered protocol family 31
[ 49.891109] Bluetooth: HCI device and connection manager initialized
[ 49.891113] Bluetooth: HCI socket layer initialized
[ 49.891115] Bluetooth: L2CAP socket layer initialized
[ 49.891124] Bluetooth: SCO socket layer initialized
[ 49.894015] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[ 49.894017] Bluetooth: BNEP filters: protocol multicast
[ 49.894021] Bluetooth: BNEP socket layer initialized
[ 65.834595] bbswitch: disabling discrete graphics
[ 65.834607] ACPI Warning: SB.PCI0.PEG.VID._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires Package

I tried the kernel parameter of acpi_osi="!Windows 2013" but that didn't seem to make a difference, I get the same dmesg output when booting with and without that parameter. I've noticed that I also have some other kernel parameters that I assume were added after I installed bumblebee: nouveau.modeset=0 rd.driver.blacklist=nouveau

I'm pretty desperate for help. I want to use this machine, but I don't think I can if it's going to have less than 3 hours of battery life and feel so hot to the touch all the time.

gsgatlin commented 9 years ago

@storrgie you need to do the following:

edit the file

/boot/efi/EFI/fedora/grub.cfg

and add

acpi_osi='!Windows 2013'

to the kernel parameters.

I have a t540p with fedora 21 and it's working for me after these changes. I also created these rpms for fedora. I don't think the dmesg output messages are important. Check with:

cat /proc/acpi/bbswitch

after a reboot.

marcolaux commented 9 years ago

please check if the boot parameter is there with cat /proc/cmdline

if not, there might be problems with the quotes in your /etc/default/grub. try with "\" like this: acpi_osi=\"!Windows 2013\"

edit, just saw that gsgatlin suggests using ' that should also work.
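
Since the quoting is easy to get wrong, here is a small sketch for checking what actually arrived on the kernel command line. It takes the text of /proc/cmdline as a string and uses Python's `shlex` as an approximation of how quoted parameters are grouped; the function name is made up for this example.

```python
import shlex


def acpi_osi_values(cmdline: str) -> list:
    """Return the values of all acpi_osi= parameters found in a command line."""
    values = []
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        if key == "acpi_osi" and sep:
            values.append(value)
    return values


# Quotes survived: the whole string '!Windows 2013' is one value.
print(acpi_osi_values('ro rhgb "acpi_osi=!Windows 2013" quiet'))
# Quotes were lost: the value is cut off at the space.
print(acpi_osi_values("ro rhgb acpi_osi=!Windows 2013 quiet"))
```

If the second case shows up (a value of just `!Windows`), the quotes were stripped somewhere between the grub configuration and the kernel.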

jlippi commented 9 years ago

That modeset parameter looks suspect. I had to turn nomodeset off. Try getting that off.

marcolaux commented 9 years ago

if you mean nouveau.nomodeset - it does not matter if you installed the proprietary nvidia driver. it blacklists the nouveau in /etc/modprobe.d anyway.

all you need is the acpi_osi parameter.

jeannotalpin commented 9 years ago

Same problem for me on kernel 3.10.0-229.1.2.el7.x86_64 (Centos 7). Adding acpi_osi=\"!Windows 2013\" does not solve the problem. Output in dmesg when switching OFF nvidia card: "ACPI Warning: SB.PCI0.PEG.VID._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires Package"

marcolaux commented 9 years ago

i got the same message, but it works on my W540.

lspci -v:
01:00.0 VGA compatible controller: NVIDIA Corporation GK106GLM [Quadro K2100M] (rev ff) (prog-if ff)
    !!! Unknown header type 7f

cat /proc/acpi/bbswitch:
0000:01:00.0 OFF

when i switch it on via bbswitch power consumption increases by about 5 watts.

marcolaux commented 9 years ago

@jeannotalpin you have a pretty old kernel there. for kernels < 3.15 i guess the boot parameter was acpi_osi=\"!Windows 2012\" - not 2013

andrewgdunn commented 9 years ago

Thanks @hyphone, @jlippi, @gsgatlin. I was able to boot with the card off and I saw immediately that I was going to have an estimated 1 hour more of battery life.

I now have another issue: optirun won't work, and once it has been run the card won't turn back off.

This is a fresh install of Fedora 22. I have updated and installed the bumblebee, bbswitch, and bumblebee-nvidia packages. I installed the dependencies enumerated within the Fedora Wiki. selinux is enabled (i really prefer to keep it enabled) and I've not made any other changes other than what is mentioned below.

I then changed the parameters by editing /etc/default/grub and adding the acpi_osi parameter:

BOOT_IMAGE=/vmlinuz-4.0.0-1.fc22.x86_64 root=UUID=41f77cd6-196b-4490-b41a-a28aab63e00a ro rootflags=subvol=root rd.luks.uuid=luks-65bb39d2-ec92-4fed-a2bd-77b19b05f1a1 rd.luks.uuid=luks-a45e0c46-0b27-4372-9b70-8827d606aa27 "acpi_osi=!Windows 2013" rhgb quiet

I then make that modification take effect by running:

grub2-mkconfig -o /boot/efi/EFI/fedora/grub.cfg

On booting I can see the card is off by checking /proc/acpi/bbswitch.

I then run optirun glxgears and see the following:

[ 742.237302] [ERROR]Cannot access secondary GPU - error: XORG systemd-logind: failed to get session: PID 3365 does not belong to any known session
[ 742.267343] [ERROR]Aborting because fallback start is disabled.

After this the card appears to be on, and my estimate for battery life drops by at least 1 hour.

I'm impressed that the card can start off, however I'm sad I can't use the card.

Another thing I've noticed is that within gnome I cannot open up settings. I'm not sure if it's related.

gsgatlin commented 9 years ago

@storrgie I plan on upgrading to fedora 22 some time after may 26th when it is officially released by the fedora project.

I had done some work on fedora 22 in the past, see https://github.com/Bumblebee-Project/Bumblebee/issues/647 for example. But I am not using it now. I am using fedora 21 and kernel 3.19 on this particular system.

Perhaps they have changed something again in the policies. I do not have a spare t540p to test with. This is a work provided laptop. And there is only one of them.

I would suggest testing in "permissive" mode after a reboot to see if it is a selinux issue.

I would also check to see if the nvidia drivers compiled successfully with

bumblebee-nvidia --check

I will hopefully have time before May 26th to test fedora 22 again on a lenovo y470 I own personally.

If neither of these suggestions help you then you may wish to open a new issue concerning fedora 22.

marcolaux commented 9 years ago

@storrgie can you post the content of /var/log/Xorg.8.log (this is the log file of the separate bumblebee X server)?

jeannotalpin commented 9 years ago

@hyphone Centos kernel is quite old indeed. I tried with Windows 2012 instead of 2013 but same issue. It used to work with the previous Centos kernel though, without acpi_osi.

andrewgdunn commented 9 years ago

@gsgatlin @hyphone turning selinux to permissive/disabled allows bumblebeed and bbswitch to work properly (start card to run glxgears, stop card when I close the glxgears window).

I'm going to do as @gsgatlin suggests and throw F21 on this machine right now to see if everything works relatively out-of-the-box.

Another issue that I'm not sure how to get logs for is that whenever I open up the settings window in gnome shell, or any sub-window that should bring me to settings (such as power settings), it appears to launch (the top bar changes name) but then the window never shows up. I'm not even sure how to start helping the Fedora folks understand that problem.

I'll report back after the F21 install.

andrewgdunn commented 9 years ago

@gsgatlin this is a bit meta, sorry, I started with F22 because I wanted to see how the integrated graphics card would do with the wayland gdm boot, and with wayland gnome. I'm still within the window where I can return the W541 and choose to go with a pure Intel model such as the T550.

I guess my fear is, if I get the W541 working with F21, will it survive moving from F21 through F24 with me before I snag a newer model. During that time there is a lot of changes in the display sub-systems and I'm concerned that Optimus adds a level of complexity that might make the system unusable for F22+.

I'd be interested in your thoughts.

gsgatlin commented 9 years ago

@storrgie Probably they changed something in the fedora 22 selinux policies. I will try to see if I can get it to work in enforcing mode in fedora 22 hopefully next week some time. No promises that I can get it to work but I will try my best. They seem to be making a lot of changes lately in selinux that seem to affect bumblebee.

You can use fedora 22 beta version if you like it. You'll just have to use permissive mode until the selinux issues get worked out. But before I can look at that I have to push a new virtualgl (2.4), 'cause I'm rather late in getting that out. Sorry about that.

I think the problems you are having with the settings window stuff is probably a different bug. You may want to open a bugzilla about that problem. I doubt it is related to hybrid graphics. I personally use MATE desktop so I have not seen the bug you describe yet.

andrewgdunn commented 9 years ago

@gsgatlin I've got a T540p that has only the intel graphics card and a 3k screen. All things being equal (99whr battery on both systems), I'm seeing 6+ hours of life on the T540p and ~4 or less on the W541. The only differences would be the SSD manufacturer and the memory (the W541 has 32g). Even with the discrete adapter off, is energy still being used by it?

gsgatlin commented 9 years ago

@storrgie I'm sorry. I don't have a lot of ideas on how you can troubleshoot your battery life problems. I think my work provided t540p gets like 5-6 hours on battery but I've never run it down all the way so I'm not sure about that. I do know it's better than my y470 from 2011. My lenovo ideapad y470 gets more like 2 hours but I always assumed that was due to the battery being smaller and older.

maybe you can install windows and check the difference with a stopwatch? Or maybe you can use utilities such as powertop or tlp to troubleshoot but I am no expert at that. Maybe some other folks will have some ideas you can try. Sorry I could not be more help with that.

AFAIK there should be no energy drain if the card is powered off by bbswitch but I guess you would need to do more tests with and without bbswitch/bumblebee to see if it made much difference?

andrewgdunn commented 9 years ago

@gsgatlin Fedora 21 is rock solid. I did compare to the T540p again and I'm getting around 4 hours max on the W541, where the T540p is well over 6. I'm impressed with the pure Intel system.

I'll be checking back in once you're fiddling with the selinux policies, I'm happy to test for you.

andrewgdunn commented 9 years ago

@gsgatlin I'm actually thinking there may still be an issue... I'm seeing in powertop that 'GPU misc' is periodically using 15w of power. I've checked /proc/acpi/bbswitch and it says off. Is 'GPU misc' a catch-all for the iGPU and dGPU?

gsgatlin commented 9 years ago

@storrgie I'm not sure but maybe some other people watching this thread might know. Or maybe the bumblebee developers may have some ideas about what that means in powertop.

slartibart70 commented 9 years ago

Hi all, i gave bbswitch another try (also using the 'acpi_osi=!Windows 2013') parameter with kernel 3.19.5 on fedora21. Yes, the nvidia card is switched OFF according to /proc/acpi/bbswitch on my T440, but all sort of strange things happen after the laptop wakes up again after sleeping (closed the lid yesterday night being on battery, opening it up again after being connected to AC power)

Last time, i had filesystem corruptions (see comment above), this time i had problems logging in because of 'authorization subsystem not available'. I managed to log in using the virtual consoles and rebooted without the acpi_osi parameter. Since then, everything is fine again - sleeping, waking up ... no problems. But power consumption is up due to nvidia card being ON again... Any ideas or hints about what happens and how to circumvent it?

marcolaux commented 9 years ago

@slartibart70: you are running into this problem: https://github.com/Bumblebee-Project/bbswitch/issues/78

slartibart70 commented 9 years ago

Thanks!!! According to this thread, there is no solution at the moment? Best to keep nvidia on all the time....? Unsatisfactory....

marcolaux commented 9 years ago

or turn it off completely, yes. somewhere in that thread is a method called by acpi_call. with this you can turn your nvidia card completely off (so it's not switchable).

what you can do is:

i made something like this for an older macbook pro 5 (two nvidia chips) where the only possibility was to switch the gpu on boot. have a look at the scripts if you are interested: https://github.com/hyphone/mbp5linux

andrewgdunn commented 9 years ago

@gsgatlin I was looking at lspci -v and saw something that looks suspect:

01:00.0 VGA compatible controller: NVIDIA Corporation GK106GLM [Quadro K2100M] (rev ff) (prog-if ff)
    !!! Unknown header type 7f
    Kernel modules: nouveau, nvidia

Shouldn't I only be seeing nvidia only here?

I get that output when my dGPU adapter is in the OFF state, when I run optirun glxgears it changes to this:

01:00.0 VGA compatible controller: NVIDIA Corporation GK106GLM [Quadro K2100M] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: Lenovo Device 2211
    Flags: bus master, fast devsel, latency 0, IRQ 35
    Memory at b2000000 (32-bit, non-prefetchable) [size=16M]
    Memory at a0000000 (64-bit, prefetchable) [size=256M]
    Memory at b0000000 (64-bit, prefetchable) [size=32M]
    I/O ports at 4000 [size=128]
    [virtual] Expansion ROM at b3000000 [disabled] [size=512K]
    Capabilities: [60] Power Management version 3
    Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [78] Express Endpoint, MSI 00
    Capabilities: [b4] Vendor Specific Information: Len=14 <?>
    Capabilities: [100] Virtual Channel
    Capabilities: [128] Power Budgeting <?>
    Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
    Capabilities: [900] #19
    Kernel driver in use: nvidia
    Kernel modules: nouveau, nvidia

cat /proc/cmdline gives:

BOOT_IMAGE=/vmlinuz-3.19.4-200.fc21.x86_64 root=UUID=db09bd02-5c7f-4f99-af20-bc2455d4e82e ro rootflags=subvol=root rd.luks.uuid=luks-dd790a61-bfb1-40dd-bca7-c5e911840dab rd.luks.uuid=luks-b37b7561-ad7c-4a1e-9b02-607ef515d68b rhgb "acpi_osi=!Windows 2013" quiet nouveau.modeset=0 rd.driver.blacklist=nouveau

gsgatlin commented 9 years ago

@storrgie I'm really sorry. I'm not a developer and I don't really understand that well how this stuff works. Perhaps the bumblebee developers will have some ideas? I am just a fedora user who became frustrated with a lack of rpms for my distro so I tried to make some and place them at install.linux.ncsu.edu. But I am not a c programmer, for example.

Here is what I get on my work provided laptop, a lenovo thinkpad t540p.

lspci -v | grep VGA
00:02.0 VGA compatible controller: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06) (prog-if 00 [VGA controller])
01:00.0 VGA compatible controller: NVIDIA Corporation GK208M [GeForce GT 730M] (rev ff) (prog-if ff)

[gsgatlin@t540p ~]$ lspci -v | grep VGA
00:02.0 VGA compatible controller: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06) (prog-if 00 [VGA controller])
01:00.0 VGA compatible controller: NVIDIA Corporation GK208M [GeForce GT 730M] (rev a1) (prog-if 00 [VGA controller])

The bottom one is with "optirun glxgears" running at the same time.

In the UEFI screen it says:

UEFI BIOS Version GMET69WW (2.17)
UEFI BIOS Date (Year-Month-Day) 2015-01-12

I am lucky in that I have not seen any kind of filesystem corruptions. At least not yet. I've only been using it for a couple of weeks.

andrewgdunn commented 9 years ago

Sorry that I keep directing at you @gsgatlin, I hope some other devs see our discussion!

marcolaux commented 9 years ago

@storrgie the following output is perfectly ok because it looks like this when the NVIDIA GPU got turned off by bbswitch

01:00.0 VGA compatible controller: NVIDIA Corporation GK106GLM [Quadro K2100M] (rev ff) (prog-if ff) !!! Unknown header type 7f

the periodical increase of our power usage might be related to CPU scaling and could be unrelated
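
The lspci outputs posted in this thread follow a pattern: with the card off, the NVIDIA line reads `(rev ff)` and lspci reports an unknown header type, while with the card on it reads `(rev a1)`. Reads from a powered-down PCI device generally come back as all ones, which would explain the `ff`. A sketch of using that as a rough indicator (the function names are hypothetical, and this is a heuristic, not something bbswitch provides):

```python
import re
from typing import Optional


def pci_revision(lspci_line: str) -> Optional[str]:
    """Extract the two-digit hex revision from an lspci device line, if any."""
    match = re.search(r"\(rev ([0-9a-f]{2})\)", lspci_line)
    return match.group(1) if match else None


def looks_powered_off(lspci_line: str) -> bool:
    """Heuristic: a revision of 'ff' suggests the device is not responding."""
    return pci_revision(lspci_line) == "ff"


off_line = ("01:00.0 VGA compatible controller: NVIDIA Corporation "
            "GK106GLM [Quadro K2100M] (rev ff) (prog-if ff)")
on_line = ("01:00.0 VGA compatible controller: NVIDIA Corporation "
           "GK106GLM [Quadro K2100M] (rev a1) (prog-if 00 [VGA controller])")
print(looks_powered_off(off_line), looks_powered_off(on_line))  # True False
```

The state reported by /proc/acpi/bbswitch remains the primary check; this only helps to cross-check it from lspci output.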

sharms commented 9 years ago

@storrgie - I have a W541 also, and do not believe bbswitch is capable of powering down the GPU. The heat location indicates the GPU is on, and using acpi_osi breaks a bunch of things like suspend, resume etc

marcolaux commented 9 years ago

I have the W540 that should be nearly the same as the W541 except the touchpad. It works for me, GPU is off and with acpi_osi resume, suspend and everything is working. Just software hibernation from the linux side does not work but I'm using Intel Rapid Start (hardware hibernation) and that is working perfectly, too.

sharms commented 9 years ago

Actually, setting it to !Windows 2013 does make it work. The only problem I can find is the ACPI lid indicator (but forcing a suspend using 'systemctl suspend' does work). This is on a W541 with 4.0.8:

Linux soar 4.0.8-300.fc22.x86_64 #1 SMP Fri Jul 10 21:04:56 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

andrewgdunn commented 9 years ago

Would like to see if I can get some eyes on this, was working fine until I jumped to the 4.1.x kernel in F22. Now the dGPU defaults to booting ON. I've opened a new bug here: https://github.com/Bumblebee-Project/bbswitch/issues/113

marcolaux commented 9 years ago

4.1.3-1-ARCH here and everything OK

ghost commented 9 years ago

I've the same problem with my T450s (with Nvidia 950M).

The crash reporter showed this message (translated from German): A kernel problem occurred, but your kernel is tainted (flags: GOE). Kernel maintainers cannot analyze reports from tainted kernels. Affected module: bbswitch.

Reason: WARNING: CPU: 2 PID: 18356 at drivers/pci/pci.c:1550 pci_disable_device+0xba/0xd0()

OS info: Fedora 22. Kernel: 4.1.6-200.fc22.x86_64

dmesg:
WARNING: CPU: 2 PID: 18356 at drivers/pci/pci.c:1550 pci_disable_device+0xba/0xd0()
[ 1380.077084] pci 0000:04:00.0: disabling already-disabled device
[ 1380.077085] Modules linked in:
[ 1380.077086] rfcomm fuse ccm cmac xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnep arc4 iwlmvm iTCO_wdt mac80211 iTCO_vendor_support intel_rapl iosf_mbi x86_pkg_temp_thermal coretemp iwlwifi snd_hda_codec_hdmi uvcvideo snd_hda_codec_realtek snd_hda_codec_generic videobuf2_vmalloc cfg80211 kvm videobuf2_core snd_hda_intel snd_hda_controller videobuf2_memops btusb snd_hda_codec v4l2_common
[ 1380.077122] videodev btbcm snd_hda_core btintel snd_hwdep rtsx_pci_ms bluetooth snd_seq memstick i2c_i801 media mei_me snd_seq_device lpc_ich thinkpad_acpi snd_pcm shpchp mei rfkill snd_timer tpm_tis tpm snd soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_crypt 8021q garp stp llc mrp i915 rtsx_pci_sdmmc mmc_core crct10dif_pclmul crc32_pclmul crc32c_intel i2c_algo_bit drm_kms_helper ghash_clmulni_intel e1000e drm serio_raw rtsx_pci ptp mfd_core pps_core wmi video bbswitch(OE)
[ 1380.077218] CPU: 2 PID: 18356 Comm: tee Tainted: G OE 4.1.6-200.fc22.x86_64 #1
[ 1380.077222] Hardware name: LENOVO 20BWS1UT00/20BWS1UT00, BIOS JBET50WW (1.15 ) 06/10/2015
[ 1380.077224] 0000000000000000 000000007dc95ada ffff88033c7efcc8 ffffffff8179b97d
[ 1380.077228] 0000000000000000 ffff88033c7efd20 ffff88033c7efd08 ffffffff810a165a
[ 1380.077231] 0000000000000246 ffff88033f6c9000 ffff88033f64bc40 ffff88033c7eff18
[ 1380.077234] Call Trace:
[ 1380.077245] [] dump_stack+0x45/0x57
[ 1380.077261] [] warn_slowpath_common+0x8a/0xc0
[ 1380.077266] [] warn_slowpath_fmt+0x55/0x70
[ 1380.077278] [] ? pci_set_master+0x3b/0x100
[ 1380.077282] [] pci_disable_device+0xba/0xd0
[ 1380.077307] [] bbswitch_off+0xb5/0x260 [bbswitch]
[ 1380.077322] [] bbswitch_proc_write+0xa5/0xc5 [bbswitch]
[ 1380.077340] [] proc_reg_write+0x42/0x70
[ 1380.077348] [] __vfs_write+0x37/0x110
[ 1380.077352] [] ? sb_start_write+0x58/0x120
[ 1380.077362] [] ? security_file_permission+0x23/0xa0
[ 1380.077364] [] vfs_write+0xa6/0x1c0
[ 1380.077374] [] ? __schedule+0x241/0x720
[ 1380.077375] [] ? vfs_read+0x11e/0x140
[ 1380.077377] [] SyS_write+0x59/0xd0
[ 1380.077380] [] ? schedule+0x37/0x90
[ 1380.077388] [] system_call_fastpath+0x12/0x71
[ 1380.077396] ---[ end trace 2efb6a32ea8bee6f ]---