Bumblebee-Project / bbswitch

Disable discrete graphics (currently nvidia only)
GNU General Public License v2.0

segfault after attempt to switch back ON nvidia card on ul30vt #114

Open hamacekh opened 9 years ago

hamacekh commented 9 years ago

I tried the following:

# modprobe bbswitch
# echo OFF > /proc/acpi/bbswitch
# echo ON > /proc/acpi/bbswitch

OFF seems to work fine: power consumption drops and the card appears to be offline. ON does not turn the card back on. Other ways of achieving the same goal fail as well.
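To confirm what bbswitch itself reports after each write, the state can be read back from the same proc file. The sketch below assumes the status line has the form `<PCI address> <ON|OFF>` (for example `0000:01:00.0 ON`), as the bbswitch README describes; `bbswitch_state` is a hypothetical helper name, not part of bbswitch.

```shell
# Hypothetical helper: print only the state field (ON or OFF)
# of a bbswitch status line read from standard input.
# Assumes the line looks like "0000:01:00.0 ON".
bbswitch_state() {
    awk '{ print $2 }'
}
```

As root, `bbswitch_state < /proc/acpi/bbswitch` after each `echo` would show whether the module believes the switch happened.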

# modprobe bbswitch load_state=0 unload_state=1
# modprobe -r bbswitch

Again, bbswitch loads OK and turns the card off OK, but it cannot be unloaded; the attempt fails with the message: device or resource busy

Including system information as described in https://github.com/Bumblebee-Project/bbswitch#reporting-bugs

Linux hami-laptop-lin 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1+deb8u3 (2015-08-04) x86_64 GNU/Linux
Video driver: nouveau 1:1.0.11-1
Video card: geforce 210m
Xorg version: 1:7.7+7
Distribution: Debian 8 stable
System info: https://bugs.launchpad.net/lpbugreporter/+bug/752542/+attachment/4444002/+files/ASUSTeK_Computer_Inc.-UL30VT.tar.gz

dump_info:
0000:00:00.0 060000
0000:00:01.0 060400 SB.PCI0.P0P1
0000:00:02.0 030000 SB.PCI0.VGA_
0000:00:1a.0 0c0300 SB.PCI0.USB3
0000:00:1a.1 0c0300 SB.PCI0.USB4
0000:00:1a.2 0c0300 SB.PCI0.USB6
0000:00:1a.7 0c0320 SB.PCI0.USBE
0000:00:1b.0 040300 SB.PCI0.HDAC
0000:00:1c.0 060400 SB.PCI0.P0P2
0000:00:1c.1 060400 SB.PCI0.P0P3
0000:00:1c.5 060400 SB.PCI0.P0P8
0000:00:1d.0 0c0300 SB.PCI0.USB0
0000:00:1d.1 0c0300 SB.PCI0.USB1
0000:00:1d.2 0c0300 SB.PCI0.USB2
0000:00:1d.7 0c0320 SB.PCI0.EUSB
0000:00:1e.0 060401 SB.PCI0.P0P9
0000:00:1f.0 060100 SB.PCI0.SBRG
0000:00:1f.2 010601 SB.PCI0.IDE0
0000:03:00.0 028000 SB.PCI0.P0P3.WLAN
0000:04:00.0 020000 SB.PCI0.P0P8.LAN_

dmesg | grep -C 10 bbswitch:
[   39.933460] wlan0: authenticated
[   39.936134] wlan0: associate with 10:fe:ed:39:c2:26 (try 1/3)
[   39.940598] wlan0: RX AssocResp from 10:fe:ed:39:c2:26 (capab=0x431 status=0 aid=2)
[   39.948913] wlan0: associated
[   39.948945] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[  328.272593] perf interrupt took too long (2516 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[  482.901377] Key type dns_resolver registered
[  482.914694] FS-Cache: Netfs 'cifs' registered for caching
[  482.914737] Key type cifs.spnego registered
[  482.914749] Key type cifs.idmap registered
[ 1372.492485] bbswitch: version 0.8
[ 1372.492502] bbswitch: Found integrated VGA device 0000:00:02.0: \_SB_.PCI0.VGA_
[ 1372.492525] bbswitch: Found discrete VGA device 0000:01:00.0: \_SB_.PCI0.P0P1.VGA_
[ 1372.492545] ACPI Warning: \_SB_.PCI0.P0P1.VGA_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20140424/nsarguments-95)
[ 1372.492626] ACPI Warning: \_SB_.PCI0.P0P1.VGA_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20140424/nsarguments-95)
[ 1372.492772] bbswitch: detected a nVidia _DSM function
[ 1372.492795] pci 0000:01:00.0: enabling device (0000 -> 0003)
[ 1372.492865] bbswitch: Succesfully loaded. Discrete card 0000:01:00.0 is on
[ 1416.523643] bbswitch: disabling discrete graphics
[ 1416.536138] ACPI Warning: \_SB_.PCI0.P0P1.VGA_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20140424/nsarguments-95)
[ 1416.572941] pciehp 0000:00:01.0:pcie04: Card not present on Slot(16)
[ 1416.666697] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=none,decodes=io+mem:owns=none
[ 1440.120452] Monitor-Mwait will be used to enter C-2 state
[ 1440.120463] Monitor-Mwait will be used to enter C-3 state
[ 1454.733253] BUG: unable to handle kernel paging request at 000000003d5f8038
[ 1454.736143] IP: [<ffffffffa0911035>] dis_dev_get+0x15/0x40 [bbswitch]
[ 1454.737009] PGD 7205f067 PUD 721f0067 PMD 0 
[ 1454.737009] Oops: 0000 [#1] SMP 
[ 1454.737009] Modules linked in: bbswitch(O) md4 hmac nls_utf8 cifs dns_resolver ctr ccm binfmt_misc bnep pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) snd_hda_codec_hdmi nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc joydev snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support mxm_wmi ecb uvcvideo btusb bluetooth videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common 6lowpan_iphc videodev media arc4 iwldvm mac80211 coretemp kvm_intel kvm evdev psmouse serio_raw pcspkr iwlwifi snd_hda_intel snd_hda_controller lpc_ich mfd_core snd_hda_codec i915 snd_hwdep snd_pcm cfg80211 snd_timer snd soundcore drm_kms_helper asus_laptop sparse_keymap drm rfkill i2c_algo_bit input_polldev shpchp tpm_tis i2c_core tpm wmi acpi_cpufreq ac video battery processor button

Lekensteyn commented 9 years ago

This is the problematic part, can you show the full trace following the BUG?

[ 1454.733253] BUG: unable to handle kernel paging request at 000000003d5f8038
[ 1454.736143] IP: [<ffffffffa0911035>] dis_dev_get+0x15/0x40 [bbswitch]
[ 1454.737009] PGD 7205f067 PUD 721f0067 PMD 0 
[ 1454.737009] Oops: 0000 [#1] SMP 
[ 1454.737009] Modules linked in: bbswitch(O) md4 hmac nls_utf8 cifs dns_resolver ctr ccm binfmt_misc bnep pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) snd_hda_codec_hdmi nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc joydev snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support mxm_wmi ecb uvcvideo btusb bluetooth videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common 6lowpan_iphc videodev media arc4 iwldvm mac80211 coretemp kvm_intel kvm evdev psmouse serio_raw pcspkr iwlwifi snd_hda_intel snd_hda_controller lpc_ich mfd_core snd_hda_codec i915 snd_hwdep snd_pcm cfg80211 snd_timer snd soundcore drm_kms_helper asus_laptop sparse_keymap drm rfkill i2c_algo_bit input_polldev shpchp tpm_tis i2c_core tpm wmi acpi_cpufreq ac video battery processor button

By the way, where does "3.16.7-ckt11-1+deb8u3" come from? Are you actually running that -ck kernel or the default one included with Debian?
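One way to capture everything from the BUG line to the end of the trace is to filter the kernel log between those two markers. The sketch below is only an illustration: `extract_oops` is a hypothetical helper name, and it assumes the oops begins with a line containing `BUG:` and ends with a line containing `end trace`.

```shell
# Hypothetical helper: read a kernel log on standard input and print
# only the lines from the first "BUG:" line through the matching
# "end trace" line. Assumes both markers are present in the log.
extract_oops() {
    sed -n '/BUG:/,/end trace/p'
}
```

Used as `dmesg | extract_oops > oops.txt`, this would leave a file that can be pasted into the report in full.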

hamacekh commented 9 years ago

I hope this is what you are looking for:

 [ 1454.733253] BUG: unable to handle kernel paging request at 000000003d5f8038
 [ 1454.736143] IP: [<ffffffffa0911035>] dis_dev_get+0x15/0x40 [bbswitch]
 [ 1454.737009] PGD 7205f067 PUD 721f0067 PMD 0 
 [ 1454.737009] Oops: 0000 [#1] SMP 
 [ 1454.737009] Modules linked in: bbswitch(O) md4 hmac nls_utf8 cifs dns_resolver ctr ccm binfmt_misc bnep pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) snd_hda_codec_hdmi nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc joydev snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support mxm_wmi ecb uvcvideo btusb bluetooth videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common 6lowpan_iphc videodev media arc4 iwldvm mac80211 coretemp kvm_intel kvm evdev psmouse serio_raw pcspkr iwlwifi snd_hda_intel snd_hda_controller lpc_ich mfd_core snd_hda_codec i915 snd_hwdep snd_pcm cfg80211 snd_timer snd soundcore drm_kms_helper asus_laptop sparse_keymap drm rfkill i2c_algo_bit input_polldev shpchp tpm_tis i2c_core tpm wmi acpi_cpufreq ac video battery processor button loop fuse parport_pc ppdev lp parport autofs4 ext4 crc16 mbcache jbd2 xts gf128mul algif_skcipher af_alg dm_crypt dm_mod sg sd_mod crc_t10dif crct10dif_generic crct10dif_common ahci libahci libata scsi_mod atl1c thermal thermal_sys uhci_hcd ehci_pci ehci_hcd usbcore usb_common
 [ 1454.737009] CPU: 1 PID: 4958 Comm: bash Tainted: G           O  3.16.0-4-amd64 #1 Debian 3.16.7-ckt11-1+deb8u3
 [ 1454.737009] Hardware name: ASUSTeK Computer Inc.         UL30VT              /UL30VT    , BIOS 211     07/14/2010
 [ 1454.737009] task: ffff880074cde1d0 ti: ffff8800b7620000 task.ti: ffff8800b7620000
 [ 1454.737009] RIP: 0010:[<ffffffffa0911035>]  [<ffffffffa0911035>] dis_dev_get+0x15/0x40 [bbswitch]
 [ 1454.737009] RSP: 0018:ffff8800b7623ed0  EFLAGS: 00010206
 [ 1454.737009] RAX: 000000003d5f8000 RBX: 0000000000000003 RCX: 0000000000000000
 [ 1454.737009] RDX: fffffffffffffff2 RSI: 00000000023a540b RDI: ffff8800b7623edb
 [ 1454.737009] RBP: ffff8800b7623ed8 R08: 0000000000000000 R09: ffff880073f70754
 [ 1454.737009] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000003
 [ 1454.737009] R13: ffff8800b7623f58 R14: 0000000000000001 R15: 0000000000000000
 [ 1454.737009] FS:  00007f8e45f8d700(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
 [ 1454.737009] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [ 1454.737009] CR2: 000000003d5f8038 CR3: 00000000b6b60000 CR4: 00000000000407e0
 [ 1454.737009] Stack:
 [ 1454.737009]  ffffffffa0911576 ffff8800720a4e4f 00000000e0a096d2 ffff880073f70700
 [ 1454.737009]  00000000023a5408 ffffffff812068e9 ffff8800b7623f58 ffff880072c10e00
 [ 1454.737009]  ffffffff811a8252 ffff8800b7b27900 ffff880072c10e00 ffff880072c10e00
 [ 1454.737009] Call Trace:
 [ 1454.737009]  [<ffffffffa0911576>] ? bbswitch_proc_write+0x46/0xac [bbswitch]
 [ 1454.737009]  [<ffffffff812068e9>] ? proc_reg_write+0x39/0x70
 [ 1454.737009]  [<ffffffff811a8252>] ? vfs_write+0xb2/0x1f0
 [ 1454.737009]  [<ffffffff811a8d92>] ? SyS_write+0x42/0xa0
 [ 1454.737009]  [<ffffffff8151158d>] ? system_call_fast_compare_end+0x10/0x15
 [ 1454.737009] Code: 48 c7 c6 e0 10 91 a0 e9 ca 8a 8b e0 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 05 fc 23 00 00 48 8b 40 10 48 85 c0 74 1b <48> 8b 78 38 48 85 ff 74 12 48 81 c7 98 00 00 00 be 04 00 00 00 
 [ 1454.854943] RIP  [<ffffffffa0911035>] dis_dev_get+0x15/0x40 [bbswitch]
 [ 1454.854943]  RSP <ffff8800b7623ed0>
 [ 1454.854943] CR2: 000000003d5f8038
 [ 1454.867885] ---[ end trace 544df569cd2bf62f ]---

I didn't switch kernels. I am running the kernel that comes with Debian 8 stable.
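A quick check of the register dump above: the faulting address in CR2 equals RAX plus 0x38, which matches the marked instruction bytes `<48> 8b 78 38` when read as a load from offset 0x38 of RAX. That would suggest dis_dev_get followed a non-NULL pointer that was no longer valid, although this is an interpretation of the dump and not a confirmed diagnosis. The arithmetic can be reproduced as follows; `fault_address` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: add a base register value and a displacement
# and print the result in hex, to compare against CR2 in the oops.
fault_address() {
    printf '0x%x\n' "$(( $1 + $2 ))"
}

# RAX and the 0x38 displacement are taken from the trace above.
fault_address 0x3d5f8000 0x38   # prints 0x3d5f8038, the CR2 value
```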

hamacekh commented 9 years ago

The kernel comes from Debian 8 stable from here: https://packages.debian.org/jessie/linux-image-3.16.0-4-amd64

piesu commented 9 years ago

@hamacekh I have the same issue on the same hardware. 3.14 seems to be the last version on which bbswitch works correctly (or possibly 3.15, which I haven't tested): 3.14 works fine, 3.16 crashes. I wonder whether this is connected with https://github.com/Bumblebee-Project/bbswitch/issues/112, as the same kernel version seems to have broken usage there.

ArchangeGabriel commented 8 years ago

@piesu Could you possibly try to bisect the issue?

piesu commented 8 years ago

@ArchangeGabriel I could try in a week or so. You want me to compile different versions from git, right?

ArchangeGabriel commented 8 years ago

Yes, but of the kernel, and probably around https://github.com/torvalds/linux/commit/faae404ebdc6bba744919d82e64c16448eb24a36.

git has a bisect tool that helps you figure out which commit breaks it for you. For instance, the workflow would be to clone the kernel tree at https://github.com/torvalds/linux/commit/2447a5a8ea1c31b036ac482868cb16448d12c59a, compile it and the headers, compile bbswitch against it and see if it works.

Alternatively (but not guaranteed to work), you can just remove the added line from the code by adding a patch in the default kernel package of your distro, compile it and see if it works.
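The bisect workflow could look roughly like the sketch below. It assumes an existing clone of the kernel tree and uses only standard `git bisect` subcommands; the helper names `start_kernel_bisect` and `mark_commit` are hypothetical, and which refs count as good and bad is something the tester has to supply.

```shell
# Hypothetical helper: begin a bisect in an existing kernel clone.
# $1 = path to the clone, $2 = known-good ref, $3 = known-bad ref.
start_kernel_bisect() {
    git -C "$1" bisect start "$3" "$2"
}

# Hypothetical helper: record the verdict for the commit that git
# currently has checked out, after building and testing it.
# $1 = path to the clone, $2 = good, bad or skip.
mark_commit() {
    case $2 in
        good|bad|skip) git -C "$1" bisect "$2" ;;
        *) echo "usage: mark_commit <repo> good|bad|skip" >&2; return 2 ;;
    esac
}
```

With the versions reported earlier in this thread, one plausible starting point would be `start_kernel_bisect ~/linux v3.14 v3.16` (the path and the use of release tags are assumptions). After each step, build and boot that kernel, build bbswitch against it, repeat the OFF/ON test, then call `mark_commit ~/linux good` or `mark_commit ~/linux bad` until git names the first bad commit. `git bisect reset` returns the tree to its original state.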

Lekensteyn commented 8 years ago

Please try applying this patch to bbswitch.c and report results: https://github.com/Bumblebee-Project/bbswitch/issues/100#issuecomment-219879348