shipilev closed 8 years ago
Update: the crash happens when I use a keyboard shortcut for turning Wifi on/off while bbswitch is OFF. Following up on that route, this reliably crashes: toggling Wifi on/off while writing ON and OFF to /proc/acpi/bbswitch repeatedly (e.g. by using the script below):
#!/bin/bash
while true; do
echo ON | sudo tee /proc/acpi/bbswitch
echo OFF | sudo tee /proc/acpi/bbswitch
done
Would that mean the ACPI tables are FUBAR'ed? This might also explain why the crash is sporadic otherwise -- e.g. when the Wifi adapter wakes up from sleep? Given the reliable reproducer on my system, I would happily test any patch to bbswitch, as well as take advice on additional debugging steps.
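As one possible extra debugging step, here is a minimal sketch of a logging variant of the loop above. It records each state in a log file and flushes it to disk before touching the switch, so the last line written before a sudden reboot shows which transition was in flight. The function name, the log path and the delay are assumptions for illustration, not something bbswitch provides.

```bash
#!/bin/bash
# Sketch: write ON then OFF to a switch file, logging each step first.
# $1 = switch file (e.g. /proc/acpi/bbswitch), $2 = log file (hypothetical path).
toggle_once() {
    local switch=$1 log=$2 state
    for state in ON OFF; do
        printf '%s writing %s\n' "$(date +%T)" "$state" >> "$log"
        sync   # flush the log before the write that may reboot the machine
        echo "$state" > "$switch"
    done
}

# Usage (as root, since /proc/acpi/bbswitch is not writable by normal users):
#   while true; do toggle_once /proc/acpi/bbswitch /var/tmp/bbswitch-toggle.log; sleep 1; done
```

After a crash, the tail of the log file tells whether the machine went down on an ON or an OFF write.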
It could have multiple causes. I take it that you have no (proprietary) nvidia drivers installed that could possibly interfere?
The most likely issue I can think of is a sudden power surge when wifi and nvidia both need power. It would help if you could get a kernel trace, if there is any (syslog to a different machine or use netconsole).
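For the netconsole route, a minimal sketch could look like the following. The parameter layout is the one documented for the kernel's netconsole module; every address, port, interface name and MAC below is a placeholder (an assumption, not a value from this thread), and a wired interface is assumed since the Wifi adapter is part of the problem here.

```bash
#!/bin/bash
# Build the netconsole module parameter:
#   [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# Ports 6665/6666 are the module defaults; all arguments are placeholders.
netconsole_param() {
    local src_ip=$1 dev=$2 tgt_ip=$3 tgt_mac=$4
    printf '6665@%s/%s,6666@%s/%s\n' "$src_ip" "$dev" "$tgt_ip" "$tgt_mac"
}

# On the crashing laptop (this only prints the command; drop "echo" to run it):
echo sudo modprobe netconsole \
    "netconsole=$(netconsole_param 192.168.1.10 eth0 192.168.1.20 00:11:22:33:44:55)"

# Optionally raise the console log level so all messages are forwarded:
#   sudo dmesg -n 8
# On the receiving machine, listen for the UDP messages, e.g.:
#   nc -u -l 6666     (flag syntax differs between netcat variants)
```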
Yes, no proprietary driver is in action, and even nouveau had failed to load:
[ 1.068698] nouveau E[ DEVICE][0000:03:00.0] unknown chipset, 0x118010a2
[ 1.068700] nouveau E[ DRM] failed to create 0x80000080, -22
[ 1.068806] nouveau: probe of 0000:03:00.0 failed with error -22
I have only the pieces of dmesg and syslog after a fresh boot, because as soon as I start to mess with the wifi adapter and bbswitch, my system just reboots. It would help if it was just panicking, but it reboots without any chance for me to catch it. I will try remote syslog after I get back to my home office in two weeks. Meanwhile, here are the dmesg and syslog thingies, there are some ACPI errors in there: http://shipilev.net/stuff/bbswitch/1/dmesg http://shipilev.net/stuff/bbswitch/1/syslog
I did a few more experiments with my kernel boot options, and it seems that using pcie_aspm=force leads to the shutdown in the bbswitch ON/OFF + Wifi ON/OFF scenario above. Let me see if my system is stable without this kernel option.
It would seem forcing ASPM was a bad idea to begin with:
$ sudo lspci -vv | grep ASPM.*abled\;
LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled+ CommClk+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
...and there is no observable power saving with acpitool -B when it is forced.
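To make that lspci check easier to repeat with and without the boot option, a small sketch that summarizes the LnkCtl lines might help. The function name is made up for illustration; it only counts what lspci already prints.

```bash
#!/bin/bash
# Sketch: read `lspci -vv` output on stdin and count PCIe links by ASPM state.
aspm_summary() {
    awk '/LnkCtl:/ && /ASPM/ {
             if ($0 ~ /ASPM Disabled/)       off++   # link has ASPM turned off
             else if ($0 ~ /ASPM .*Enabled/) on++    # L0s and/or L1 enabled
         }
         END { printf "enabled=%d disabled=%d\n", on, off }'
}

# Usage:
#   sudo lspci -vv | aspm_summary
# The active policy can also be read from sysfs on kernels built with ASPM support:
#   cat /sys/module/pcie_aspm/parameters/policy
```

Fed the four LnkCtl lines quoted above, it reports one link with ASPM enabled and three with it disabled.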
Thanks, it is useful to know that pcie_aspm=force may cause this kind of issue.
Yes, pcie_aspm=force can lead to severe issues, and moreover shouldn’t be needed by anyone running a reasonably recent kernel.
Hi, I have an Asus UX32LN, which seems to have an NVidia 840M:
03:00.0 3D controller: NVIDIA Corporation GM108M [GeForce 840M] (rev ff)
Running Xubuntu 12.04 with:
Linux shade-laptop 3.13.0-36-generic #63-Ubuntu SMP Wed Sep 3 21:30:07 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
...loading bbswitch via /etc/modules, and forcing load_state=0 through /etc/modprobe.d/.
bbswitch really helps to cut down the power consumption there (my estimate is that it saves around 3W), but it seems to sporadically crash the system. When the crash happens, it happens on at least two consecutive boots. Today, it happened >10 boots in a row. After the kernel boots, the crash happens within ~5-20 seconds. Sometimes I see a snow-crash shortly before the system shuts down, sometimes it just shuts down silently. After I managed to comment out load_state=0 from modprobe.d, the crash carousel stopped.
Remarkably, after I echoed "OFF" to /proc/acpi/bbswitch, the NVidia card seems to be turned off nicely, as far as I can tell from power consumption, and no crash has been observed for at least 30 minutes now. I'll keep a watch on this. Any pointers on how to debug this issue would be appreciated. The crash leaves no trace in the system logs...
For the record, my kernel options do stuff with ACPI:
quiet splash pcie_aspm=force drm.vblankoffdelay=1 i915.semaphores=1 acpi_osi=
which may be contributing to this?
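For anyone hitting similar crashes, a quick way to see whether ASPM is being forced on the running kernel is to look at the kernel command line. This is a minimal sketch; the function name is made up for illustration, and it only checks for the exact pcie_aspm=force token.

```bash
#!/bin/bash
# Sketch: succeed if the given kernel command line forces PCIe ASPM.
# $1 = command line to inspect (defaults to the running kernel's /proc/cmdline).
has_forced_aspm() {
    local cmdline=${1:-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" pcie_aspm=force "*) return 0 ;;
        *)                     return 1 ;;
    esac
}

if [ -r /proc/cmdline ] && has_forced_aspm; then
    echo "pcie_aspm=force is set; try booting without it before blaming bbswitch"
fi
```

Padding the string with spaces on both sides keeps the match from firing on a longer option that merely contains the same text.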