QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/

iwlwifi sometimes needs reload on suspend with kernel 4.9 #3008

Closed dmoerner closed 1 month ago

dmoerner commented 7 years ago

Qubes OS version (e.g., R3.2):

R3.2, with kernel-4.9.35-19.pvops.qubes

Affected TemplateVMs (e.g., fedora-23, if applicable):

fedora-25

After today's upgrade to kernel-4.9 from kernel-4.4, I sometimes lose wireless on suspend. I have an Intel AC 7265 chip. I need to unload and reload the module in sys-net to get wireless back. I see the following in dmesg after suspend:

[ 4434.400132] iwlwifi 0000:00:01.0: Failed to load firmware chunk!
[ 4434.400169] iwlwifi 0000:00:01.0: Could not load the [0] uCode section
[ 4434.400211] iwlwifi 0000:00:01.0: Failed to start INIT ucode: -110
[ 4434.400234] iwlwifi 0000:00:01.0: Failed to run INIT ucode: -110

This does not always seem to occur, but I can't isolate why it fails.

marmarek commented 7 years ago

I can reproduce it on Qubes 4.0, with 4.9.35 in sys-net. For now, the workaround is to add the driver (iwlmvm in my case) to /rw/config/suspend-module-blacklist.
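The workaround above amounts to adding one line to a file in sys-net. A minimal sketch of doing that without creating duplicate entries follows; the function name `add_to_suspend_blacklist` is hypothetical, and the one-module-per-line file layout is an assumption rather than something stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper: append a module name to a suspend blacklist file
# unless an identical line is already present.
# Assumption: the file holds one module name per line.
add_to_suspend_blacklist() {
    module=$1
    # Default path is the one named in this thread.
    file=${2:-/rw/config/suspend-module-blacklist}
    # -x matches the whole line, -F treats the name as a fixed string.
    if [ -f "$file" ] && grep -qxF "$module" "$file"; then
        return 0
    fi
    printf '%s\n' "$module" >> "$file"
}
```

Run as root inside sys-net, this would be called as `add_to_suspend_blacklist iwlmvm` (or with `iwldvm`, depending on which driver `lsmod` shows for the card).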

micahflee commented 7 years ago

I think I'm having this same problem. I haven't tried blacklisting iwlmvm yet though. Here's what happens after waking up from suspend:

[user@sys-net ~]$ iwconfig wlp0s1 
wlp0s1    IEEE 802.11  ESSID:off/any  
          Mode:Managed  Access Point: Not-Associated   Tx-Power=22 dBm   
          Retry short limit:7   RTS thr:off   Fragment thr:off
          Power Management:on

[user@sys-net ~]$ iwlist wlp0s1 scan
wlp0s1    Failed to read scan data : Network is down

[user@sys-net ~]$ ifconfig wlp0s1
wlp0s1: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 0a:16:47:09:50:23  txqueuelen 1000  (Ethernet)
        RX packets 473208  bytes 520017957 (495.9 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 290478  bytes 55321101 (52.7 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[user@sys-net ~]$ sudo ifconfig wlp0s1 up
SIOCSIFFLAGS: Connection timed out
[user@sys-net ~]$ sudo service network restart
Restarting network (via systemctl):                        [  OK  ]
[user@sys-net ~]$ sudo ifconfig wlp0s1 up
SIOCSIFFLAGS: Connection timed out

dmoerner commented 7 years ago

On Thu, Aug 17, 2017 at 1:08 PM, Micah Lee notifications@github.com wrote:

I think I'm having this same problem. I haven't tried blacklisting iwlmvm yet though. Here's what happens after waking up from suspend:

Blacklisting iwlmvm has solved the problem for me.

ghost commented 7 years ago

I'm also having the same problem but using a debian-9 template for sys-net.

andrewdavidwong commented 7 years ago

Blacklisting iwlmvm also fixes it for me.

DrWhax commented 7 years ago

I still have the same problem even though I blacklisted iwlwifi.

ghost commented 7 years ago

@DrWhax I would double-check this page: https://www.qubes-os.org/doc/wireless-troubleshooting/

My system needed iwldvm blacklisted instead of iwlmvm. It appears to be working well so far.

radicaldrew commented 7 years ago

Thanks, adding the driver iwlmvm to /rw/config/suspend-module-blacklist, as marmarek suggested, worked for me!

tasket commented 7 years ago

@DrWhax I also added iwldvm in addition to iwlwifi, and that worked for me, but not consistently.

DrWhax commented 7 years ago

@tasket this works for me 1 out of 10 times. It probably is easier if I just buy an Atheros card since it's getting really unworkable.

tasket commented 7 years ago

@DrWhax I had better luck with both modules listed, iwldvm first.

andrewdavidwong commented 5 years ago

This issue is being closed because:

If anyone believes that this issue should be reopened, please let us know in a comment here.

linse commented 5 years ago

Hello, I have this issue in Qubes 4.0 on a Thinkpad X1 Carbon with iwlwifi.

juodumas commented 4 years ago

I am having this issue in Qubes 4.0 on a Dell Latitude 7480. sys-net is running Debian, and the iwlwifi module sometimes fails to load after waking from suspend:

[  849.046817] e1000e: ens7 NIC Link is Down
[  849.065192] wls6: deauthenticating from ec:d0:9f:16:39:4c by local choice (Reason: 3=DEAUTH_LEAVING)
[  849.174287] e1000e 0000:00:07.0 ens7: removed PHC
[  849.207646] ehci-pci 0000:00:04.0: remove, state 1
[  849.207670] usb usb1: USB disconnect, device number 1
[  849.207683] usb 1-1: USB disconnect, device number 2
[  849.221168] ehci-pci 0000:00:04.0: USB bus 1 deregistered
[  850.061950] Freezing user space processes ... (elapsed 0.000 seconds) done.
[  850.062913] OOM killer disabled.
[  850.062922] Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done.
[  850.065705] suspending xenstore...
[  853.544322] Xen Platform PCI: I/O protocol version 1
[  850.077520] xen:grant_table: Grant tables using version 1 layout
[  850.231234] OOM killer enabled.
[  850.231249] Restarting tasks ... done.
[  850.299384] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[  850.303243] ehci-pci: EHCI PCI platform driver
[  850.319443] ehci-pci 0000:00:04.0: EHCI Host Controller
[  850.320125] ehci-pci 0000:00:04.0: new USB bus registered, assigned bus number 1
[  850.323647] ehci-pci 0000:00:04.0: irq 35, io mem 0xf2047000
[  850.333057] ehci-pci 0000:00:04.0: USB 2.0 started, EHCI 1.00
[  850.333320] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 4.19
[  850.333343] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[  850.333363] usb usb1: Product: EHCI Host Controller
[  850.333377] usb usb1: Manufacturer: Linux 4.19.146-1.pvops.qubes.x86_64 ehci_hcd
[  850.333397] usb usb1: SerialNumber: 0000:00:04.0
[  850.333799] hub 1-0:1.0: USB hub found
[  850.333875] hub 1-0:1.0: 6 ports detected
[  850.366910] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[  850.367575] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[  850.367610] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[  850.367634] cfg80211: failed to load regulatory.db
[  850.399375] Intel(R) Wireless WiFi driver for Linux
[  850.399392] Copyright(c) 2003- 2015 Intel Corporation
[  850.403054] modprobe: page allocation failure: order:4, mode:0x60c0c0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null)
[  850.403083] modprobe cpuset=/ mems_allowed=0
[  850.403101] CPU: 0 PID: 4282 Comm: modprobe Tainted: G           O      4.19.146-1.pvops.qubes.x86_64 #1
[  850.403140] Hardware name: Xen HVM domU, BIOS 4.8.5-23.fc25 09/22/2020
[  850.403157] Call Trace:
[  850.403172]  dump_stack+0x66/0x8b
[  850.403185]  warn_alloc+0xfc/0x190
[  850.403197]  __alloc_pages_slowpath+0xd87/0xe00
[  850.403211]  ? __switch_to_asm+0x41/0x70
[  850.403222]  ? __switch_to_asm+0x41/0x70
[  850.403233]  ? __switch_to_asm+0x41/0x70
[  850.403245]  __alloc_pages_nodemask+0x2b4/0x2f0
[  850.403271]  ? iwl_trans_alloc+0x2a/0xd0 [iwlwifi]
[  850.403285]  kmalloc_large_node+0x47/0xb0
[  850.403297]  __kmalloc_node_track_caller+0x1ff/0x290
[  850.403312]  devm_kmalloc+0x28/0x70
[  850.403334]  iwl_trans_alloc+0x2a/0xd0 [iwlwifi]
[  850.403355]  iwl_trans_pcie_alloc+0x69/0xd90 [iwlwifi]
[  850.403376]  iwl_pci_probe+0x23/0x200 [iwlwifi]
[  850.403391]  local_pci_probe+0x44/0xa0
[  850.403401]  ? _cond_resched+0x16/0x40
[  850.403412]  pci_device_probe+0x112/0x1c0
[  850.403424]  really_probe+0x244/0x400
[  850.403435]  driver_probe_device+0x10b/0x130
[  850.403449]  __driver_attach+0x119/0x120
[  850.403459]  ? driver_probe_device+0x130/0x130
[  850.403473]  bus_for_each_dev+0x67/0xc0
[  850.403484]  ? klist_add_tail+0x3b/0x70
[  850.403495]  bus_add_driver+0x16a/0x260
[  850.403506]  driver_register+0x5b/0xe0
[  850.403518]  ? 0xffffffffc04f2000
[  850.403534]  iwl_pci_register_driver+0x20/0x40 [iwlwifi]
[  850.403547]  ? 0xffffffffc04f2000
[  850.403558]  do_one_initcall+0x4d/0x1d6
[  850.403570]  ? free_unref_page_commit+0x9f/0x120
[  850.403583]  ? _cond_resched+0x16/0x40
[  850.403594]  ? kmem_cache_alloc_trace+0x169/0x1e0
[  850.403608]  do_init_module+0x5b/0x20e
[  850.403620]  load_module+0x1bb9/0x1fc0
[  850.403632]  ? ima_post_read_file+0xe2/0x120
[  850.403647]  ? __do_sys_finit_module+0xd2/0x100
[  850.403661]  __do_sys_finit_module+0xd2/0x100
[  850.403675]  do_syscall_64+0x5b/0x190
[  850.403687]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  850.403701] RIP: 0033:0x79ace4d47f59
[  850.403712] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 07 6f 0c 00 f7 d8 64 89 01 48
[  850.403758] RSP: 002b:00007ffc88772b18 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  850.403778] RAX: ffffffffffffffda RBX: 0000652099ccff60 RCX: 000079ace4d47f59
[  850.403798] RDX: 0000000000000000 RSI: 0000652098ca73f0 RDI: 0000000000000005
[  850.403817] RBP: 0000652098ca73f0 R08: 0000000000000000 R09: 0000000000000000
[  850.429446] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000000000
[  850.429467] R13: 0000652099ccfef0 R14: 0000000000040000 R15: 0000652099ccff60
[  850.429531] Mem-Info:
[  850.429541] active_anon:5321 inactive_anon:5934 isolated_anon:4
                active_file:6706 inactive_file:4736 isolated_file:0
                unevictable:1966 dirty:25 writeback:0 unstable:0
                slab_reclaimable:4750 slab_unreclaimable:6012
                mapped:6888 shmem:747 pagetables:830 bounce:0
                free:1442 free_pcp:0 free_cma:0
[  850.429631] Node 0 active_anon:21284kB inactive_anon:23736kB active_file:26824kB inactive_file:18944kB unevictable:7864kB isolated(anon):16kB isolated(file):0kB mapped:27552kB dirty:100kB writeback:0kB shmem:2988kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[  850.429712] Node 0 DMA free:1032kB min:144kB low:180kB high:216kB active_anon:2264kB inactive_anon:2380kB active_file:1032kB inactive_file:1756kB unevictable:552kB writepending:0kB present:15992kB managed:15908kB mlocked:552kB kernel_stack:48kB pagetables:436kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  850.429789] lowmem_reserve[]: 0 165 165 165 165
[  850.429811] Node 0 DMA32 free:4736kB min:1576kB low:1968kB high:2360kB active_anon:19020kB inactive_anon:21352kB active_file:25800kB inactive_file:17188kB unevictable:7312kB writepending:100kB present:229372kB managed:167244kB mlocked:7312kB kernel_stack:2496kB pagetables:2884kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  850.429890] lowmem_reserve[]: 0 0 0 0 0
[  850.429910] Node 0 DMA: 68*4kB (UME) 55*8kB (UM) 10*16kB (UME) 5*32kB (UME) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1032kB
[  850.429953] Node 0 DMA32: 175*4kB (UME) 110*8kB (UM) 52*16kB (UEH) 11*32kB (UEH) 1*64kB (H) 1*128kB (H) 1*256kB (H) 1*512kB (H) 1*1024kB (H) 0*2048kB 0*4096kB = 4748kB
[  850.430002] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[  850.430033] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[  850.430083] 13438 total pagecache pages
[  850.430093] 1246 pages in swap cache
[  850.430103] Swap cache stats: add 33113, delete 31867, find 19206/23111
[  850.430118] Free swap  = 984316kB
[  850.430127] Total swap = 1048572kB
[  850.430137] 61341 pages RAM
[  850.430143] 0 pages HighMem/MovableOnly
[  850.430152] 15553 pages reserved
[  850.430161] 0 pages cma reserved
[  850.430170] 0 pages hwpoisoned
[  850.432322] iwlwifi: probe of 0000:00:06.0 failed with error -12
[  850.451757] prepare-suspend: page allocation failure: order:4, mode:0x60c0c0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null)
[  850.451791] prepare-suspend cpuset=/ mems_allowed=0
[  850.451807] CPU: 0 PID: 4260 Comm: prepare-suspend Tainted: G           O      4.19.146-1.pvops.qubes.x86_64 #1
[  850.451834] Hardware name: Xen HVM domU, BIOS 4.8.5-23.fc25 09/22/2020
[  850.451850] Call Trace:
[  850.451863]  dump_stack+0x66/0x8b
[  850.451875]  warn_alloc+0xfc/0x190
[  850.451886]  __alloc_pages_slowpath+0xd87/0xe00
[  850.451901]  ? __switch_to_asm+0x41/0x70
[  850.451912]  ? __switch_to_asm+0x41/0x70
[  850.451923]  ? __switch_to_asm+0x41/0x70
[  850.451935]  __alloc_pages_nodemask+0x2b4/0x2f0
[  850.451955]  ? iwl_trans_alloc+0x2a/0xd0 [iwlwifi]
[  850.451969]  kmalloc_large_node+0x47/0xb0
[  850.451980]  __kmalloc_node_track_caller+0x1ff/0x290
[  850.451995]  devm_kmalloc+0x28/0x70
[  850.452009]  iwl_trans_alloc+0x2a/0xd0 [iwlwifi]
[  850.452027]  iwl_trans_pcie_alloc+0x69/0xd90 [iwlwifi]
[  850.452044]  iwl_pci_probe+0x23/0x200 [iwlwifi]
[  850.452058]  local_pci_probe+0x44/0xa0
[  850.452070]  ? _cond_resched+0x16/0x40
[  850.452081]  pci_device_probe+0x112/0x1c0
[  850.452093]  really_probe+0x244/0x400
[  850.452104]  driver_probe_device+0x10b/0x130
[  850.452118]  bind_store+0x10e/0x160
[  850.452129]  kernfs_fop_write+0x10f/0x190
[  850.629499]  __vfs_write+0x36/0x1a0
[  850.629513]  ? handle_mm_fault+0xfc/0x210
[  850.629524]  ? _cond_resched+0x16/0x40
[  850.629536]  vfs_write+0xb0/0x190
[  850.629547]  ksys_write+0x5a/0xd0
[  850.629559]  do_syscall_64+0x5b/0x190
[  850.629571]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  850.629586] RIP: 0033:0x7a2ccfe84504
[  850.629598] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 48 8d 05 f9 61 0d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53
[  850.629646] RSP: 002b:00007ffe803a2588 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  850.629667] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007a2ccfe84504
[  850.629688] RDX: 000000000000000d RSI: 00005a96f9609bd0 RDI: 0000000000000001
[  850.629708] RBP: 00005a96f9609bd0 R08: 000000000000000a R09: 00007a2ccff55cb0
[  850.629729] R10: 000000000000000a R11: 0000000000000246 R12: 00007a2ccff56760
[  850.629749] R13: 000000000000000d R14: 00007a2ccff51760 R15: 000000000000000d
[  850.631732] iwlwifi: probe of 0000:00:06.0 failed with error -12

This doesn't seem to happen when wifi is disconnected before suspend. I tried running /usr/lib/qubes/prepare-suspend suspend; sleep 1; /usr/lib/qubes/prepare-suspend restore, but it doesn't help. Power cycling sys-net resolves the issue.
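Two distinct failure signatures appear in this thread: the firmware load timeout (-110, which is ETIMEDOUT on Linux) and the probe failure after a page allocation failure (-12, ENOMEM). A rough sketch for checking a kernel log for either one is below; the function name `iwlwifi_resume_failed` is hypothetical, and the patterns only cover the messages quoted here, not every way the driver can fail.

```shell
#!/bin/sh
# Hypothetical check: succeed if kernel log text on stdin contains one of
# the iwlwifi failure messages quoted in this thread.
iwlwifi_resume_failed() {
    grep -qE 'iwlwifi.*(Failed to load firmware chunk|probe of .* failed with error)'
}
```

Inside sys-net it could be used as `sudo dmesg | iwlwifi_resume_failed && echo "iwlwifi failed after resume"`.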

I am running a script from dom0 to automatically restart sys-net if my wireless interface wls6 doesn't appear after system wake up:

$ cat /etc/systemd/system/suspend.target.wants/maybe-restart-sys-net.service 
[Unit]
After=suspend.target

[Service]
Type=simple
ExecStart=/root/maybe-restart-sys-net.sh

[Install]
WantedBy=suspend.target

$ cat /root/maybe-restart-sys-net.sh                                        
#!/bin/sh -e

if ! qvm-check -q --running sys-net; then
  logger -t wireless-check 'Skipping: sys-net is not running'
  exit 0
fi

if qvm-run -p sys-net 'logger -t wireless-check "Looking for wireless interface..."; for i in 0 1 2; do if test -d /sys/class/net/wls6; then logger -t wireless-check "Wireless interface detected in $i s"; exit 0; fi; sleep 1; done; logger -t wireless-check "No wireless interface detected in $i s"; exit 1'; then
  logger -t wireless-check "sys-net wireless ok"
else
  logger -t wireless-check "sys-net wireless not found, restarting"
  notify-send -c sys-net -t 5000 sys-net "Restarting broken sys-net after suspend..."
  qvm-kill -q sys-net
  qvm-start -q sys-net
fi

qtpies commented 2 years ago

For me this was fixed by adding both iwlwifi and iwlmvm to /rw/config/suspend-module-blacklist in sys-net. Find drivers with lsmod.
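As a small illustration of "find drivers with lsmod", the sketch below filters lsmod-style output down to module names starting with "iwl". The function name `list_iwl_modules` is hypothetical, and the "iwl" prefix is an assumption based on the modules named in this thread (iwlwifi, iwlmvm, iwldvm).

```shell
#!/bin/sh
# Hypothetical helper: print names of loaded modules beginning with "iwl".
# Reads lsmod-style text on stdin, so it also works on saved output.
list_iwl_modules() {
    # Skip the header line; the module name is the first column.
    awk 'NR > 1 && $1 ~ /^iwl/ { print $1 }'
}
```

In sys-net this would be run as `lsmod | list_iwl_modules`.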

For me this issue can be closed as there is a workaround and it is not a Qubes problem.

andrewdavidwong commented 2 years ago

@marmarek, do you agree with closing this?

DemiMarie commented 2 years ago

@andrewdavidwong I don’t think this should be closed. If the drivers need to be blocklisted then that should be part of the default configuration.

marmarek commented 2 years ago

For me this was fixed by adding both iwlwifi and iwlmvm to /rw/config/suspend-module-blacklist in sys-net. Find drivers with lsmod.

iwlmvm is already on the default blacklist. And it's unloaded via modprobe -r, which should unload iwlwifi automatically too. @qtpies can you confirm you really needed to add those modules to /rw/config/suspend-module-blacklist, even though they are in /etc/qubes-suspend-module-blacklist already?
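To answer a question like the one above, it helps to see which of the two files actually lists a given module. A minimal sketch follows; the function name `blacklists_listing` is hypothetical, and it assumes each file holds one bare module name per line.

```shell
#!/bin/sh
# Hypothetical helper: print each given blacklist file that lists the module.
# Assumption: one module name per line, with no trailing comments.
blacklists_listing() {
    module=$1
    shift
    for file in "$@"; do
        # -x matches the whole line, -F treats the name as a fixed string.
        if [ -f "$file" ] && grep -qxF "$module" "$file"; then
            printf '%s\n' "$file"
        fi
    done
}
```

For example: `blacklists_listing iwlmvm /etc/qubes-suspend-module-blacklist /rw/config/suspend-module-blacklist`.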

qtpies commented 2 years ago

@marmarek good that you asked. Yesterday after adding both iwlwifi and iwlmvm to /rw/config/suspend-module-blacklist in sys-net, the wifi device really stayed available over multiple suspends. I was quite happy because before, I always needed to do a user@sys-net:~$ systemctl restart iwd after suspend.

Today somehow the wifi device again disappears over suspend. I tested over multiple suspends with and without iwlwifi and iwlmvm in /rw/config/suspend-module-blacklist, so it actually makes no difference.

Then, out of interest, I tried removing iwlwifi and iwlmvm from /etc/qubes-suspend-module-blacklist as well, to see if that makes a difference. Now the wifi device stays available after suspend (tested 3 times) without being mentioned in either of the blacklist files. Putting iwlwifi and iwlmvm back in /etc/qubes-suspend-module-blacklist makes the device disappear over suspend again.

So actually, in my case, the device being in a blacklist is causing the problem? And maybe the device being mentioned in two blacklists undoes the blacklisting (like negative+negative=positive)? Or something totally different that I am not aware of.

marmarek commented 2 years ago

user@sys-net:~$ systemctl restart iwd

iwd? does it conflict with NetworkManager? the default suspend script does try to stop/start it, unless you disable it via qvm-service

DemiMarie commented 2 years ago

user@sys-net:~$ systemctl restart iwd

iwd? does it conflict with NetworkManager? the default suspend script does try to stop/start it, unless you disable it via qvm-service

iwd is a replacement for wpa_supplicant, and NetworkManager can call into it.

qtpies commented 2 years ago

user@sys-net:~$ systemctl restart iwd

iwd? does it conflict with NetworkManager? the default suspend script does try to stop/start it, unless you disable it via qvm-service

iwd is a replacement for wpa_supplicant, and NetworkManager can call into it.

systemctl restart iwd works: wifi reappears in the NetworkManager applet and in $ iwctl device list. Sometimes the applet has also crashed on resume; then I run pkill nm-applet; nm-applet.

andrewdavidwong commented 1 year ago

Is this still a problem in 4.1?