Open andrewdavidwong opened 5 years ago
Posted bounty ($100): https://www.bountysource.com/issues/67959138-sys-net-dies-on-resume-from-suspend
@andrewdavidwong @brycepg I feel this must be hardware specific, since I don't see it on any of the machines that I still have access to running 3.2.1. It might be helpful if you could post details of the hardware you are using, and specifically the NIC. I'm assuming that you see this using both Fedora and Debian templates? If you haven't checked, please do so.
@unman
Laptop model: Lenovo Thinkpad T530
Wireless adapter: Intel Centrino Advanced-N 6205 [Taylor Peak] (rev 34)
I've only tested this issue on my fedora-28 template. Will try with debian-9.
CPU: i5-3210m (no IOMMU)
EDIT: Strangely I cannot repro right now with a quick suspend. Will try out with longer suspend times.
I should have clarified that this only happens occasionally, so it would be difficult to reproduce without suspending/resuming many times. I'll update the issue to reflect this.
Lenovo Thinkpad T450s:
Xen: 4.6.6
Kernel: 4.14.74-1
RAM: 20194 Mb
CPU:
Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
Chipset:
Intel Corporation Broadwell-U Host Bridge -OPI [8086:1604] (rev 09)
VGA:
Intel Corporation HD Graphics 5500 [8086:1616] (rev 09) (prog-if 00 [VGA controller])
Net:
Intel Corporation Ethernet Connection (3) I218-LM (rev 03)
Intel Corporation Wireless 7265 (rev 59)
same issue, thinkpad x220, qubes 4.0
Maybe a script that shuts down and restarts all net VMs could be a temporary fix? (It's pretty long and painful to do it manually.)
This happened to me recently after re-doing some templates and VMs. What I discovered was that my wifi module suspend settings were gone.... I forgot to re-add them.
I agree with unman this is a hardware-specific issue. OTOH it would be nice if Qubes had some way of automatically populating this module information.
Here is what I use in /rw/config/suspend-module-blacklist for an Intel "Ultimate-N" card:
iwldvm
iwlwifi
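For anyone who keeps losing these entries when rebuilding qubes, here is a minimal Python sketch that re-adds module names to the blacklist file without duplicating entries that are already there. The helper name `add_suspend_blacklist_modules` is made up for illustration. The sketch assumes the file is a plain list with one module name per line, as in the example above.

```python
from pathlib import Path
from typing import Iterable, List


def add_suspend_blacklist_modules(path: Path, modules: Iterable[str]) -> List[str]:
    """Append module names missing from the file; return the names that were added.

    Hypothetical helper: assumes one module name per line, as shown above.
    """
    text = path.read_text() if path.exists() else ""
    listed = {line.strip() for line in text.splitlines() if line.strip()}

    added: List[str] = []
    for module in modules:
        if module not in listed:
            added.append(module)
            listed.add(module)

    if added:
        # Make sure the existing content ends with a newline before appending.
        separator = "" if text == "" or text.endswith("\n") else "\n"
        path.write_text(text + separator + "\n".join(added) + "\n")
    return added
```

Inside the net qube this would be called as `add_suspend_blacklist_modules(Path("/rw/config/suspend-module-blacklist"), ["iwldvm", "iwlwifi"])`, run with enough privileges to write the file. The module names are the ones from this comment; other cards will need different ones.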
In my case, sys-net will always die on resume from suspend if another VM is in a transient state (i.e. Qube Manager displaying the VM with a yellow dot). This is in particular true for the following two cases which others may be able to reproduce as well:
The result in my case is the following output in sys-net:
[ 372.065513] ath10k_pci 0000:00:06.0: failed to wake target for read32 at 0x0003a028: -110
[ 372.657769] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
[ 372.657817] clocksource: 'xen' wd_now: 27c2a46c622 wd_last: 27c0c3c4a2a mask: ffffffffffffffff
[ 372.657856] clocksource: 'tsc' cs_now: ffffff9953059778 cs_last: ffffffecec397004 mask: ffffffffffffffff
[ 372.657902] tsc: Marking TSC unstable due to clocksource watchdog
Please also note that this may occasionally occur even for VMs that are not connected to a NetVM.
In the case of sys-usb the machine may even entirely fail to resume, and instead reboots.
I hope that this is relevant for this issue. If not please indicate how I can proceed. For the record, essentially the machine (DELL XPS 13 9360) is running R4.0.
@aslfv your case looks like #3489
I'm experiencing this issue with a Librem 13 laptop and my easiest solution is to kill the sys-net qube and then start it up again. If I do a simple restart I am forced to restart all dependent qubes, which is really inconvenient. So I opt for the kill method, which has worked for me every time. Is there anything I can provide to help with debugging this?
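The kill-then-start workaround can be scripted. Below is a minimal Python sketch for dom0, assuming the standard `qvm-kill` and `qvm-start` tools are on the PATH. The function names are made up for illustration, and whether dependent qubes regain network access afterwards may vary by setup.

```python
import subprocess
from typing import Callable, List


def kill_and_start_commands(qube: str = "sys-net") -> List[List[str]]:
    """Return the dom0 commands for the workaround: kill the qube, then start it."""
    return [["qvm-kill", qube], ["qvm-start", qube]]


def kill_and_start(qube: str = "sys-net",
                   run: Callable[..., object] = subprocess.run) -> None:
    """Run the commands in order; with subprocess.run, a failing command raises."""
    for command in kill_and_start_commands(qube):
        run(command, check=True)
```

Calling `kill_and_start("sys-net")` in dom0 performs the two steps. The `run` parameter exists only so the command sequence can be checked without touching any real qube.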
Several peers and I all have Librem 15 and Librem 13 devices, and sys-net often dies on resume exactly as @quantumpacket describes.
This is a serious PITA we would love help with.
As per the discussion on qubes-users ("[qubes-users] sys-net keeps dying") here is the dmesg from a netvm (formerly fedora-29, upgraded to fedora-30) as it begins to exhibit the described behaviour:
[ 9266.512872] IPv6: ADDRCONF(NETDEV_UP): ens7: link is not ready
[ 9266.770001] IPv6: ADDRCONF(NETDEV_UP): ens7: link is not ready
[ 9266.792354] IPv6: ADDRCONF(NETDEV_UP): wls6: link is not ready
[ 9268.821814] iwlwifi 0000:00:06.0: Error sending REPLY_SCAN_ABORT_CMD: time out after 2000ms.
[ 9268.821849] iwlwifi 0000:00:06.0: Current CMD queue read_ptr 29 write_ptr 30
[ 9268.821922] iwlwifi 0000:00:06.0: Loaded firmware version: 18.168.6.1
[ 9268.822451] iwlwifi 0000:00:06.0: 0x00000000 | OK
[ 9268.822477] iwlwifi 0000:00:06.0: 0x00000000 | uPc
[ 9268.822494] iwlwifi 0000:00:06.0: 0x00000000 | branchlink1
[ 9268.822510] iwlwifi 0000:00:06.0: 0x00000000 | branchlink2
[ 9268.822568] iwlwifi 0000:00:06.0: 0x00000000 | interruptlink1
[ 9268.822590] iwlwifi 0000:00:06.0: 0x00000000 | interruptlink2
[ 9268.822611] iwlwifi 0000:00:06.0: 0x00000000 | data1
[ 9268.822629] iwlwifi 0000:00:06.0: 0x00000000 | data2
[ 9268.822655] iwlwifi 0000:00:06.0: 0x00000000 | line
[ 9268.822672] iwlwifi 0000:00:06.0: 0x00000000 | beacon time
[ 9268.822690] iwlwifi 0000:00:06.0: 0x00000000 | tsf low
[ 9268.822716] iwlwifi 0000:00:06.0: 0x00000000 | tsf hi
[ 9268.822734] iwlwifi 0000:00:06.0: 0x00000000 | time gp1
[ 9268.822760] iwlwifi 0000:00:06.0: 0x00000000 | time gp2
[ 9268.822777] iwlwifi 0000:00:06.0: 0x00000000 | time gp3
[ 9268.822795] iwlwifi 0000:00:06.0: 0x00000000 | uCode version
[ 9268.822825] iwlwifi 0000:00:06.0: 0x00000000 | hw version
[ 9268.822852] iwlwifi 0000:00:06.0: 0x00000000 | board version
[ 9268.822873] iwlwifi 0000:00:06.0: 0x00000000 | hcmd
[ 9268.822899] iwlwifi 0000:00:06.0: 0x00000000 | isr0
[ 9268.822916] iwlwifi 0000:00:06.0: 0x00000000 | isr1
[ 9268.822942] iwlwifi 0000:00:06.0: 0x00000000 | isr2
[ 9268.822960] iwlwifi 0000:00:06.0: 0x00000000 | isr3
[ 9268.822986] iwlwifi 0000:00:06.0: 0x00000000 | isr4
[ 9268.823004] iwlwifi 0000:00:06.0: 0x00000000 | isr_pref
[ 9268.823030] iwlwifi 0000:00:06.0: 0x00000000 | wait_event
[ 9268.823048] iwlwifi 0000:00:06.0: 0x00000000 | l2p_control
[ 9268.823075] iwlwifi 0000:00:06.0: 0x00000000 | l2p_duration
[ 9268.823093] iwlwifi 0000:00:06.0: 0x00000000 | l2p_mhvalid
[ 9268.823119] iwlwifi 0000:00:06.0: 0x00000000 | l2p_addr_match
[ 9268.823149] iwlwifi 0000:00:06.0: 0x00000000 | lmpm_pmg_sel
[ 9268.823167] iwlwifi 0000:00:06.0: 0x00000000 | timestamp
[ 9268.823184] iwlwifi 0000:00:06.0: 0x00000000 | flow_handler
[ 9268.823413] iwlwifi 0000:00:06.0: Start IWL Event Log Dump: nothing in log
[ 9268.823453] iwlwifi 0000:00:06.0: Command REPLY_RXON failed: FW Error
[ 9268.823485] iwlwifi 0000:00:06.0: Error clearing ASSOC_MSK on BSS (-5)
[ 9268.835622] ieee80211 phy0: Hardware restart was requested
[ 9268.849978] iwlwifi 0000:00:06.0: Radio type=0x1-0x2-0x0
[ 9269.152998] iwlwifi 0000:00:06.0: Radio type=0x1-0x2-0x0
[ 9269.240518] IPv6: ADDRCONF(NETDEV_UP): wls6: link is not ready
[ 9269.256202] iwlwifi 0000:00:06.0: Radio type=0x1-0x2-0x0
[ 9270.288593] audit: type=1131 audit(1575954363.530:135): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-rfkill comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 9274.645527] iwlwifi 0000:00:06.0: Failed to load firmware chunk!
[ 9274.645558] iwlwifi 0000:00:06.0: iwlwifi transaction failed, dumping registers
[ 9274.645584] iwlwifi 0000:00:06.0: iwlwifi device config registers:
[ 9274.669420] iwlwifi 0000:00:06.0: 00000000: 00858086 00100406 02800034 00000000 f2044004 00000000 00000000 00000000
[ 9274.669467] iwlwifi 0000:00:06.0: 00000020: 00000000 00000000 00000000 13118086 00000000 000000c8 00000000 0000010b
[ 9274.669505] iwlwifi 0000:00:06.0: iwlwifi device memory mapped registers:
[ 9274.669608] iwlwifi 0000:00:06.0: 00000000: 00488700 00000040 08000000 00000000 00000001 00000000 00000030 00000000
[ 9274.669652] iwlwifi 0000:00:06.0: 00000020: 00000001 080403c5 000000b0 00000000 90000001 00030001 80008040 00080044
[ 9274.669694] iwlwifi 0000:00:06.0: Could not load the [0] uCode section
[ 9274.684532] iwlwifi 0000:00:06.0: Failed to run INIT ucode: -110
[ 9274.684563] iwlwifi 0000:00:06.0: Fw not loaded - dropping CMD: 81
[ 9274.684628] iwlwifi 0000:00:06.0: Unable to initialize device.
[ 9274.684650] ------------[ cut here ]------------
[ 9274.684667] Hardware became unavailable during restart.
[ 9274.684732] WARNING: CPU: 1 PID: 1761 at /home/user/rpmbuild/BUILD/kernel-4.19.84/linux-4.19.84/net/mac80211/util.c:1936 ieee80211_reconfig+0x236/0x1140 [mac80211]
[ 9274.684774] Modules linked in: iwldvm iwlwifi mac80211 cfg80211 ehci_pci ehci_hcd xt_nat ccm fuse nft_reject_ipv4 nft_reject nft_ct nf_tables nfnetlink ip6table_raw iptable_raw xen_netback xt_REDIRECT ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c joydev arc4 intel_rapl crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel intel_rapl_perf ttm pcspkr serio_raw e1000e drm_kms_helper ata_generic pata_acpi rfkill drm i2c_piix4 floppy u2mfn(O) xen_gntdev xen_gntalloc xen_blkback xenfs xen_evtchn xen_privcmd overlay xen_blkfront [last unloaded: cfg80211]
[ 9274.684980] CPU: 1 PID: 1761 Comm: kworker/1:1 Tainted: G O 4.19.84-1.pvops.qubes.x86_64 #1
[ 9274.685008] Hardware name: Xen HVM domU, BIOS 4.8.5-12.fc25 11/13/2019
[ 9274.685042] Workqueue: events_freezable ieee80211_restart_work [mac80211]
[ 9274.685081] RIP: 0010:ieee80211_reconfig+0x236/0x1140 [mac80211]
[ 9274.685104] Code: 44 24 07 00 c6 83 a4 04 00 00 00 48 89 df e8 41 af fc ff 85 c0 41 89 c5 0f 84 6e 01 00 00 48 c7 c7 40 9a 71 c0 e8 ea f1 9d f3 <0f> 0b e9 46 fe ff ff 48 89 ef e8 fb f7 01 00 e9 12 ff ff ff c6 83
[ 9274.685157] RSP: 0018:ffffb536410dfe08 EFLAGS: 00010282
[ 9274.685175] RAX: 0000000000000000 RBX: ffff8ab884358760 RCX: 0000000000000006
[ 9274.685199] RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff8ab896f168b0
[ 9274.685223] RBP: ffff8ab884358f98 R08: ffffb53640000000 R09: 00000000000002c1
[ 9274.685246] R10: ffff8ab8928f8900 R11: ffffffffb59efe4d R12: ffff8ab8843593d0
[ 9274.687477] R13: 00000000ffffff92 R14: ffff8ab895aded80 R15: ffff8ab8843593d8
[ 9274.687503] FS: 0000000000000000(0000) GS:ffff8ab896f00000(0000) knlGS:0000000000000000
[ 9274.687527] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9274.687548] CR2: 00005be8941db000 CR3: 000000000b20a003 CR4: 00000000001606e0
[ 9274.687574] Call Trace:
[ 9274.687614] ieee80211_restart_work+0xbb/0xe0 [mac80211]
[ 9274.687637] process_one_work+0x191/0x370
[ 9274.687715] worker_thread+0x4f/0x3b0
[ 9274.687730] kthread+0xf8/0x130
[ 9274.687745] ? rescuer_thread+0x340/0x340
[ 9274.687758] ? kthread_create_worker_on_cpu+0x70/0x70
[ 9274.687777] ret_from_fork+0x35/0x40
[ 9274.687793] ---[ end trace 3adece76f5f16d5c ]---
[ 9274.689538] ------------[ cut here ]------------
[ 9274.689560] wls6: Failed check-sdata-in-driver check, flags: 0x0
[ 9274.689625] WARNING: CPU: 1 PID: 1761 at /home/user/rpmbuild/BUILD/kernel-4.19.84/linux-4.19.84/net/mac80211/driver-ops.h:19 drv_remove_interface+0xf3/0x100 [mac80211]
[ 9274.689667] Modules linked in: iwldvm iwlwifi mac80211 cfg80211 ehci_pci ehci_hcd xt_nat ccm fuse nft_reject_ipv4 nft_reject nft_ct nf_tables nfnetlink ip6table_raw iptable_raw xen_netback xt_REDIRECT ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c joydev arc4 intel_rapl crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel intel_rapl_perf ttm pcspkr serio_raw e1000e drm_kms_helper ata_generic pata_acpi rfkill drm i2c_piix4 floppy u2mfn(O) xen_gntdev xen_gntalloc xen_blkback xenfs xen_evtchn xen_privcmd overlay xen_blkfront [last unloaded: cfg80211]
[ 9274.689868] CPU: 1 PID: 1761 Comm: kworker/1:1 Tainted: G W O 4.19.84-1.pvops.qubes.x86_64 #1
[ 9274.689896] Hardware name: Xen HVM domU, BIOS 4.8.5-12.fc25 11/13/2019
[ 9274.689931] Workqueue: events_freezable ieee80211_restart_work [mac80211]
[ 9274.689966] RIP: 0010:drv_remove_interface+0xf3/0x100 [mac80211]
[ 9274.692140] Code: 85 c0 75 e8 5b 5d 41 5c c3 48 8b b5 08 04 00 00 48 81 c5 28 04 00 00 48 c7 c7 20 77 71 c0 48 85 f6 48 0f 44 f5 e8 4d 3d a1 f3 <0f> 0b 5b 5d 41 5c c3 66 0f 1f 44 00 00 0f 1f 44 00 00 41 57 41 56
[ 9274.692198] RSP: 0000:ffffb536410dfc98 EFLAGS: 00010282
[ 9274.692217] RAX: 0000000000000000 RBX: ffff8ab893ba48c0 RCX: 0000000000000006
[ 9274.692242] RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff8ab896f168b0
[ 9274.692266] RBP: ffff8ab893ba4ce8 R08: ffffb53640000000 R09: 00000000000002dc
[ 9274.692290] R10: ffffb53640363d60 R11: ffffffffb59efe4d R12: ffff8ab884358760
[ 9274.692315] R13: ffff8ab884358760 R14: ffff8ab884358ef0 R15: ffff8ab893ba53a0
[ 9274.692340] FS: 0000000000000000(0000) GS:ffff8ab896f00000(0000) knlGS:0000000000000000
[ 9274.692364] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9274.692384] CR2: 00007f1e53349f90 CR3: 000000000b20a003 CR4: 00000000001606e0
[ 9274.692409] Call Trace:
[ 9274.692447] ieee80211_do_stop+0x4f9/0x860 [mac80211]
[ 9274.692482] ieee80211_stop+0x16/0x20 [mac80211]
[ 9274.692503] __dev_close_many+0xa1/0x110
[ 9274.692517] dev_close_many+0x9f/0x160
[ 9274.692531] dev_close.part.99+0x64/0xa0
[ 9274.692563] cfg80211_shutdown_all_interfaces+0x43/0xd0 [cfg80211]
[ 9274.692601] ieee80211_reconfig+0x8b/0x1140 [mac80211]
[ 9274.692631] ieee80211_restart_work+0xbb/0xe0 [mac80211]
[ 9274.692653] process_one_work+0x191/0x370
[ 9274.692670] worker_thread+0x4f/0x3b0
[ 9274.692686] kthread+0xf8/0x130
[ 9274.692702] ? rescuer_thread+0x340/0x340
[ 9274.692717] ? kthread_create_worker_on_cpu+0x70/0x70
[ 9274.692743] ret_from_fork+0x35/0x40
[ 9274.692759] ---[ end trace 3adece76f5f16d5d ]---
[ 9274.693263] ------------[ cut here ]------------
[ 9274.693314] WARNING: CPU: 1 PID: 1761 at /home/user/rpmbuild/BUILD/kernel-4.19.84/linux-4.19.84/net/mac80211/driver-ops.c:39 drv_stop+0xff/0x110 [mac80211]
[ 9274.693355] Modules linked in: iwldvm iwlwifi mac80211 cfg80211 ehci_pci ehci_hcd xt_nat ccm fuse nft_reject_ipv4 nft_reject nft_ct nf_tables nfnetlink ip6table_raw iptable_raw xen_netback xt_REDIRECT ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c joydev arc4 intel_rapl crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel intel_rapl_perf ttm pcspkr serio_raw e1000e drm_kms_helper ata_generic pata_acpi rfkill drm i2c_piix4 floppy u2mfn(O) xen_gntdev xen_gntalloc xen_blkback xenfs xen_evtchn xen_privcmd overlay xen_blkfront [last unloaded: cfg80211]
[ 9274.698012] CPU: 1 PID: 1761 Comm: kworker/1:1 Tainted: G W O 4.19.84-1.pvops.qubes.x86_64 #1
[ 9274.698044] Hardware name: Xen HVM domU, BIOS 4.8.5-12.fc25 11/13/2019
[ 9274.698088] Workqueue: events_freezable ieee80211_restart_work [mac80211]
[ 9274.698126] RIP: 0010:drv_stop+0xff/0x110 [mac80211]
[ 9274.698147] Code: 48 8b 7d 08 48 83 c5 18 48 89 de e8 5b 16 56 f4 48 8b 45 00 48 85 c0 75 e7 e9 46 ff ff ff 48 c7 c7 a0 76 71 c0 e8 af 2b a7 f3 <0f> 0b 5b 5d c3 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
[ 9274.698203] RSP: 0000:ffffb536410dfca0 EFLAGS: 00010286
[ 9274.698221] RAX: 0000000000000024 RBX: ffff8ab884358760 RCX: 0000000000000000
[ 9274.698245] RDX: 0000000000000000 RSI: ffff8ab896f168b8 RDI: ffff8ab896f168b8
[ 9274.698269] RBP: ffff8ab884358ff8 R08: ffffb53640000000 R09: 00000000000002fd
[ 9274.698293] R10: ffffb536410dfca8 R11: ffffffffb59efe4d R12: ffff8ab884358b90
[ 9274.698317] R13: ffff8ab884358760 R14: ffff8ab884358ef0 R15: ffff8ab893ba53a0
[ 9274.698342] FS: 0000000000000000(0000) GS:ffff8ab896f00000(0000) knlGS:0000000000000000
[ 9274.698367] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9274.698389] CR2: 00007f1e51571548 CR3: 000000000b20a003 CR4: 00000000001606e0
[ 9274.698415] Call Trace:
[ 9274.698451] ieee80211_do_stop+0x4e2/0x860 [mac80211]
[ 9274.698486] ieee80211_stop+0x16/0x20 [mac80211]
[ 9274.698509] __dev_close_many+0xa1/0x110
[ 9274.698526] dev_close_many+0x9f/0x160
[ 9274.698542] dev_close.part.99+0x64/0xa0
[ 9274.700409] cfg80211_shutdown_all_interfaces+0x43/0xd0 [cfg80211]
[ 9274.700458] ieee80211_reconfig+0x8b/0x1140 [mac80211]
[ 9274.700491] ieee80211_restart_work+0xbb/0xe0 [mac80211]
[ 9274.700515] process_one_work+0x191/0x370
[ 9274.700530] worker_thread+0x4f/0x3b0
[ 9274.700544] kthread+0xf8/0x130
[ 9274.700560] ? rescuer_thread+0x340/0x340
[ 9274.700574] ? kthread_create_worker_on_cpu+0x70/0x70
[ 9274.700593] ret_from_fork+0x35/0x40
[ 9274.700609] ---[ end trace 3adece76f5f16d5e ]---
[ 9274.759827] IPv6: ADDRCONF(NETDEV_UP): wls6: link is not ready
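When comparing logs like the one above across machines, it can help to pull out only the driver lines that report a problem. The following Python sketch does that for saved dmesg text. The function name and the keyword list are assumptions chosen for this log, not anything defined by the kernel or by Qubes.

```python
import re
from typing import List, Tuple

# "[ 9274.684532] message" -> timestamp and message
_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")
# Keywords are a guess based on the messages seen in this thread.
_TROUBLE = re.compile(r"fail|error|unable|could not|time out", re.IGNORECASE)


def driver_errors(dmesg_text: str, driver: str = "iwlwifi") -> List[Tuple[float, str]]:
    """Return (timestamp, message) for lines from `driver` that look like errors."""
    hits: List[Tuple[float, str]] = []
    for raw in dmesg_text.splitlines():
        match = _LINE.match(raw.strip())
        if not match:
            continue  # stack-trace continuation or other unparsed line
        timestamp, message = float(match.group(1)), match.group(2)
        if message.startswith(driver) and _TROUBLE.search(message):
            hits.append((timestamp, message))
    return hits
```

Passing `driver="ath10k_pci"` would apply the same filter to logs from the Atheros card mentioned earlier in the thread.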
Creating a new sys-net does not appear to have fixed the issue; crashes still occur.
I also face this issue sometimes, but also with sys-usb. I also have a T450. I will post the output of
sudo journalctl -u qubes-suspend
the next time it occurs.
@andrewdavidwong Could you claim the bounty I made please? Apparently BountySource will take the bounty after 2 years which is 2 months away, and I want it to go to a Qubes member.
I've pretty much never had this issue after getting a new (used) T530 which supports HVM so I could install Qubes 4.0
Thanks for letting us know, @brycepg. I think the bounty should be turned into a donation to the Qubes OS Project, if possible. @mfc, @MiCh, do you know how to do that?
I also have not experienced this bug in a very long time, even on the same hardware as when I filed this report. It looks like the last report was from @w1k1n9cc on May 24. @w1k1n9cc, are you still experiencing this?
My Qubes PC is not very active at the moment. I will try it by Tuesday. Maybe I will have some spare time at the weekend.
I'm running the same hardware as when I first described this bug, and I have not had it happen for a very long time. In my situation the issue seems to be resolved. :+1:
Ok, I'm going to close this as resolved. If anyone is still affected by this issue, please leave a comment, and we'll be happy to reopen this. Thank you.
I still face the issue if I restart i3.
@w1k1n9cc, could you provide a bit more detail? You don't experience sys-net dying just from resuming from suspend, but you do if you also restart i3?
Ah, sorry, this is another issue with i3.
But I still face the behavior mentioned in this issue. Sometimes when my laptop wakes up from suspend, sys-net or sys-usb or both are stuck. The only way to rescue my system is to kill the specific VM. After that my system runs normally.
On Thu, Oct 01, 2020 at 09:56:05AM -0700, Andrew David Wong wrote:
>> I've pretty much never had this issue after getting a new (used) T530 which supports HVM so I could install Qubes 4.0
> I also have not experienced this bug in a very long time. It looks like the last report was from @w1k1n9cc on May 24. @w1k1n9cc, are you still experiencing this?
I am not the OP, but I am still experiencing this.
Not the OP either, Qubes n00b with a ThinkPad (i7, not sure the model). R4.0. I found myself here searching how to fix it.
Spent a day or two trying to find a workaround or any more information. sys-net doesn't die (and I can't even restart it because other VMs use it), but the wifi never comes back after sleep. Have to reboot. Disabling and enabling the wifi with the function key doesn't help either. Thinkpad i7 model 20cjs0hc00.
@newts You can use qvm-shutdown --force to skip the “Are there connected VMs?” check. This should probably be exposed via Qubes Manager.
Try disabling the /usr/lib/systemd/system/systemd-udevd.service watchdog in your sys-net template and please report back if it helps.
I'm back to experiencing this issue in a fresh install of the latest Qubes OS, which has made me just disable suspend entirely.
Duplicate of #4042?
This is still happening on 4.1
This appears to be a duplicate of an existing issue. If so, please comment on the appropriate existing issue instead. If anyone believes this is not really a duplicate, please leave a comment briefly explaining why. We'll be happy to take another look and, if appropriate, reopen this issue. Thank you.
@andrewdavidwong I think the reason everyone came here is that this issue was still open, and the other was closed. Now they are both closed. But the issue is clearly not fixed.
@DemiMarie, you're the one who asked whether this is a duplicate of #4042. Do you no longer have reason to think it is? If so, what are those reasons?
And please don't say "because the other one is closed." That would be a reason to reopen the other issue, not this one!
Whoops!
@andrewdavidwong #4042 affects sys-usb while this one affects sys-net.
I tried disabling this service via systemctl stop systemd-udevd && systemctl disable systemd-udevd and it didn't fix the issue. Disabling the service via Qubes VM settings (adding an entry for systemd-udevd and unchecking the box) also did not help.
Disabling systemd-udevd will break a lot of stuff.
On 5/11/22 20:07, Demi Marie Obenour wrote:
> Disabling systemd-udevd will break a lot of stuff.
Exactly.
I was talking about the watchdog back then, not the entire service. I.e. remove or comment out the WatchdogSec line in that file. I was suspecting that a udevd restart caused by the watchdog (which is triggered by clock issues caused by the resume from suspend) would cause havoc.
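As a sketch of that edit, the Python function below comments out any `WatchdogSec=` line in the text of a unit file and leaves everything else untouched. The function name is made up for illustration. Note that a file under /usr/lib/systemd/system in the template may be overwritten by package updates, so the change may need to be reapplied.

```python
import re

# Matches an active (not already commented) WatchdogSec= setting.
_WATCHDOG = re.compile(r"^\s*WatchdogSec\s*=")


def comment_out_watchdog(unit_text: str) -> str:
    """Return the unit file text with every active WatchdogSec= line commented out."""
    lines = [
        "#" + line if _WATCHDOG.match(line) else line
        for line in unit_text.splitlines()
    ]
    trailing_newline = "\n" if unit_text.endswith("\n") else ""
    return "\n".join(lines) + trailing_newline
```

If the unit file has no `WatchdogSec` line at all, the text comes back unchanged, which means the watchdog is not what is restarting udevd on that system.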
On Wed, May 11, 2022 at 11:28:33AM -0700, 3hhh wrote:
> I was talking about the watchdog back then, not the entire service. I.e. remove or comment out the WatchdogSec line in that file. I was suspecting that a udevd restart caused by the watchdog (which is triggered by clock issues caused by the resume from suspend) would cause havoc.
There is no WatchdogSec line. Here is my systemd-udevd.service file:
[Unit]
Description=Rule-based Manager for Device Events and Files
Documentation=man:systemd-udevd.service(8) man:udev(7)
DefaultDependencies=no
After=systemd-sysusers.service systemd-hwdb-update.service
Before=sysinit.target
ConditionPathIsReadWrite=/sys
[Service]
DeviceAllow=block-* rwm
DeviceAllow=char-* rwm
Type=notify
# Note that udev will reset the value internally for its workers
OOMScoreAdjust=-1000
Sockets=systemd-udevd-control.socket systemd-udevd-kernel.socket
Restart=always
RestartSec=0
ExecStart=/usr/lib/systemd/systemd-udevd
ExecReload=udevadm control --reload --timeout 0
KillMode=mixed
TasksMax=infinity
PrivateMounts=yes
ProtectClock=yes
ProtectHostname=yes
MemoryDenyWriteExecute=yes
RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6
RestrictRealtime=yes
RestrictSUIDSGID=yes
SystemCallFilter=@system-service @module @raw-io
SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
LockPersonality=yes
IPAddressDeny=any
Well, then it's not that on your system.
(Debian has the watchdog btw, Fedora apparently not.)
Still experiencing this problem. In my case, it ONLY affects WiFi; Ethernet connections continue normally. See also #5508
I haven't had this happen to me for a while. And I haven't had it happen to me at all since upgrading to 4.1 (Lenovo T530). My sys-net is on fedora-35
Fedora 35 is EOL, FYI
Qubes OS version:
R3.2
Affected component(s):
sys-net
Steps to reproduce the behavior:
Expected behavior:
sys-net stays on and continues to provide network access normally.
Actual behavior:
Occasionally:
sys-net looks like it's trying to connect for a second, then the entire sys-net just dies (shows powered off in Qubes Manager).
sys-firewall and AppVMs using sys-firewall for network access are still running normally, but of course they don't have network access.
sys-net does not restore network access to these other AppVMs.
sys-net. (Maybe restarting just sys-net and sys-firewall would be enough, but in practice it's easier for me just to shut them all down and restart them all.)
General notes:
This has been going on for at least a few months. When reading #4657, I saw that it mentioned this problem with sys-net in passing. I thought we already had an issue on this (see below) but couldn't find one, so I'm filing this now.
Related issues:
I could have sworn we already had an issue about this, but after searching, I can't find one. These all look different:
#2964 was about losing network access when sys-net stays on
#3008/#3030 was about failing to connect to a network after resume when sys-net stays on
#3151 was about NetworkManager not running in sys-net after resume when sys-net stays on
#3738 was about the entire computer not resuming correctly from suspend.
Ah, maybe I was thinking of #4042, which is a similar report about sys-usb.