QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/

`qubes.SuspendPost` fails with vchan connection timeout #5702

Closed pwmarcz closed 11 months ago

pwmarcz commented 4 years ago

Qubes OS version

R4.0

Computer: Lenovo X230

Affected component(s) or functionality

Resume from suspend, sys-net, sys-usb

Brief summary

After resuming from suspend, Qubes services related to USB and networking fail to come back up (e.g. lsusb prints nothing in sys-usb, and sys-net has no external network interfaces).

To Reproduce

  1. Suspend the machine (by closing lid)
  2. Resume (by opening lid)
  3. Unlock screen
  4. Open network manager in system tray
  5. Try to use USB input devices

Expected behavior

I should be able to connect to the network using Network Manager, and use USB devices.

Actual behavior

Network Manager says that networking is not available. ifconfig in sys-net shows only internal interfaces (lo and vif6.0). USB devices are not available according to the Qubes device manager, and lsusb in sys-usb prints nothing. This happens for me on about 50% of resumes.
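The sys-net symptom above (only lo and vif6.0 visible) can be checked mechanically after each resume. The following is a minimal sketch, not part of any Qubes tooling: the helper names `external_interfaces` and `list_interfaces` are hypothetical, and the rule "anything that is not `lo` and does not start with `vif` counts as external" is an assumption based only on the interface names mentioned in this report.

```python
import os


def external_interfaces(names):
    """Return interface names that are neither loopback nor Xen vif backends.

    Assumption: internal interfaces are exactly "lo" and names starting
    with "vif", as seen in this report.
    """
    return sorted(n for n in names if n != "lo" and not n.startswith("vif"))


def list_interfaces(sysfs_dir="/sys/class/net"):
    """List the network interfaces the Linux kernel currently exposes."""
    return os.listdir(sysfs_dir)


if __name__ == "__main__":
    found = external_interfaces(list_interfaces())
    if found:
        print("external interfaces present:", ", ".join(found))
    else:
        print("no external interfaces (matches the failure described above)")
```

Run inside sys-net after a resume, this would distinguish a good resume (for example, `wls7` is listed) from a failed one (empty result).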

Additional context

Here is the relevant part of the dom0 qubesd log, just after resume.

Mar 03 11:50:53 dom0 qubesd[6943]: Task exception was never retrieved
Mar 03 11:50:53 dom0 qubesd[6943]: future: <Task finished coro=<QubesVM.resume() done, defined at /usr/lib/python3.5/site-packages/qubes/vm/qubesvm.py:1277> exception=CalledProcessError(1, 'qubes.SuspendPost', b'', b'vchan connection timeout\n')>
Mar 03 11:50:53 dom0 qubesd[6943]: Traceback (most recent call last):
Mar 03 11:50:53 dom0 qubesd[6943]:   File "/usr/lib64/python3.5/asyncio/tasks.py", line 240, in _step
Mar 03 11:50:53 dom0 qubesd[6943]:     result = coro.send(None)
Mar 03 11:50:53 dom0 qubesd[6943]:   File "/usr/lib/python3.5/site-packages/qubes/vm/qubesvm.py", line 1290, in resume
Mar 03 11:50:53 dom0 qubesd[6943]:     user='root')
Mar 03 11:50:53 dom0 qubesd[6943]:   File "/usr/lib/python3.5/site-packages/qubes/vm/qubesvm.py", line 1387, in run_service_for_stdio
Mar 03 11:50:53 dom0 qubesd[6943]:     args[0], *stdouterr)
Mar 03 11:50:53 dom0 qubesd[6943]: subprocess.CalledProcessError: Command 'qubes.SuspendPost' returned non-zero exit status 1
Mar 03 11:50:53 dom0 qubesd[6943]: Task exception was never retrieved
Mar 03 11:50:53 dom0 qubesd[6943]: future: <Task finished coro=<QubesVM.resume() done, defined at /usr/lib/python3.5/site-packages/qubes/vm/qubesvm.py:1277> exception=CalledProcessError(1, 'qubes.SuspendPost', b'', b'vchan connection timeout\n')>
Mar 03 11:50:53 dom0 qubesd[6943]: Traceback (most recent call last):
Mar 03 11:50:53 dom0 qubesd[6943]:   File "/usr/lib64/python3.5/asyncio/tasks.py", line 240, in _step
Mar 03 11:50:53 dom0 qubesd[6943]:     result = coro.send(None)
Mar 03 11:50:53 dom0 qubesd[6943]:   File "/usr/lib/python3.5/site-packages/qubes/vm/qubesvm.py", line 1290, in resume
Mar 03 11:50:53 dom0 qubesd[6943]:     user='root')
Mar 03 11:50:53 dom0 qubesd[6943]:   File "/usr/lib/python3.5/site-packages/qubes/vm/qubesvm.py", line 1387, in run_service_for_stdio
Mar 03 11:50:53 dom0 qubesd[6943]:     args[0], *stdouterr)
Mar 03 11:50:53 dom0 qubesd[6943]: subprocess.CalledProcessError: Command 'qubes.SuspendPost' returned non-zero exit status 1

It seems that qubes.SuspendPost failed for the machine because of a vchan connection timeout.
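Since the failure is intermittent, it may help to count how many `qubes.SuspendPost` calls fail on each resume. Below is a minimal sketch that pulls the service name, exit status, and stderr out of the `future:` lines quoted above. The regular expression is an assumption derived only from those quoted lines, and `find_failed_services` is a hypothetical helper, not a qubesd API.

```python
import re

# Matches the CalledProcessError repr seen in the "future:" log lines.
# Assumption: the format is exactly as in the lines quoted in this report.
FAILURE_RE = re.compile(
    r"exception=CalledProcessError\((?P<code>\d+), '(?P<service>[^']+)', "
    r"b'(?P<stdout>[^']*)', b'(?P<stderr>[^']*)'\)"
)


def find_failed_services(log_text):
    """Return a list of (service, exit_code, stderr) for each failed call."""
    failures = []
    for line in log_text.splitlines():
        match = FAILURE_RE.search(line)
        if match:
            # The log shows an escaped "\n"; turn it into a real newline
            # and strip it so the message is easy to compare.
            stderr = match.group("stderr").replace("\\n", "\n").strip()
            failures.append(
                (match.group("service"), int(match.group("code")), stderr)
            )
    return failures


# One of the lines from the dom0 log above, kept verbatim as a raw string.
SAMPLE = r"""Mar 03 11:50:53 dom0 qubesd[6943]: future: <Task finished coro=<QubesVM.resume() done, defined at /usr/lib/python3.5/site-packages/qubes/vm/qubesvm.py:1277> exception=CalledProcessError(1, 'qubes.SuspendPost', b'', b'vchan connection timeout\n')>"""

if __name__ == "__main__":
    for service, code, stderr in find_failed_services(SAMPLE):
        print(f"{service} exited with {code}: {stderr}")
```

Feeding it the saved qubesd journal text from dom0 would give one entry per failed call, which makes it easier to compare resumes that worked with resumes that did not.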

Here is part of the journalctl output for sys-net, from just before and just after resume. It is heavily snipped; I can dig out more details if I know what to look for:

Mar 02 17:24:42 sys-net systemd[1]: Started Network Manager Script Dispatcher Service.
Mar 02 17:24:42 sys-net nm-dispatcher[3071]: req:1 'down' [wls7]: start running ordered scripts...
...
Mar 02 17:24:43 sys-net systemd[1]: session-c7.scope: Succeeded.
Mar 02 17:24:43 sys-net qrexec-agent[472]: eintr
Mar 03 11:50:53 sys-net kernel: Freezing user space processes ... (elapsed 0.013 seconds) done.
Mar 03 11:50:53 sys-net kernel: OOM killer disabled.
Mar 03 11:50:53 sys-net kernel: Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done.
Mar 03 11:50:53 sys-net kernel: suspending xenstore...
Mar 03 11:50:54 sys-net kernel: Xen Platform PCI: I/O protocol version 1
Mar 03 11:50:54 sys-net kernel: xen:grant_table: Grant tables using version 1 layout
Mar 03 11:50:54 sys-net kernel: OOM killer enabled.
Mar 03 11:50:54 sys-net kernel: Restarting tasks ... done.
...
Mar 03 11:50:55 sys-net qrexec-agent[472]: executed root:QUBESRPC qubes.SuspendPostAll dom0 pid 3155
...
Mar 03 11:50:55 sys-net systemd[778]: run-user-0.mount: Succeeded.
...
Mar 03 11:50:58 sys-net systemd[3179]: Reached target Main User Target.
Mar 03 11:50:58 sys-net systemd[3179]: Startup finished in 537ms.
Mar 03 11:50:58 sys-net systemd[1]: Started User Manager for UID 0.
Mar 03 11:50:58 sys-net audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=user@0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 03 11:50:58 sys-net systemd[1]: Started Session c8 of user root.
...
Mar 03 11:50:58 sys-net systemd[1]: session-c8.scope: Succeeded.
Mar 03 11:50:58 sys-net qrexec-agent[472]: send exit code 0
Mar 03 11:50:58 sys-net qrexec-agent[472]: pid 3155 exited with 0
Mar 03 11:50:58 sys-net qrexec-agent[472]: eintr

It seems odd to me that the messages about suspending different services are logged after the computer wakes up. The logs for other machines look similar.

Solutions you've tried

Relevant documentation you've consulted

https://www.qubes-os.org/doc/wireless-troubleshooting/ (seems not to be my case?)

Related, non-duplicate issues

#4658 (but it's about sys-net actually dying)

#4457 (seems to be a similar case with Windows?)

adrelanos commented 4 years ago

https://github.com/QubesOS/qubes-issues/issues/4892 might be related.

github-actions[bot] commented 11 months ago

This issue is being closed because:

If anyone believes that this issue should be reopened and reassigned to an active milestone, please leave a brief comment. (For example, if a bug still affects Qubes OS 4.1, then the comment "Affects 4.1" will suffice.)