QubesOS / qubes-issues

The Qubes OS Project issue tracker
https://www.qubes-os.org/doc/issue-tracking/
536 stars 47 forks source link

[Build 2020033017-4.1] openQA test fails in TC_00_AppVM_fedora-30.test_097_qrexec_service_socket_vm_eof_reverse #5749

Open marmarek opened 4 years ago

marmarek commented 4 years ago

Observation

openQA test in scenario qubesos-4.1-pull-requests-x86_64-system_tests_basic_vm_qrexec_gui@64bit fails in TC_00_AppVM_fedora-30

# test_097_qrexec_service_socket_vm_eof_reverse
# failure: 

Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/qubes/tests/__init__.py", line 501, in cleanup_loop
    == int(self.loop._ssock is not None)
AssertionError

# system-out: 

2020-03-31 00:49:48.426 qrexec-client[221655]: process_io.c:181:process_io: vchan connection closed early (fds: 1 0 -1, status: -1 -1)

Test suite description

This test is supposed to test VM socket service - if doing shutdown(SHUT_WR) on the socket there is transferred as EOF to dom0 (qrexec-client).

Reproducible

Fails since (at least) Build 2020032011-4.1

Expected result

Last good: (unknown) (or more recent)

Further details

Always latest result in this scenario: latest

The actual failure message is weird, as it's about FD leak, not timeout on waiting for EOF. But the above qrexec-client output suggests something is wrong.

cc @pwmarcz

marmarek commented 4 years ago

I'm not entirely sure if the qrexec-client message is really about the test call and not some auxiliary call during normal VM life cycle. In the very next test I see:

qubes.tests.integ.vm_qrexec_gui/TC_00_AppVM_fedora-30/test_098_qrexec_service_socket_vm_eof
Test for EOF transmission dom0->VM(socket) ... CRITICAL:qubes.tests.integ.vm_qrexec_gui.TC_00_AppVM_fedora-30.test_098_qrexec_service_socket_vm_eof:starting
2020-03-31 00:49:51.003 qrexec-client[221658]: process_io.c:181:process_io: vchan connection closed early (fds: -1 0 -1, status: -1 -1)
WARNING:vm.test-inst-vm1:Activating the test-inst-vm1 VM

A bit early to be about this test, maybe some remainder of the previous test?

marmarek commented 4 years ago

I have ran those tests locally. They both passed, but I still got exactly the same qrexec-client messages. And only for those two tests (097 and 098), not the others, so it's unlikely side effect of just VM startup.

pwmarcz commented 4 years ago

Are you sure these are not cleanup issues? I don't see any message about timeout in logs.

In test_097_qrexec_service_socket_vm_eof_reverse, you are creating two qrexec-client processes (service_proc, run_service) but not waiting for any of them to finish (and they will execute for 15 seconds). That would account for the message about vchan connection close.

marmarek commented 4 years ago

I have added cleanup of service_proc to tearDown, but apparently I have forgot to create PR about it: https://github.com/QubesOS/qubes-core-admin/pull/332. Let me see if that helps.

marmarek commented 4 years ago

With this cleanup the same: tests succeeded, but with this message present (only for those tests):

dom0: qubes.tests.integ.vm_qrexec_gui/TC_00_AppVM_fedora-30/test_097_qrexec_service_socket_vm_eof_reverse
  Test for EOF transmission VM(socket)->dom0 ... 2020-04-01 08:57:43.428 qrexec-client[786211]: process_io.c:187:process_io: vchan connection closed early (fds: -1 0 -1, status: -1 -1)
ok
dom0: qubes.tests.integ.vm_qrexec_gui/TC_00_AppVM_fedora-30/test_098_qrexec_service_socket_vm_eof
  Test for EOF transmission dom0->VM(socket) ... 2020-04-01 08:58:32.875 qrexec-client[788149]: process_io.c:187:process_io: vchan connection closed early (fds: 1 -1 -1, status: -1 -1)
ok