rust-vmm / vhost-device

'vhost-user' device backends workspace
Apache License 2.0

sound: test_pipewire_backend_invalid_stream panics with "Failed to connect to core: CreationFailed" #647

Open epilys opened 7 months ago

epilys commented 7 months ago

Test panics:

2024-04-09 12:11:58 EEST    failures:
2024-04-09 12:11:58 EEST    
2024-04-09 12:11:58 EEST    ---- audio_backends::pipewire::tests::test_pipewire_backend_invalid_stream stdout ----
2024-04-09 12:11:58 EEST    INFO: dbus_session_bus_address=unix:path=/tmp/.tmpJ52ZlF/dbus
2024-04-09 12:11:58 EEST    INFO: Wait for dbus to setup...
2024-04-09 12:11:58 EEST    INFO: Launch pipewire.
2024-04-09 12:11:58 EEST    INFO: Wait for pipewire to setup...
2024-04-09 12:11:58 EEST    thread 'audio_backends::pipewire::tests::test_pipewire_backend_invalid_stream' panicked at vhost-device-sound/src/audio_backends/pipewire.rs:97:42:
2024-04-09 12:11:58 EEST    Failed to connect to core: CreationFailed
2024-04-09 12:11:58 EEST    INFO: Killing pipewire pid 4573
2024-04-09 12:11:58 EEST    ERROR: pipewire stderr [E][00324.655638] mod.rt       | [     module-rt.c:  236 pw_rtkit_bus_get()] Failed to connect to system bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
2024-04-09 12:11:58 EEST    [W][00324.655651] mod.rt       | [     module-rt.c: 1055 pipewire__module_init()] Realtime scheduling disabled: unsufficient realtime privileges, Portal not found on session bus, and no system bus for RTKit: Connection refused
2024-04-09 12:11:58 EEST    [W][00324.658561] default      | [        thread.c:  101 impl_acquire_rt()] acquire_rt thread:0x763141eea640 prio:-1 not implemented
2024-04-09 12:11:58 EEST    
2024-04-09 12:11:58 EEST    INFO: Killing Dbus session 4415
2024-04-09 12:11:58 EEST    INFO: dbus stdout unix:path=/tmp/.tmpJ52ZlF/dbus,guid=e7164449469d64fe0d7646386615065c

From this CI run: (link may expire in the future)

https://buildkite.com/rust-vmm/vhost-device-ci/builds/2362#018ec21f-9400-4792-9cc7-0d3f336e7120

dorindabassey commented 3 months ago

Can you reproduce this on the CI? It looks like a one-time failure, which I think is caused by:

2024-04-09 12:11:58 EEST ERROR: pipewire stderr [E][00324.655638] mod.rt | [ module-rt.c: 236 pw_rtkit_bus_get()] Failed to connect to system bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory

Also, I find it weird that test_pipewire_backend_success is passing.

epilys commented 3 months ago

Other failures with the same error:

2024-07-29 18:03:40 EEST    audio_backends::pipewire::tests::test_pipewire_backend_invalid_stream
2024-07-29 18:03:40 EEST    audio_backends::pipewire::tests::test_pipewire_backend_success

https://buildkite.com/rust-vmm/vhost-device-ci/builds/2577#0190ff03-fec1-4b56-baa9-036d8558e628

2024-07-22 08:24:21 EEST    failures:
2024-07-22 08:24:21 EEST    audio_backends::pipewire::tests::test_pipewire_backend_invalid_stream

https://buildkite.com/rust-vmm/vhost-device-ci/builds/2528#0190d8e5-05e6-4285-b712-7fa32325b738

2024-07-15 10:39:14 EEST    failures:
2024-07-15 10:39:14 EEST    audio_backends::pipewire::tests::test_pipewire_backend_invalid_stream

https://buildkite.com/rust-vmm/vhost-device-ci/builds/2502#0190b554-6579-464f-aa85-5637629753c5

epilys commented 3 months ago

Perhaps it's a timing issue (we may need to wait a few milliseconds, or up to a second, for the bus to be set up). What do you think, @dorindabassey?

dorindabassey commented 3 months ago

Yeah, it's probably a timing issue, because I tested it locally with the CI image and it passes. So maybe add a condition to wait a few seconds before running this set of tests? I'm not sure what the best approach is to solve this problem.

epilys commented 3 months ago

You can try running it locally again and again for an hour or more until it fails to confirm it happens, then add a sleep 5 before the test starts to see if it makes a difference.
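
If a fixed sleep does turn out to help, one possible refinement is to poll for readiness with a deadline instead of always waiting the full interval. The following is only a sketch under assumptions: `wait_until` and `wait_for_socket` are hypothetical names that do not exist in vhost-device-sound, and a socket file appearing on disk is only a rough proxy for the daemon being ready to accept connections.

```rust
use std::path::Path;
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: polls `ready` until it returns true or `timeout`
/// elapses. Returns true if the condition was met before the deadline.
fn wait_until<F: FnMut() -> bool>(mut ready: F, timeout: Duration, interval: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if ready() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(interval);
    }
}

/// Hypothetical helper: waits for a socket file to appear on disk.
/// Assumption: the path is whatever socket the test harness expects;
/// it is not taken from the real test code.
fn wait_for_socket(path: &Path, timeout: Duration) -> bool {
    wait_until(|| path.exists(), timeout, Duration::from_millis(50))
}
```

A test setup could call something like `wait_for_socket(&socket_path, Duration::from_secs(5))` and fail with a clear message on timeout, which would also make a slow bus startup easier to tell apart from other failures.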

stefano-garzarella commented 3 days ago

Could be related:

https://buildkite.com/organizations/rust-vmm/pipelines/vhost-device-ci/builds/2835/jobs/01934913-4a54-4238-bd0b-014e7135e891/log

epilys commented 3 days ago

It seems like buildkite is cancelling it (IIRC there are no timeouts in cargo tests):

# Received cancellation signal, interrupting

I wonder if there's a way to improve test failure reporting on CI. Could we compile with -Cpanic=abort and RUST_BACKTRACE=full? And hopefully buildkite's cancellation triggers a panic/abort.
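
As a rough illustration of the reporting idea, and not something the project currently does, a test could install a panic hook that always prints a backtrace, regardless of how `RUST_BACKTRACE` is set. The function name `install_verbose_panic_hook` is hypothetical. Note that a panic hook only runs on panics; it would not fire if the CI runner terminates the process with a signal.

```rust
use std::backtrace::Backtrace;
use std::panic;

/// Hypothetical helper: installs a panic hook that prints the panic message
/// and a backtrace to stderr. `Backtrace::force_capture` captures a
/// backtrace even when RUST_BACKTRACE is unset.
fn install_verbose_panic_hook() {
    panic::set_hook(Box::new(|info| {
        let backtrace = Backtrace::force_capture();
        eprintln!("panic: {info}\nbacktrace:\n{backtrace}");
    }));
}
```

The hook is process-wide, so calling it once at the start of a test binary would affect every test that panics afterwards.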