Closed link2xt closed 1 month ago
Example of a run that got stuck:
Example of a run that finished successfully:
The main difference is that in the failed run the event
DEBUG root:rpc.py:180 account_id=1 got an event {'data': [30], 'kind': 'WebxdcRealtimeData', 'msgId': 13}
never arrives. Message [30] is never delivered, so somehow ac2 successfully sends message [40] for the second webxdc, but communication from ac2 to ac1 for the first webxdc does not work.
At the same time, tests/test_iroh_webxdc.py::test_realtime_simultaneously, which tests the same thing with only one webxdc, never fails.
I collected logs with RUST_LOG='iroh=trace'. The failing test run contains these lines, which the successful run doesn't:
2024-07-31T01:29:34.381335Z TRACE iroh_quinn_proto::connection: connection closed
...
2024-07-31T01:29:34.381562Z TRACE drive{id=1}:send{space=Data pn=1}: iroh_quinn_proto::connection: sending CONNECTION_CLOSE
...
2024-07-31T01:29:34.386909Z TRACE drive{id=1}:recv{space=Data pn=1}: iroh_quinn_proto::connection: got frame Close(Application(ApplicationClose { error_code: 0, reason: b"" }))
2024-07-31T01:29:34.386915Z TRACE drive{id=1}: iroh_quinn_proto::connection: connection closed
2024-07-31T01:29:34.386923Z TRACE drive{id=1}:send{space=Data pn=1}: iroh_quinn_proto::connection: sending CONNECTION_CLOSE
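A comparison like the one above can be roughly automated. The following is only a sketch: it assumes both runs were captured as lists of log lines and that each trace line starts with a timestamp in the format shown above. The names `strip_timestamp` and `lines_only_in` are hypothetical helpers, not part of the test suite. Lines that embed varying values such as packet numbers or connection ids will still show up as differences, so the result needs manual review.

```python
import re

# Leading timestamp as it appears in the trace output above,
# e.g. "2024-07-31T01:29:34.381335Z ".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z\s+")


def strip_timestamp(line):
    """Drop the trailing newline and the leading timestamp, if any."""
    return TIMESTAMP.sub("", line.rstrip("\n"))


def lines_only_in(failed, ok):
    """Return messages present in the failed run but absent from the ok run.

    Timestamps are ignored, order of first appearance is kept,
    and each distinct message is reported once.
    """
    seen = {strip_timestamp(line) for line in ok}
    result = []
    for line in failed:
        message = strip_timestamp(line)
        if message not in seen:
            seen.add(message)
            result.append(message)
    return result
```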
Also, searching for "accepting iroh connection" shows that on the successful run a connection is accepted only once:
DEBUG root:rpc.py:180 account_id=1 got an event {'kind': 'Info', 'msg': 'src/peer_channels.rs:435: IROH_REALTIME: accepting iroh connection'}
But on the failed run a connection is accepted by the second account as well:
DEBUG root:rpc.py:180 account_id=1 got an event {'kind': 'Info', 'msg': 'src/peer_channels.rs:435: IROH_REALTIME: accepting iroh connection'}
DEBUG root:rpc.py:180 account_id=2 got an event {'kind': 'Info', 'msg': 'src/peer_channels.rs:435: IROH_REALTIME: accepting iroh connection'}
DEBUG root:rpc.py:180 account_id=1 got an event {'kind': 'Info', 'msg': 'src/peer_channels.rs:435: IROH_REALTIME: accepting iroh connection'}
Apparently the connection from ac2 to ac1 is closed on the failed run, and then ac1 has to establish a connection to ac2?
I don't see any reason in the logs for why the connection is closed. They just say "connection closed", with no higher-level logging of why the decision to close the connection was made.
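To check this quickly on other runs, the accept events can be counted per account. This is a sketch that assumes the pytest debug log has the same line shape as the excerpts above (`account_id=N got an event ... accepting iroh connection`); `count_accepts` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the debug lines quoted above and captures the account id.
ACCEPT = re.compile(r"account_id=(\d+) got an event .*accepting iroh connection")


def count_accepts(lines):
    """Count "accepting iroh connection" events per account id."""
    counts = Counter()
    for line in lines:
        match = ACCEPT.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

On the excerpts above, the successful run would yield one accept for account 1 only, while the failed run would yield two accepts for account 1 and one for account 2.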
Can't reproduce on 1.142.4 with https://github.com/deltachat/deltachat-core-rust/pull/5860 merged.
I ran
while pytest tests/test_iroh_webxdc.py::test_two_parallel_realtime_simultaneously; do :; done
for a while and saw no failures.
Closing as not reproducible. If it fails again in CI we can reopen the issue or create a new one.
It fails in CI sometimes, but can also be reproduced by running
scripts/make-rpc-testenv.sh
then
. venv/bin/activate
and then
pytest tests/test_iroh_webxdc.py::test_two_parallel_realtime_simultaneously
multiple times until the test gets stuck.
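Since the failure mode is a hang rather than a failed assertion, a plain retry loop never returns once the test gets stuck. A possible way to repeat the test unattended is to put a timeout around each run. This is a minimal sketch, assuming it is started from the activated venv; `run_until_stuck` is a hypothetical helper, and the timeout and run count are arbitrary values to be chosen by whoever runs it.

```python
import subprocess
import sys

# Same test as above, invoked through the current interpreter so that the
# activated venv's pytest is used.
PYTEST_CMD = [
    sys.executable,
    "-m",
    "pytest",
    "tests/test_iroh_webxdc.py::test_two_parallel_realtime_simultaneously",
]


def run_until_stuck(cmd, timeout, max_runs):
    """Run cmd up to max_runs times.

    Returns (attempt, "stuck") for the first run exceeding timeout seconds,
    (attempt, "failed") for the first run with a non-zero exit code,
    or None if every run finished successfully.
    """
    for attempt in range(1, max_runs + 1):
        try:
            # subprocess.run kills the child process when the timeout expires.
            result = subprocess.run(cmd, timeout=timeout)
        except subprocess.TimeoutExpired:
            return attempt, "stuck"
        if result.returncode != 0:
            return attempt, "failed"
    return None


# Example (values are assumptions):
# print(run_until_stuck(PYTEST_CMD, timeout=120, max_runs=100))
```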