Closed maksimdrachov closed 1 year ago
The above warning-message appears. I don't see why though
IIRC when you cancel a coroutine, it throws an asyncio.CancelledError
which has to be consumed somehow; otherwise, you get the warning.
Now comes the confusing part, the previous the unit tests should indicate that the heartbeat mechanism works correctly, however it starts to mess up with the last 2 unit tests?
This means that some resources have not been finalized in the previous tests.
Ok, class pay attention!
If you need to schedule a delayed callback with a regular function, you can do the following:
self._timer = loop.call_at(self._next_timeout, self._on_timeout)
However, if you'd like to schedule a delayed callback with a coroutine, you'll have to do the following:
self._timer = loop.call_at(self._next_timeout, lambda: self._on_timeout())
If you do the following instead:
self._timer = loop.call_at(self._next_timeout, asyncio.create_task, self._on_timeout())
It will work, however if you try to cancel this task it will issue a warning:
RuntimeWarning: coroutine 'RaftNode._on_timeout' was never awaited
(Due to the task already being scheduled, and calling cancel on the TimerHandle
won't change this fact)
If you need to schedule a delayed callback with a regular function, you can do the following:
Makes sense.
However, if you'd like to schedule a delayed callback with a coroutine, you'll have to do the following:
Here, self._on_timeout
is equivalent to lambda: self._on_timeout()
aside from the extra level of indirection in the latter. The lambda will return an awaitable item that will have to be awaited by the caller, but the timer implementation will not do it. Why does this work?
Sorry, correction:
self._timer = loop.call_at(self._next_timeout, lambda: self._on_timeout())
should be:
self._timer = loop.call_at(
self._next_timeout, lambda: asyncio.create_task(self._on_timeout())
)
Is this not equivalent to:
self._timer = loop.call_at(
self._next_timeout, asyncio.create_task, self._on_timeout()
)
EDIT: disregard that, it is not. The time of invocation of on_timeout is different.
The leader election "works", however there's a couple of small issues I'm still struggling with. Fundamentally they are all related to scheduling/cancelling the delayed callbacks (for term/election timeouts).
Running
tests/raft_leader_election.py
:1)
pytest -k _unittest_raft_fsm_1 --pdb
The above warning-message appears. I don't see why though. At the end I'm using
cancel()
; which should take care care of cancelling both callbacks:https://github.com/OpenCyphal-Garage/cyraft/blob/a8e2121532546285fbd43818ed6bfa7a46ffd62b/cyraft/node.py#L717-L727
The thing is, it does take care of cancelling those callbacks, for example if add an additional 10 seconds of sleep (expecting that if the callback was not cancelled properly it would have enough time to execute), it doesn't execute anything:
(Nothing happens.)
2) Now comes the confusing part, the previous the unit tests should indicate that the heartbeat mechanism works correctly, however it starts to mess up with the last 2 unit tests? For example running:
pytest -k _unittest_raft_fsm_3 --pdb
Even though node 43 gets the heartbeat message as expected, it resets the election timeout timer, it still somehow ends up timing out on the election timeout set at the start of the unit test?
Am I missing something?