humbertogontijo / python-roborock

Python library & console tool for controlling Roborock vacuum
GNU General Public License v3.0
63 stars 29 forks source link

InvalidStateError in RoborockFuture using roborock client #222

Open allenporter opened 2 months ago

allenporter commented 2 months ago

I don't know the exact steps to reproduce this, but noticed this while changing a device settings in HomeAssistant. It seems like this is an internal consistency issue and not something that that the caller should be able to introduce, but i'm not positive.

2024-08-20 03:49:28.365 ERROR (MainThread) [homeassistant] Error doing job: Exception in callback RoborockFuture._resolve((None, None)) (None):   File "/usr/local/lib/python3.12/asyncio/base_events.py", line 639, in run_forever
    self._run_once()
  File "/usr/local/lib/python3.12/asyncio/base_events.py", line 1977, in _run_once
    handle._run()
  File "/usr/local/lib/python3.12/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/local/lib/python3.12/asyncio/selector_events.py", line 970, in _read_ready
    self._read_ready_cb()
  File "/usr/local/lib/python3.12/asyncio/selector_events.py", line 1027, in _read_ready__data_received
    self._protocol.data_received(data)
  File "/home/vscode/.local/lib/python3.12/site-packages/roborock/local_api.py", line 39, in data_received
    self.on_message_received(parser_msg)
  File "/home/vscode/.local/lib/python3.12/site-packages/roborock/version_1_apis/roborock_client_v1.py", line 444, in on_message_received
    queue.resolve((data.payload, None))
  File "/home/vscode/.local/lib/python3.12/site-packages/roborock/roborock_future.py", line 22, in resolve
    self.loop.call_soon_threadsafe(self._resolve, item)
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/home/vscode/.local/lib/python3.12/site-packages/roborock/roborock_future.py", line 19, in _resolve
    self.fut.set_result(item)
asyncio.exceptions.InvalidStateError: invalid state

I believe this happened when i adjusted the same property two times in a row in a short succession (with a short pause between)

Lash-L commented 2 months ago

I have also seen this issue but have not had a good way to recreate or debug. Would be open to any suggestions or even some debug improvements to figure out the cause

allenporter commented 1 month ago

My honest take is this code is quite complex and needs to be simplified. I would suggest:

Lash-L commented 3 weeks ago

My honest take is this code is quite complex and needs to be simplified. I would suggest:

  • making breaking changes to drop all the sync APIs and focus on simpler async APIs.
  • reduce the class hierarhcy so there are not multiple base classes involved, and use composition rather than inheritance.
  • adding logging for each outgoing request seq id and received resopnse seq id could help show the issue happening in more detail.
  • add guards to not send a message with a request id that has already been allocated. One example of a potential bug is that the choosing of a request id can pick an id that already exists since its just picking random numbers.
  • Add an explicit guard here when the future is already completed.
  • Add a guard in _async_response so that it doesn't overwrite an existing future if the request id is already in use
  • There are cases where a response is received and it checks against the queue and pulls out the message, but then if it doesn't already exist it continues anyway. Why would this ever be allowed to happen? seems like a bug or malformed response so it also needs logging rather than being ignored since it could hide concurrency bugs
  • Remove the multiple levels of futures happening between asyncio.ensure_future then a sync method _async_response which calls an async method _wait_response which directly awaits on a future. I don't think all these levels of indirection are needed
  • Don't pass around tuples of (response, exception). Just use the future built in exceptions

That's probably all fair. It's just kind of been the nature of constantly reverse engineering and learning about the underlying api and most improvements being iterative opposed to a clean rewrite.

For instance, at first we didn't know how to do local api, then we figured that out. The same thing with the drastically different A01 api vs the V1 api. Things like that where slowly changes were made and the code got more and more complex.