iMicknl opened this issue 4 years ago
@tetienne, @vlebourl what do you think: backoff, where we add a debounce time of x seconds with a few retries max. Any other suggestions?
I am currently pursuing option 1 via https://github.com/iMicknl/python-tahoma-api/pull/48. This is the easiest to implement; however, it has the following limitations:
I have 14 covers. What can I do to close/open them all at the same time? I am getting exactly the same error :(
@kaczkalolo have you been encountering this problem for a while, or is this a new issue? There was another report yesterday (https://github.com/iMicknl/ha-tahoma/issues/378).
For now, I would advise adding a delay in your automations in Home Assistant... This will be way more reliable, since this is an issue on the Somfy TaHoma side.
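A minimal sketch of that workaround, with hypothetical cover entity names and an arbitrary delay; the idea is simply to stagger the commands so the gateway never runs too many executions at once:

```yaml
automation:
  - alias: "Open side rollers staggered"
    trigger:
      - platform: time
        at: "07:00:00"
    action:
      # Open the covers a few seconds apart instead of all at once,
      # so the TaHoma gateway never sees too many running executions.
      - service: cover.open_cover
        target:
          entity_id: cover.roller_side_1
      - delay: "00:00:10"
      - service: cover.open_cover
        target:
          entity_id: cover.roller_side_2
      - delay: "00:00:10"
      - service: cover.open_cover
        target:
          entity_id: cover.roller_side_3
```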
I had this for a while but never really had time to debug it; I finally found it. I don't use an automation for that, only a group.
Hi,
I have the same problem, but for me it doesn't even take 10 long-running operations.
At 06:00 the rollers at the front side of the house are opened. This always works without errors and always exactly at 06:00.
At 07:00 the rollers at the side of the house are scheduled to open. This goes wrong almost every time. I get this error:
2021-03-17 07:00:00 INFO (MainThread) [backoff] Backing off execute_command(...) for 0.7s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:00 INFO (MainThread) [backoff] Backing off execute_command(...) for 0.6s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:00 INFO (MainThread) [backoff] Backing off execute_command(...) for 0.3s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:00 INFO (MainThread) [backoff] Backing off execute_command(...) for 0.3s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:00 INFO (MainThread) [backoff] Backing off execute_command(...) for 1.6s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:00 INFO (MainThread) [backoff] Backing off execute_command(...) for 3.7s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:00 INFO (MainThread) [backoff] Backing off execute_command(...) for 0.4s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:01 INFO (MainThread) [backoff] Backing off execute_command(...) for 0.2s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:01 INFO (MainThread) [backoff] Backing off execute_command(...) for 0.4s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:02 INFO (MainThread) [backoff] Backing off execute_command(...) for 11.3s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:02 INFO (MainThread) [backoff] Backing off execute_command(...) for 0.0s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:02 INFO (MainThread) [backoff] Backing off execute_command(...) for 0.2s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:02 INFO (MainThread) [backoff] Backing off execute_command(...) for 14.4s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:04 INFO (MainThread) [backoff] Backing off execute_command(...) for 5.7s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:10 INFO (MainThread) [backoff] Backing off execute_command(...) for 3.7s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:13 INFO (MainThread) [backoff] Backing off execute_command(...) for 6.3s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:14 INFO (MainThread) [backoff] Backing off execute_command(...) for 8.8s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:17 INFO (MainThread) [backoff] Backing off execute_command(...) for 27.1s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:19 INFO (MainThread) [backoff] Backing off execute_command(...) for 13.7s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:23 INFO (MainThread) [backoff] Backing off execute_command(...) for 4.1s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:27 INFO (MainThread) [backoff] Backing off execute_command(...) for 80.1s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:30 DEBUG (MainThread) [custom_components.tahoma] Finished fetching device events data in 0.037 seconds
2021-03-17 07:00:33 INFO (MainThread) [backoff] Backing off execute_command(...) for 116.1s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:44 INFO (MainThread) [backoff] Backing off execute_command(...) for 3.2s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:00:47 INFO (MainThread) [backoff] Backing off execute_command(...) for 76.8s (pyhoma.exceptions.TooManyExecutionsException: Server busy, please try again later. (Too many executions))
2021-03-17 07:01:00 DEBUG (MainThread) [custom_components.tahoma] Finished fetching device events data in 0.111 seconds
2021-03-17 07:01:30 DEBUG (MainThread) [custom_components.tahoma.coordinator] EventName.DEVICE_STATE_CHANGED/None (device: io://0203-9950-4411/11186955, state: None -> None)
2021-03-17 07:01:30 DEBUG (MainThread) [custom_components.tahoma] Finished fetching device events data in 0.132 seconds
The rollers at 06:00 are opened by Node-RED by opening a group:
`cover:`
The rollers at 07:00 (which have the problem) are opened by Node-RED by opening a group:
`cover:`
The rollers are opened a minute or two after the scheduled time. They are also NOT opened at the same time: roller 1 opens, then number 2, and then number 3.
Hi,
I started to see this behavior for 13 shutters a few days back. I now also have this issue:
Jul 07 10:50:49 pi hass[944]: 2021-07-07 10:50:49 INFO (MainThread) [backoff] Backing off execute_commands(...) for 0.2s (pyhoma.exceptions.NotAuthenticatedException: Not authenticated)
Jul 07 10:50:49 pi hass[944]: 2021-07-07 10:50:49 ERROR (MainThread) [custom_components.tahoma.tahoma_entity] Too many requests, try again later : login with REDACTED@gmail.com
I can't even log in to https://www.tahomalink.com/ as I get the same error (Too many requests, try again later : login with REDACTED@gmail.com) in a pop-up message box.
I'm using Google Assistant to open/close all the shutters at once, so I'm not sure how to introduce a delay on my side. Any known workaround?
Hi, I don't think this error is linked to the first one. This one happens when you try to open too many connections to the API in a short period of time. If I'm not mistaken, HA keeps its connection open to reuse it for successive actions.
Waiting 10 minutes before retrying to connect is the only fix I know for that one.
Hi @vlebourl. Thank you for the clarification, and sorry for mixing up the issues.
I also get tons of `Execution queue is full on gateway: #xxxx-xxxx-xxxx (soft limit: 10)` errors, and I now never manage to have all my shutters close and open at the same time, even though this worked for quite some time before.
Any idea how to improve this when not using an automation?
Is there a way to get an event when this kind of error happens (like in #573)? The exception is caught here:
There is only a log message, so the error can't be handled from outside (e.g. from an automation). A more general question: can we get an event (or events) when a command fails (e.g. open cover, close cover, etc.)?
@laszlojakab Can you try the master branch? I've added an event in case of failure. More information here.
@tetienne Sorry, it's not exactly the same error as in the subject of this issue. I get this error:
2021-09-30 23:03:10 ERROR (MainThread) [custom_components.tahoma.executor] Execution queue is full on gateway: #XXXX-XXXX-XXXX (soft limit: 10)
There is no event for this kind of error in the master branch. As I mentioned, this error is logged by line 74 in executor.py.
@laszlojakab I was able to reproduce your error by spamming a script that opens and closes a cover many times.
It seems an event is indeed triggered when a command is not sent correctly:
2021-10-04 15:34:13 DEBUG (MainThread) [custom_components.tahoma.coordinator] Event(timestamp=1633354453054, name=<EventName.EXECUTION_STATE_CHANGED: 'ExecutionStateChangedEvent'>, setupoid=741ca89f-a47b-4ad****-894d-d225c06922b1, owner_key=741ca89f-a47b-4ad****-894d-d225c06922b1, type=1, sub_type=1, time_to_next_state=0, failed_commands=None, failure_type_code=None, failure_type=None, condition_groupoid=None, placeoid=None, label=None, metadata=None, camera_id=None, deleted_raw_devices_count=None, protocol_type=None, gateway_id=None, exec_id='4b83103c-ac10-3e01-18e7-2cb65d75c706', deviceurl=None, device_states=[], old_state=<ExecutionState.INITIALIZED: 'INITIALIZED'>, new_state=<ExecutionState.NOT_TRANSMITTED: 'NOT_TRANSMITTED'>)
Or
2021-10-04 15:34:14 DEBUG (MainThread) [custom_components.tahoma.coordinator] Event(timestamp=1633354453894, name=<EventName.EXECUTION_STATE_CHANGED: 'ExecutionStateChangedEvent'>, setupoid=741ca89f-a47b-4ad****-894d-d225c06922b1, owner_key=741ca89f-a47b-4ad****-894d-d225c06922b1, type=1, sub_type=1, time_to_next_state=-1, failed_commands=[{'deviceurl': 'io://****-*****-*****/5978000', 'rank': 0, 'failure_type': 'CMDCANCELLED'}], failure_type_code=<FailureType.CMDCANCELLED: 106>, failure_type='CMDCANCELLED', condition_groupoid=None, placeoid=None, label=None, metadata=None, camera_id=None, deleted_raw_devices_count=None, protocol_type=None, gateway_id=None, exec_id='4b830acf-ac10-3e01-18e7-2cb6cde8f991', deviceurl=None, device_states=[], old_state=<ExecutionState.IN_PROGRESS: 'IN_PROGRESS'>, new_state=<ExecutionState.FAILED: 'FAILED'>)
You can see these logs if you have enabled debug logging.
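For reference, debug logging for these messages can be enabled with a `logger` entry in configuration.yaml; the logger names below are taken from the log excerpts earlier in this thread:

```yaml
logger:
  default: warning
  logs:
    # Integration and coordinator logs, as seen in the excerpts above
    custom_components.tahoma: debug
    # Retry messages from the backoff library
    backoff: debug
```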
@tetienne perhaps his case will broadcast NOT_TRANSMITTED, and we should only create an HA event when the ExecutionState is FAILED?
I'm not sure if the events are linked. But we can indeed forward more events if needed.
Besides the event that is generated, is there any intention to properly fix this issue? It is really causing a lot of problems for me.
@rvrignaud That's not something we can fix easily, if it is feasible at all. It means we would have to handle a kind of queue, or something similar. In your case, why not use a scenario created on the tahomalink.com side?
Because I'm using Google Assistant voice queries that trigger native device commands (for 13 devices for now, soon many more) and not a specific routine, for example "OKG, close all the shutters." I would prefer, as much as possible, not to deviate from native GHA interactions.
@tetienne I wonder why the backoff is not working. Perhaps we need to retry for a longer time.
I am not a huge fan of building a queue in the Data Update Coordinator...
@iMicknl Probably yes, we can add some delay to check if it helps.
If you need any help testing the new behavior, I can consistently reproduce this one. Happy to help!
Any updates on this issue? Can we get an event (like the errors in overkiz.event) for this error, to allow some workaround with automations?
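A sketch of what such a workaround could look like, assuming the overkiz.event event type referenced above is available; the payload structure is not assumed here, the event data is only dumped as-is:

```yaml
automation:
  - alias: "Notify on Overkiz event"
    trigger:
      # Event type taken from the overkiz.event reference in this thread;
      # the exact name and payload may differ per integration version.
      - platform: event
        event_type: overkiz.event
    action:
      - service: persistent_notification.create
        data:
          title: "Overkiz event"
          message: "{{ trigger.event.data }}"
```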
I'm having the same issue when trying to control 15 blinds simultaneously with the Overkiz integration. Does anybody know what delay between the calls is required for Overkiz to accept the requests (for example, if I split the calls into batches of 10 and 5)?
Same story here, trying to control 14 Somfy blinds at once with Overkiz. All I could do was split them into 7 and 7...
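A minimal sketch of that batching approach as a Home Assistant script, with hypothetical entity names and a guessed delay; the point is just to keep each batch under the gateway's soft limit of 10 executions:

```yaml
script:
  open_all_blinds_in_batches:
    sequence:
      # First batch of 7: stays below the soft limit of 10 executions
      - service: cover.open_cover
        target:
          entity_id:
            - cover.blind_1
            - cover.blind_2
            - cover.blind_3
            - cover.blind_4
            - cover.blind_5
            - cover.blind_6
            - cover.blind_7
      # Wait for the first batch to finish before starting the second
      - delay: "00:00:30"
      - service: cover.open_cover
        target:
          entity_id:
            - cover.blind_8
            - cover.blind_9
            - cover.blind_10
            - cover.blind_11
            - cover.blind_12
            - cover.blind_13
            - cover.blind_14
```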
Hi again. The problem still exists. I know the repo is deprecated, but I don't want to open a new issue in core for the same error. As far as I can see, overkiz.event is not migrated to core, so I can't build any workaround in an automation for the soft limit error. There is a continue_on_error flag for scripts; it would be nice if we could "catch" the soft limit error with it. With that ability, after moving the cover I could check whether it is in the expected position or whether it is moving; if neither is true, there was an error and I should set the cover position again (see the sketch below). However, continue_on_error only works with the exceptions.HomeAssistantError type, see here: https://github.com/home-assistant/core/blob/ba8a530d19dc3e93602275961de46d700611f4a1/homeassistant/helpers/script.py#LL493C24-L493C24
Could you please modify the core version of the integration to raise HomeAssistantError in case of the soft limit error?
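A sketch of the retry workaround described above, with a hypothetical cover entity and an arbitrary delay, and assuming the soft-limit error were surfaced as a HomeAssistantError (which is exactly what is being requested here):

```yaml
script:
  close_cover_with_retry:
    sequence:
      # First attempt; continue_on_error only swallows HomeAssistantError,
      # which is why the integration would need to raise that type.
      - service: cover.close_cover
        target:
          entity_id: cover.living_room
        continue_on_error: true
      # Give the gateway some time to start (or reject) the execution
      - delay: "00:00:20"
      # If the cover is neither closed nor closing, assume the command failed
      - if:
          - condition: template
            value_template: >
              {{ not is_state('cover.living_room', 'closed')
                 and not is_state('cover.living_room', 'closing') }}
        then:
          - service: cover.close_cover
            target:
              entity_id: cover.living_room
```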
@laszlojakab it would be good to raise this on the core repo! Happy to see if we can implement it.
Added to core.
Describe the bug
Originally reported on the Home Assistant Community.
Not all executions will succeed when more than 10 executions are running at the same time.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
All executions run in sequence, with a pause.
Environment (please complete the following information):
Additional context