pyalarmdotcom / alarmdotcom

Custom component to allow Home Assistant to interface with Alarm.com
MIT License

Integration no longer has real time updates #389

Open MicroDraco opened 4 months ago

MicroDraco commented 4 months ago

Describe the bug The integration no longer pushes sensor and arm state updates to Home Assistant after a sudden websocket disconnection. It was working perfectly until the connection to the alarm.com servers was interrupted and then restored after about a minute.

To Reproduce Steps to reproduce the behavior:

  1. Integration works as normal.
  2. Connection is closed suddenly. Home network is fine.
  3. Connection to alarm.com is restored, but the integration has resorted to polling for sensor updates.

Expected behavior The connection would be fully restored and Home Assistant would continue to receive pushed updates instead of relying on polling.

Screenshots none

Home Assistant Version: 2024.3.0

Additional context

Here is a log entry that might be relevant:

2024-03-15 18:38:20.069 ERROR (MainThread) [pyalarmdotcomajax.websockets.client] Unexpected WebSocket error
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/pyalarmdotcomajax/websockets/client.py", line 114, in _connect
    async with self._websession.ws_connect(
  File "/usr/local/lib/python3.12/site-packages/aiohttp/client.py", line 1194, in __aenter__
    self._resp = await self._coro
                 ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/aiohttp/client.py", line 832, in _ws_connect
    resp = await self.request(
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/aiohttp/client.py", line 605, in _request
    await resp.start(conn)
  File "/usr/local/lib/python3.12/site-packages/aiohttp/client_reqrep.py", line 966, in start
    message, payload = await protocol.read()  # type: ignore[union-attr]
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/aiohttp/streams.py", line 622, in read
    await self._waiter
aiohttp.client_exceptions.ClientOSError: [Errno 104] Connection reset by peer

mikesalz commented 3 months ago

I am receiving the same error repeatedly, as well.

image

mikesalz commented 3 months ago

This error is really blowing up my logs. @elahd Do you happen to have any idea what the issue is? I'm happy to help test any potential fixes. Thanks!

rspierenburg commented 3 months ago

Same here, exact same error. 37 occurrences since March 22.

elahd commented 3 months ago

Are you able to test a new version of pyalarmdotcomajax? See #391.

elahd commented 3 months ago

This looks like a combination of an Alarm.com server issue and a weak recovery mechanism in the integration. I've been seeing a lot of the same errors in my browser's web console when logged into the Alarm.com website:

image

v6 of pyalarmdotcomajax should be more robust and will pollute your error logs less during recovery.

MicroDraco commented 3 months ago

@elahd is this a standalone python script or can you incorporate it into home assistant ?

elahd commented 3 months ago

It's both. The pyalarmdotcomajax Python package interfaces with Alarm.com and is used by the alarmdotcom Home Assistant integration. The bundled adc CLI helps with debugging.

The issues with real time updates can't be fixed in the HA integration alone; they need to be fixed in pyalarmdotcomajax.

MicroDraco commented 3 months ago

just installed the beta version. Real time updates are back. I'll report back if something goes wrong.

MicroDraco commented 3 months ago

Same errors after less than an hour...

elahd commented 3 months ago

Does it reconnect? Even the adc website has frequent disconnects.


MicroDraco commented 3 months ago

The websocket doesn’t recover

elahd commented 3 months ago

@MicroDraco Can you try running the stream command with the -d flag to enable debugging mode? This will reveal the actual error that blocks recovery.

adc stream -d -u "USERNAME" -p "PASSWORD"

MicroDraco commented 3 months ago

home-assistant_2024-04-03T22-23-40.187Z.log

It seems as if the websocket recovered ONCE while I was at work, then failed to reconnect after an hour.

MicroDraco commented 3 months ago

home-assistant_alarmdotcom_2024-04-03T23-48-09.422Z.log

Here is another set of debug logs.

elahd commented 3 months ago

@MicroDraco Thanks for that. I updated the branch with some fixes. The client's connection is now as stable as the Alarm.com website (so not very stable), but it's better at reconnecting once the connection is lost.

The reconnect workflow will run when:

  1. The ADC server kicks you off. (Errno 104, "Connection reset by peer", which impacts both the pyadc client and the alarm.com website.)
  2. The main alarm.com session expires. (HTTP error 401)
  3. A timeout, authentication error, or other HTTP 4xx error occurs.

All other errors will kill the client.
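
As a rough illustration of the triage described above, here is a minimal Python sketch. It is not the actual pyalarmdotcomajax code: the `Action` enum, the `classify_failure` function, and its parameters are hypothetical names made up for this example, and it assumes the failure has already been reduced to an OS errno, an HTTP status, or a timeout flag.

```python
import errno
from enum import Enum


class Action(Enum):
    RECONNECT = "reconnect"
    DIE = "die"


def classify_failure(os_errno=None, http_status=None, timed_out=False):
    """Decide whether a failure should start the reconnect workflow.

    Hypothetical helper for illustration only.
    """
    # Server dropped the connection ("Connection reset by peer",
    # which is errno 104 on Linux).
    if os_errno == errno.ECONNRESET:
        return Action.RECONNECT
    # Expired session (401), other HTTP 4xx errors, or a timeout.
    if timed_out or (http_status is not None and 400 <= http_status < 500):
        return Action.RECONNECT
    # Everything else is treated as fatal.
    return Action.DIE
```

With this sketch, `classify_failure(http_status=401)` returns `Action.RECONNECT`, while `classify_failure(http_status=500)` returns `Action.DIE`.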

The workflow is:

  1. Wait a random amount of time before reconnecting, up to a minute.
  2. Request a new auth token for the WebSocket connections. (If the main login session has expired, log back in to alarm.com first.)
  3. Try reconnecting to the WebSocket endpoint.
  4. Wait 5 seconds. If we're still connected to the server, do a full pull refresh to update state.

The client will die if 25 consecutive connection attempts fail or if the server asks for an OTP (it shouldn't).
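
The workflow and the 25-attempt limit can be sketched roughly as follows. This is only an illustration of the steps described in this comment, not the code on the branch: `reconnect` and the four hooks it takes (`get_ws_token`, `connect`, `is_connected`, `full_refresh`) are hypothetical, failures are assumed to surface as `ConnectionError`, and the OTP case is left out.

```python
import asyncio
import random

MAX_ATTEMPTS = 25        # consecutive failed attempts before the client gives up
MAX_DELAY_SECONDS = 60   # random wait of up to a minute before each attempt
SETTLE_SECONDS = 5       # wait before checking the connection and refreshing


async def reconnect(get_ws_token, connect, is_connected, full_refresh,
                    sleep=asyncio.sleep, rng=random.uniform):
    """Return True once reconnected and refreshed, False if the client should die.

    All hooks are hypothetical callables supplied by the caller.
    """
    for _ in range(MAX_ATTEMPTS):
        # 1. Wait a random amount of time, up to a minute.
        await sleep(rng(0, MAX_DELAY_SECONDS))
        try:
            # 2. Get a new WebSocket auth token (the hook is assumed to
            #    log back in first if the main session has expired).
            token = await get_ws_token()
            # 3. Try reconnecting to the WebSocket endpoint.
            await connect(token)
        except ConnectionError:
            continue  # counts as one failed attempt
        # 4. Wait, then do a full refresh if the connection held.
        await sleep(SETTLE_SECONDS)
        if is_connected():
            await full_refresh()
            return True
    return False
```

The `sleep` and `rng` parameters exist only so the delays can be replaced in tests; a caller would normally leave them at their defaults.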

There's no way to track the actual events that were lost while disconnected. That is, if a door was closed while connected, then was opened and closed again while disconnected, that opened event will be lost. Alarm.com doesn't have a good interface for requesting historical events. The "activity" page on their website doesn't present event data in a way that can be (easily) processed programmatically.

To test, run the pip install command you ran before to pull the latest code.

MicroDraco commented 3 months ago

I see no changes on the repo? Last change was two weeks ago.

elahd commented 3 months ago

It's the refactor-2024 branch at https://github.com/pyalarmdotcom/pyalarmdotcomajax/tree/refactor-2024

MicroDraco commented 3 months ago

I want to incorporate this into the home assistant integration but I get an error "Setup failed for custom integration 'alarmdotcom': Unable to import component: cannot import name 'API_URL_BASE' from 'pyalarmdotcomajax.const' (/usr/local/lib/python3.12/site-packages/pyalarmdotcomajax/const.py)".

I'm not sure what files to replace.

elahd commented 3 months ago

This won’t work with the existing HA integration — the new branch isn’t backward compatible. I’m working on updating the integration. Right now, I’m just looking for help testing the underlying module.


lbreggi commented 2 months ago

Hi @elahd, I can help with testing. Would it be possible to give me the instructions to migrate and tell me what you need tested? Thanks!