home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
73.23k stars 30.58k forks source link

When Amcrest binary sensor goes offline sometimes it never reconnects #28539

Closed dshokouhi closed 4 years ago

dshokouhi commented 4 years ago

Home Assistant release with the issue:

0.101.2

Last working Home Assistant release (if known):

0.100.2 Operating environment (Hass.io/Docker/Windows/etc.):

venv

Integration:

amcrest https://www.home-assistant.io/integrations/amcrest/

Description of problem:

When the camera becomes unavailable due to some connection error sometimes it never reconnects and I have to restart HA for it to reconnect.

Problem-relevant configuration.yaml entries and (fill out even if it seems unimportant):

amcrest:
 - host: 192.168.1.42
   port: 1025
   username: !secret amcrest_user
   password: !secret amcrest_pass
   name: "Den Camera"
   resolution: high
   stream_source: rtsp
   binary_sensors:
   - motion_detected

Traceback (if applicable):

2019-11-04 13:21:07 ERROR (SyncWorker_4) [homeassistant.components.amcrest] Den Camera camera offline: Too many errors
2019-11-04 13:21:07 ERROR (SyncWorker_4) [homeassistant.components.amcrest.binary_sensor] Could not update Den Camera Motion Detected binary sensor due to error: CommError
2019-11-04 13:22:11 WARNING (SyncWorker_0) [amcrest.http] <AMC000Y1_4ANYP5:SERIAL> Trying again due to error: ReadTimeout(ReadTimeoutError("HTTPConnectionPool(host='192.168.1.42', port=1025): Read timed out. (read timeout=3.05)",),)

Additional information: After the traceback no attempt to reconnect is made. Possibly caused by #26630. I never had this issue until I upgraded to 0.101.2 Previously the camera would go offline and less than a minute later it will come back. Now it has been unavailable for a long time.

image

dshokouhi commented 4 years ago

Surprisingly I am still able to view the stream from the camera in HA even though the camera is also unavailable the device still does not go back to being found.

hawkeye217 commented 4 years ago

Just experienced the same issue on 0.101.2. A reboot of the affected camera doesn't bring the devices/sensors back online, but like @dshokouhi, I can continue to stream from the camera.

Mark612 commented 4 years ago

Confirmed, also using 0.101.2 Binary_sensors motion_detected will become 'unavailable', and once unavailable will never recover. I did not have this issue prior to 0.101

Mark612 commented 4 years ago

Any update? Possibly caused by #26630.

dshokouhi commented 4 years ago

@antlarr would you be able to take a look at this?

Mark612 commented 4 years ago

Still an issue.

hawkeye217 commented 4 years ago

Still an issue for me on 0.103.5. @pnbruckner, any ideas?

pnbruckner commented 4 years ago

@dshokouhi @hawkeye217

I didn't make the change, so I can't really say what effect it might have. FWIW I'm now using 0.103.5 and have not experienced any (new) issues. But then again, I only recently upgraded from 0.95.4.

I would say @antlarr needs to comment, or maybe @pvizeli who approved the change. Maybe the change should be reverted???

I did quite a bit of work to improve the error handling of this integration, but I'll admit it's still not where it needs to be. E.g., at least for me, it still sometimes fails to take a snapshot when requested. Also maybe it should be converted from an optimistic mode to periodic updates (to account for changes made to the cameras external to HA.)

I hope to do some more work on it and its PYPI package but it probably won't be anytime too soon.

Mark612 commented 4 years ago

Still a very impacting issue and I agree, back out #26630.

Looking at the various threads, it appears #26630 did not impact very many users while the derivatives are wide spread. I would highly recommend backing out #26630, retest to make sure this issue is fixed, then look for alternative approaches that #26630 was trying to resolve.

hawkeye217 commented 4 years ago

Thanks @pnbruckner, and agreed, @Mark612.

I have access to two different HA installations, both with an Amcrest IP2M-841 set up using the amcrest component. They both are using the stream component and both have the motion_detected binary sensor enabled. I often notice Amcrest errors in the logs.

From the first installation:

2020-01-05 17:17:52 WARNING (SyncWorker_10) [amcrest.http] <MyCam:AMC000ABCD123DEFGH> Trying again due to error: ConnectTimeout(MaxRetryError("HTTPConnectionPool(host='192.168.1.120', port=80): Max retries exceeded with url: /cgi-bin/eventManager.cgi?action=getEventIndexes&code=VideoMotion (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at ADDRESS>, 'Connection to 192.168.1.120 timed out. (connect timeout=3.05)'))"))

From the second installation:

2020-01-05 05:15:16 WARNING (SyncWorker_11) [amcrest.http] <AmcrestCam:AMC000ABCD123DEFGH> Trying again due to error: ConnectTimeout(MaxRetryError("HTTPConnectionPool(host='10.0.0.175', port=80): Max retries exceeded with url: /cgi-bin/eventManager.cgi?action=getEventIndexes&code=VideoMotion (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at ADDRESS>, 'Connection to 10.0.0.175 timed out. (connect timeout=3.05)'))"))
2020-01-05 05:15:19 WARNING (MainThread) [homeassistant.components.binary_sensor] Updating amcrest binary_sensor took longer than the scheduled update interval 0:00:05
2020-01-05 05:15:19 ERROR (SyncWorker_11) [homeassistant.components.amcrest.binary_sensor] Could not update Camera 2 Motion Detected binary sensor due to error: CommError
2020-01-05 05:15:28 WARNING (SyncWorker_7) [amcrest.http] <AmcrestCam:AMC000ABCD123DEFGH> Trying again due to error: ConnectTimeout(MaxRetryError("HTTPConnectionPool(host='10.0.0.175', port=80): Max retries exceeded with url: /cgi-bin/eventManager.cgi?action=getEventIndexes&code=VideoMotion (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at ADDRESS>, 'Connection to 10.0.0.175 timed out. (connect timeout=3.05)'))"))
2020-01-05 05:15:31 WARNING (MainThread) [homeassistant.components.binary_sensor] Updating amcrest binary_sensor took longer than the scheduled update interval 0:00:05
2020-01-05 05:15:31 ERROR (SyncWorker_7) [homeassistant.components.amcrest.binary_sensor] Could not update Camera 2 Motion Detected binary sensor due to error: CommError
2020-01-05 05:15:40 WARNING (SyncWorker_17) [amcrest.http] <AmcrestCam:AMC000ABCD123DEFGH> Trying again due to error: ConnectTimeout(MaxRetryError("HTTPConnectionPool(host='10.0.0.175', port=80): Max retries exceeded with url: /cgi-bin/eventManager.cgi?action=getEventIndexes&code=VideoMotion (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at ADDRESS>, 'Connection to 10.0.0.175 timed out. (connect timeout=3.05)'))"))
2020-01-05 05:15:43 WARNING (MainThread) [homeassistant.components.binary_sensor] Updating amcrest binary_sensor took longer than the scheduled update interval 0:00:05
2020-01-05 05:15:43 ERROR (SyncWorker_17) [homeassistant.components.amcrest.binary_sensor] Could not update Camera 2 Motion Detected binary sensor due to error: CommError
2020-01-05 05:15:52 WARNING (SyncWorker_12) [amcrest.http] <AmcrestCam:AMC000ABCD123DEFGH> Trying again due to error: ConnectTimeout(MaxRetryError("HTTPConnectionPool(host='10.0.0.175', port=80): Max retries exceeded with url: /cgi-bin/eventManager.cgi?action=getEventIndexes&code=VideoMotion (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at ADDRESS>, 'Connection to 10.0.0.175 timed out. (connect timeout=3.05)'))"))
2020-01-05 05:15:52 ERROR (SyncWorker_12) [homeassistant.components.amcrest.binary_sensor] Could not update Camera 2 Motion Detected binary sensor due to error: CommError
2020-01-05 05:15:58 WARNING (SyncWorker_13) [amcrest.http] <AmcrestCam:AMC000ABCD123DEFGH> Trying again due to error: ConnectTimeout(MaxRetryError("HTTPConnectionPool(host='10.0.0.175', port=80): Max retries exceeded with url: /cgi-bin/eventManager.cgi?action=getEventIndexes&code=VideoMotion (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at ADDRESS>, 'Connection to 10.0.0.175 timed out. (connect timeout=3.05)'))"))
2020-01-05 05:15:58 ERROR (SyncWorker_13) [homeassistant.components.amcrest.binary_sensor] Could not update Camera 2 Motion Detected binary sensor due to error: CommError
2020-01-05 05:16:04 WARNING (SyncWorker_15) [amcrest.http] <AmcrestCam:AMC000ABCD123DEFGH> Trying again due to error: ConnectTimeout(MaxRetryError("HTTPConnectionPool(host='10.0.0.175', port=80): Max retries exceeded with url: /cgi-bin/eventManager.cgi?action=getEventIndexes&code=VideoMotion (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at ADDRESS>, 'Connection to 10.0.0.175 timed out. (connect timeout=3.05)'))"))
2020-01-05 05:16:07 WARNING (MainThread) [homeassistant.components.binary_sensor] Updating amcrest binary_sensor took longer than the scheduled update interval 0:00:05
2020-01-05 05:16:07 ERROR (SyncWorker_15) [homeassistant.components.amcrest] Camera 2 camera offline: Too many errors
2020-01-05 05:16:07 ERROR (SyncWorker_15) [homeassistant.components.amcrest.binary_sensor] Could not update Camera 2 Motion Detected binary sensor due to error: CommError

It seems like the camera just doesn't respond after a while - whether it's a network issue or a camera firmware issue, I'm not sure. Nonetheless, before @antlarr 's change, it always recovered. Afterwards, it never does.

antlarr commented 4 years ago

I'm sorry for the late reply, I just got a moment today to check all this and if you can wait a few days, I will try to find some time during this week to try to reproduce and fix this. Otherwise, I can understand that you prefer to revert the change in the meantime.

sergey-koba-mobidev commented 4 years ago

I can confirm this also happens to me.

sdrapha commented 4 years ago

I've got to the point where I created an automation to restart hass when the connection get lost. I have an IP3M-941 that is kind of faulty, It reboots randomly up to 15 times a day for no reason, and most of the time, HASS won't connect back again unless I restart it. My automation is based on a template analyzing the amcrest binary sensor against a ping sensor, so I know when the camera is back online and the problem is the hass integration.

antlarr commented 4 years ago

I have updated my home-assistant installation to 0.103.6 since it was too old and also updated some of the required dependencies as well as the amcrest python module to 1.5.4 (which btw, includes a similar fix that I submitted there in https://github.com/tchellomello/python-amcrest/pull/124). Also, note that home-assistant 0.103.6 includes the change in #26630.

With that setup, I can switch off my cameras (an IP3M-941 and a IP4M-1051 with the latest firmware versions) and then switch them on again and home-assistant reconnects to them normally (they take around 90 seconds to boot up, during which they can't be accessed even from the web interface, but once they boot, home-assistant connects to them without any problem). If I switch off the cameras again, binary_sensor.<camera_name>_online and binary_sensor.<camera_name>_motion_detected become off and unavailable respectively after a minute or so.

Then I tried to test reverting my change in #26630 by manually commenting the lines :

                with self._token_lock:
                    self._token = None

in components/amcrest/init.py, I restarted home-assistant and tested everything again. Everything works as expected too. I can switch cameras off and on again and home-assistant reconnects correctly since python-amcrest 1.5.4 includes the fix mentioned above.

Just in case it helps, I'll explain how I switch my cameras on and off. I have each camera connected through a smart power plug, then in home-assistant I have one input_boolean and one timer (with duration 00:01:30) for each camera and automations like these:

- alias: 'Turn camera on: door'
  trigger:
    - platform: state
      entity_id: input_boolean.camera_door
      from: 'off'
      to: 'on'
  action:
    - service: switch.turn_on
      entity_id: switch.meross_door
    - service: timer.start
      entity_id: timer.camera_door
    - delay: '00:01:15'
    - service: amcrest.goto_preset
      entity_id: camera.door
      data:
          preset: 1
    - delay: '00:00:15'
    - service: camera.enable_motion_detection
      entity_id: camera.door
    - service: amcrest.enable_motion_recording
      entity_id: camera.door
    - service: amcrest.goto_preset
      entity_id: camera.door
      data:
          preset: 1

- alias: 'Turn camera off: door'
  trigger:
    - platform: state
      entity_id: input_boolean.camera_door
      from: 'on'
      to: 'off'
  action:
    - service: camera.disable_motion_detection
      entity_id: camera.door
    - service: amcrest.disable_motion_recording
      entity_id: camera.door
    - service: timer.finish
      entity_id: timer.camera_door
    - service: amcrest.goto_preset
      entity_id: camera.door
      data:
          preset: 2
    - delay: '00:00:10'
    - service: switch.turn_off
      entity_id: switch.meross_door

In each camera I have defined preset 1 as pointing forward and preset 2 as pointing backwards (to the wall). This way, when I switch on a camera, home-assistant switches the power plug on, waits until the camera is ready and makes it point forward. And when I switch the camera off, home-assistant makes it point backwards, waits for the movement to finish and switches the power plug off.

So, in conclusion, I guess it's ok if you want to revert the change in #26630 since it doesn't seem to make a difference with amcrest 1.5.4 but I'd like to ask everyone reporting to be affected by this issue... What version of python-amcrest are you using? If you're not using 1.5.4, can you upgrade to that version, comment out those lines above in components/amcrest/__init__.py and write here if that works fine for you? If you still have issues, can you provide specific instructions on how to reproduce the problem?

dshokouhi commented 4 years ago

@antlarr everyone here is on the latest version of HA, that is the first troubleshooting step we all take. I am on 0.103.6 and I still have this issue randomly. There are no specific steps to reproduce, we are just keeping HA and up and running and randomly it will disconnect on its own without warning.

antlarr commented 4 years ago

@antlarr everyone here is on the latest version of HA, that is the first troubleshooting step we all take.

Of course, that's why I wanted to upgrade my installation before testing anything, because I was still running 0.98.5 as I didn't have time to update it in the last months. Note that I only mentioned it to explain what I did, but I didn't ask anyone to update anything. I assumed you already updated your systems of course. I just asked the version of the amcrest python module (the one from https://pypi.org/project/amcrest/ ) you're using, to check if you're using 1.5.4 (the newest one) or 1.5.3 (the one currently required by components/amcrest/manifest.json).

I am on 0.103.6 and I still have this issue randomly. There are no specific steps to reproduce, we are just keeping HA and up and running and randomly it will disconnect on its own without warning.

In the description of this issue you wrote When the camera becomes unavailable due to some connection error sometimes it never reconnects and I have to restart HA for it to reconnect, so you can understand where the confusion comes from. I see now that the problem is instead that the camera becomes unavailable and then it doesn't automatically reconnect. That's different and not what I was testing for, indeed.

Btw, when I submitted the patch I had the cameras running for many days without problems (with home-assistant 0.98.5).

I'll test with 0.103.6 leaving the cameras active and checking often if the online and motion_detected binary_sensors change to off/unavailable.

Btw, I see that in your configuration file you have:

      binary_sensors:
        - motion_detected

Could you replace that with:

      binary_sensors:
        - motion_detected
        - online

to check also the state of the online binary_sensor?

(I guess most probably binary_sensor.camera_name_online will be set to off but if it doesn't do that, it would be a clue to know what's happening)

antlarr commented 4 years ago

Btw, do you have any automation or any action that runs periodically any service on your camera? Do you keep a web browser open with HA showing the stream (or snapshots) from the camera? Does this also happen just with HA running but not being actively used?

dshokouhi commented 4 years ago

I am not accessing the stream and I do not have services that run actions on the camera. I am just looking at the motion events and using automations based on the state. This happens very randomly I have no idea what usage HA has at the time. Sometimes I wake up and it happens and sometimes it happens during the day. Funny enough this just happened to me an hour ago and looking at my network logs it appears that the camera switched access points during this time and the following errors occurred.

2020-01-11 16:47:24 WARNING (SyncWorker_4) [amcrest.http] <SERIAL> Trying again due to error: ConnectionError(MaxRetryError("HTTPConnectionPool(host='192.168.1.97', port=1028): Max retries exceeded with url: /cgi-bin/eventManager.cgi?action=getEventIndexes&code=VideoMotion (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at ADDRESS>: Failed to establish a new connection: [Errno 111] Connection refused'))"))
2020-01-11 16:47:24 ERROR (SyncWorker_4) [homeassistant.components.amcrest.binary_sensor] Could not update Baby Monitor Motion Detected binary sensor due to error: CommError
2020-01-11 16:47:30 WARNING (SyncWorker_12) [amcrest.http] <SERIAL> Trying again due to error: ConnectionError(MaxRetryError("HTTPConnectionPool(host='192.168.1.97', port=1028): Max retries exceeded with url: /cgi-bin/eventManager.cgi?action=getEventIndexes&code=VideoMotion (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at ADDRESS>: Failed to establish a new connection: [Errno 111] Connection refused'))"))
2020-01-11 16:47:30 ERROR (SyncWorker_12) [homeassistant.components.amcrest.binary_sensor] Could not update Baby Monitor Motion Detected binary sensor due to error: CommError
2020-01-11 16:47:36 WARNING (SyncWorker_7) [amcrest.http] <SERIAL> Trying again due to error: ConnectionError(MaxRetryError("HTTPConnectionPool(host='192.168.1.97', port=1028): Max retries exceeded with url: /cgi-bin/eventManager.cgi?action=getEventIndexes&code=VideoMotion (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at ADDRESS>: Failed to establish a new connection: [Errno 111] Connection refused'))"))
2020-01-11 16:47:36 ERROR (SyncWorker_7) [homeassistant.components.amcrest.binary_sensor] Could not update Baby Monitor Motion Detected binary sensor due to error: CommError
2020-01-11 16:47:42 WARNING (SyncWorker_12) [amcrest.http] <SERIAL> Trying again due to error: ConnectionError(MaxRetryError("HTTPConnectionPool(host='192.168.1.97', port=1028): Max retries exceeded with url: /cgi-bin/eventManager.cgi?action=getEventIndexes&code=VideoMotion (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at ADDRESS>: Failed to establish a new connection: [Errno 111] Connection refused'))"))
2020-01-11 16:47:42 ERROR (SyncWorker_12) [homeassistant.components.amcrest.binary_sensor] Could not update Baby Monitor Motion Detected binary sensor due to error: CommError
2020-01-11 16:47:48 WARNING (SyncWorker_7) [amcrest.http] <SERIAL> Trying again due to error: ConnectionError(MaxRetryError("HTTPConnectionPool(host='192.168.1.97', port=1028): Max retries exceeded with url: /cgi-bin/eventManager.cgi?action=getEventIndexes&code=VideoMotion (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at ADDRESS>: Failed to establish a new connection: [Errno 111] Connection refused'))"))
2020-01-11 16:47:48 ERROR (SyncWorker_7) [homeassistant.components.amcrest.binary_sensor] Could not update Baby Monitor Motion Detected binary sensor due to error: CommError
2020-01-11 16:47:54 WARNING (SyncWorker_9) [amcrest.http] <SERIAL> Trying again due to error: ConnectionError(MaxRetryError("HTTPConnectionPool(host='192.168.1.97', port=1028): Max retries exceeded with url: /cgi-bin/eventManager.cgi?action=getEventIndexes&code=VideoMotion (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at ADDRESS>: Failed to establish a new connection: [Errno 111] Connection refused'))"))
2020-01-11 16:47:54 ERROR (SyncWorker_9) [homeassistant.components.amcrest] Baby Monitor camera offline: Too many errors
2020-01-11 16:47:54 ERROR (SyncWorker_9) [homeassistant.components.amcrest.binary_sensor] Could not update Baby Monitor Motion Detected binary sensor due to error: CommError
2020-01-11 16:48:58 WARNING (SyncWorker_9) [amcrest.http] <SERIAL> Trying again due to error: ConnectTimeout(MaxRetryError("HTTPConnectionPool(host='192.168.1.97', port=1028): Max retries exceeded with url: /cgi-bin/magicBox.cgi?action=getMachineName (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at ADDRESS>, 'Connection to 192.168.1.97 timed out. (connect timeout=3.05)'))"))

What is very odd here is that when the camera reconnects to the other AP I am still able to access the stream, its just that every amcrest entity gets marked as unavailable during this time (including the camera). I have not tested with the online sensor personally as I do not have a use for it. I will try enabling it now to see what the state is. I hope that knowing now that the camera switched access points will be helpful.

My amcrest version is set to 1.5.3 so I will try changing it to 1.5.4 and commenting out those lines for testing too. Just keep in mind I can have HA up and running for days as well and this issue will not crop up, but I can have it up and running for 1 day or less and it comes up. I think the switching of AP's is causing this issue. Prior to your PR, this issue never existed which is why I am suggesting to revert it. Previously before your PR I can have a camera powered off for days and turn it on and HA will reconnect and entities work as expected, so I am not sure what case your PR fixes honestly because HA always reconnected. I will report back the next time this happens as the testing steps are in place now.

pnbruckner commented 4 years ago

Just an update from my perspective.

First I'll say that I haven't analyzed the change, either made in the HA integration, or in the 1.5.4 release of the amcrest package. But I'm concerned they weren't thought through completely. I know I also tested having a camera disconnected while HA was running, having the HA entity go to unavailable, and then back to normal when the camera was reconnected. (And I mean its network connectivity.) That was one of the main reasons behind all the error handling improvements I made.

It is possible I didn't actually test turning the camera off and back on, so I might have missed something there that the change was trying to correct. Now that I think about it more, I do have one camera that I sometimes power off, move and turn back on. I usually call camera.turn_off before and camera.turn_on after. That has always worked just fine. I'll have to I should probably spend more time looking into that.

In my experience the only problem I've had is sometimes a call to the camera.snapshot service doesn't work. This is always temporary. Future snapshot requests work without restarting anything.

So I've been doing some testing today. I enabled all the debug messages and added additional data to them so that I could better understand what might be happening. Then I set up a test system that did nothing but interact with a camera. It had all the binary and regular sensors enabled so that it was constantly polling the camera. I also created an automation that called the camera.snapshot service every 10 seconds.

I noticed several errors with the snapshot, but not with any of the other queries. They seemed to always be due to timeouts. Once I noticed it recovered on the retry. But all other times the retry failed with the connection being closed by the camera. Again, all the sensors still worked and the camera never went to unavailable.

So this made me wonder if the package's _command() method should not create and use a session for the initial query and a retry. So I commented out the code that created a session and changed the get call to simply requests.get. To my surprise, and I've been testing for hours, I got no more errors. I expected that when a timeout occurred that the retry would have a better chance. But apparently it seems to have fixed the cause of the initial error as well.

I'm obviously going to do more testing, including turning the camera off and back on. Hopefully I can submit some changes to the amcrest package to fix the lingering snapshot issue I've been experiencing, and maybe address the issue discussed herein as well.

pnbruckner commented 4 years ago

I analyzed the change in PR 26630, as well as the associated change in 1.5.4 of the amcrest package. I also tested with the latest dev checkout (which uses 1.5.3) as-is. I can verify that powering off a camera, then powering it back on, leads to hung communications. It looks like it tries to use Basic Authentication during the recovery process, and then never tries Digest. But it's left with the Basic Auth object, so it never tries to re-generate again.

I then reverted the commit for 26630 and tested again. This time, when the camera was turned back on, the error recovery worked and everything started working correctly again.

So I can say that the change in PR 26630 isn't correct. It may be necessary to throw out the "token" (which is really an Auth object) and re-generate it when the camera power cycles, at least on some camera models and/or firmware versions, but the implementation isn't correct.

Also regarding the change in 1.5.4, I'll agree that the 1.5.3 code that attempts to generate the "token" didn't handle comm errors correctly. That was an oversight. However, the change in 1.5.4 doesn't either. It's only a partial solution.

BTW, the changes in PR 26630 and 1.5.4 are not equivalent. They do different things.

I will continue to fix & test and hopefully be able to make a new release to the amcrest package soon. Then I will follow up with a PR to the HA amcrest integration. In the meantime I will submit a PR to revert 26630. Again, there may be something that needs to be done to properly support newer camera models and/or firmware versions, but 26630 breaks previously working scenarios.

pnbruckner commented 4 years ago

@antlarr could you test with v1.5.5 of the amcrest package (which I just released today) with 26630 reverted, and let me know how it goes? You can PM me on the HA forum or discord if you like. My username is the same.

I've gotten some indications that some of the newer Amcrest models and/or newer firmware has caused issues. Unfortunately I don't have any to test with. I have three different models, which isn't a lot for testing purposes.

antlarr commented 4 years ago

@pnbruckner Thanks for looking into it. I've been running HA 0.103.6 with amcrest v1.5.4 and PR 26630 applied for the last ~40 hours (looking at the stream from HA each now and then and also generating FTP transfers from the camera due to movements detection) and so far I couldn't reproduce any problem, so I'm glad you could have a look.

I'll test amcrest v1.5.5 (with PR 26630 reverted in HA) during this week (with a little luck, this afternoon or tomorrow afternoon) and let you know the results.

chomupashchuk commented 4 years ago

Hi,

There are few issues with component if camera goes offline:

  1. if camera goes offline for more than 3 seconds it might never recover (possible to overcome with custom component)
  2. if you increase number of retries (with custom components) the sensors for availability take much more time to report no availability without additional updates.

In our region we have unstable electricity provision, so going offline is rather likely.

Normally i would expect component to work in a next way: continuously retry connection establishment. If we have unsuccessful replies for more than monitored period for sensor then report offline, but still continue establishments. Period to establish connection can increase in time till some threshold (10s?) to reduce amount of requests.

pnbruckner commented 4 years ago

@chomupashchuk that is how it used to work before PR 26630 broke it starting with the 0.101.0 release. A fix to revert that change has been submitted, accepted and merged. It was targeted for the 0.104.0 release, but it might not make it until 0.105.0. (It may have missed the 0.104 window.)

I've also made improvements to the amcrest package the amcrest integration uses, and I will submit another PR that bumps the version number to get those changes. I'm also improving the online binary sensor.

chomupashchuk commented 4 years ago

@pnbruckner thanks for reply, good to know that someone works on this component. I'm new to Python and HA, but can mention that in existing component if to increase number of retries for http requests and have camera offline during HA restart the web interface of HA does not start until camera is online or number of retries expires. P.S.: amcrest API works also for Dahua cameras (I am using one and it was mentioned on forums), maybe it can be mentioned in documentation for the component

chomupashchuk commented 4 years ago

Hi,

I have made modification in token handling when camera goes offline in custom_component fully based on amcrest, and now it reconnects well after normal link break (and power off/on) and also during HA restart if camera is offline. At least it works for my Dahua camera:

    def command(self, cmd, retries=DEFAULT_RETRIES, timeout_cmd=DEFAULT_TIMEOUT, stream=False):
        """amcrest.Http.command wrapper to catch errors."""
        try:        
            ret = super().command(cmd, retries, timeout_cmd, stream)
        except AmcrestError:
            with self._wrap_lock:
                was_online = self.available
                self._wrap_errors += 1
                _LOGGER.debug("%s camera errs: %i", self._wrap_name, self._wrap_errors)
                offline = not self.available
                _LOGGER.debug(" token: %S", super().as_dict())
            if offline and was_online:
                _LOGGER.error("%s camera offline: Too many errors", self._wrap_name)
                **with self._token_lock:
                    self._token = requests.auth.HTTPDigestAuth(self._user, self._password)**
                dispatcher_send(
                    self._hass, service_signal(SERVICE_UPDATE, self._wrap_name)
                )
                self._unsub_recheck = track_time_interval(
                    self._hass, self._wrap_test_online, RECHECK_INTERVAL
                )
            raise
pnbruckner commented 4 years ago

@chomupashchuk that's not really the best way to do it. Rest assured I'm right in the middle of making some significant improvements, including the issue of delaying HA startup if the camera is offline. I've made most of the changes and now I'm doing a lot of testing. I should be checking in a new version of the amcrest package soon, followed by a PR for the HA amcrest integration.

antlarr commented 4 years ago

I installed 0.104.2 (which has PR 26630 already reverted) and amcrest 1.5.6 and did some tests.

I could switch the camera on and off and home-assistant reconnected correctly. I never was able to reproduce the problem of the camera suddenly disconnecting randomly but I will keep it turned on for a couple of days to test if it disconnects, just in case.

In any case, this is good news, and so far, everything works perfectly here. Thanks, @pnbruckner .

Mark612 commented 4 years ago

Thank you @pnbruckner. Issue is resolved! Working great again.

loca5790 commented 2 years ago

I am having this same issue. Fresh HA install, I have checked firmware and connections. Static IP set camera is accessible via address but unavailable via HA. Will randomly come back online. Not sure how to check Python version of Amcrest in core. Sorry total beginner.