zigpy / zigpy

Library implementing a ZigBee stack
GNU General Public License v3.0

OTA reliability improvements #1346

Closed puddly closed 4 months ago

puddly commented 4 months ago

Changes required to get OTA working reliably on a few more real devices:

  1. Increase the "max time without progress" timeout from 10s to 30s: Hue bulbs can stall for 20s during OTA.
  2. Do not send OTA progress if the upgrade has failed: if a battery-powered sensor rejects the image before we get an ACK for the last block, OTA can fail and then emit progress, confusing ZHA.
  3. Read the current file version after OTA concludes.

Item 3 is a little hacky, but it works. In the future, we should read the software version and other basic information during a join/rejoin and fully re-initialize the device if this information changes. This would require some method to "clear" a device from the database without fully removing it.
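For readers following along, items 1 and 2 can be illustrated with a minimal sketch. This is not zigpy's actual implementation: `OtaProgressTracker`, `block_sent`, `wait_for_progress`, and `MAX_TIME_WITHOUT_PROGRESS` are hypothetical names chosen for illustration, and only the 30s figure comes from the description above.

```python
import asyncio
from typing import Callable

# Hypothetical constant mirroring the 30s "max time without progress" value
# mentioned in the PR description (raised from 10s).
MAX_TIME_WITHOUT_PROGRESS = 30.0


class OtaProgressTracker:
    """Illustrative helper only; not part of zigpy's API."""

    def __init__(
        self,
        on_progress: Callable[[int, int], None],
        stall_timeout: float = MAX_TIME_WITHOUT_PROGRESS,
    ) -> None:
        self._on_progress = on_progress
        self._stall_timeout = stall_timeout
        self._progress_event = asyncio.Event()
        self.failed = False

    def block_sent(self, offset: int, total: int) -> None:
        """Record that a block was delivered and report progress."""
        if self.failed:
            # Item 2: once the upgrade has failed, stay silent so that
            # listeners never see progress after a failure.
            return
        self._progress_event.set()
        self._on_progress(offset, total)

    def fail(self) -> None:
        """Mark the upgrade as failed (e.g. the device rejected the image)."""
        self.failed = True

    async def wait_for_progress(self) -> bool:
        """Item 1: wait for the next block, tolerating stalls up to the timeout.

        Returns True if progress arrived in time; otherwise marks the
        upgrade as failed and returns False.
        """
        try:
            await asyncio.wait_for(
                self._progress_event.wait(), self._stall_timeout
            )
        except asyncio.TimeoutError:
            self.failed = True
            return False
        self._progress_event.clear()
        return True
```

In this sketch, an upgrade loop would call `wait_for_progress()` repeatedly and abort when it returns `False`. A device that stalls for 20s would survive the 30s window but not a 10s one.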

codecov[bot] commented 4 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 99.49%. Comparing base (1853ce1) to head (88bd265).

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##              dev    #1346   +/-   ##
=======================================
  Coverage   99.49%   99.49%
=======================================
  Files          55       55
  Lines       10307    10311    +4
=======================================
+ Hits        10255    10259    +4
  Misses         52       52
```


thk-socal commented 4 months ago

I'm running this version via HA 2024.3.0b1 and still having timeout issues when updating Hue bulbs.

```
2024-02-29 00:59:22.273 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [139694811363008] Update was not successful:
 <Status.FAILURE: 1>
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/websocket_api/commands.py", line 239, in handle_call_service
    response = await hass.services.async_call(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2304, in async_call
    response_data = await coro
                    ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/core.py", line 2341, in _execute_service
    return await target(service_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 905, in entity_service_call
    single_response = await _handle_entity_call(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/service.py", line 975, in _handle_entity_call
    result = await task
             ^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/update/__init__.py", line 161, in async_install
    await entity.async_install_with_progress(version, backup)
  File "/usr/src/homeassistant/homeassistant/components/update/__init__.py", line 465, in async_install_with_progress
    await self.async_install(version, backup)
  File "/usr/src/homeassistant/homeassistant/components/zha/update.py", line 198, in async_install
    raise HomeAssistantError(f"Update was not successful: {result}")
homeassistant.exceptions.HomeAssistantError: Update was not successful: <Status.FAILURE: 1>
```

The firmware install will recover and continue where it left off. The timeout might need to be extended even further for these bulbs. This one in particular is a BR30 E26 color bulb. I have some others I can try as well.