home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

Airthings-ble still unable to read after 2024.5.1 update #116770

Closed JesusSanchezLopez closed 1 week ago

JesusSanchezLopez commented 2 weeks ago

The problem

The 2024.5.1 update implemented retries for pulling data from the airthings-ble sensor. This has been a major improvement for the connection; however, I am still seeing an occasional disconnect.

From log:
2024-05-03 22:56:22.462 ERROR (MainThread) [homeassistant.components.airthings_ble] Error fetching airthings_ble data: Unable to fetch data: Disconnected from FC:A8:9B:F2:CA:CD
2024-05-03 23:43:08.505 ERROR (MainThread) [homeassistant.components.airthings_ble] Error fetching airthings_ble data: Unable to fetch data: Disconnected from FC:A8:9B:F2:CA:CD at

I have just enabled debug logging for bluetooth and airthings-ble; hopefully I can catch additional details.

What version of Home Assistant Core has the issue?

2024.5.1

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

airthings-ble

Link to integration documentation on our website

https://www.home-assistant.io/integrations/airthings_ble/

Diagnostics information

No response

Example YAML snippet

No response

Anything in the logs that might be useful for us?

No response

Additional information

No response

home-assistant[bot] commented 2 weeks ago

Hey there @vincegio, @lastrada, mind taking a look at this issue as it has been labeled with an integration (airthings_ble) you are listed as a code owner for? Thanks!

Code owner commands

Code owners of `airthings_ble` can trigger bot actions by commenting:

- `@home-assistant close` Closes the issue.
- `@home-assistant rename Awesome new title` Renames the issue.
- `@home-assistant reopen` Reopen the issue.
- `@home-assistant unassign airthings_ble` Removes the current integration label and assignees on the issue, add the integration domain after the command.
- `@home-assistant add-label needs-more-information` Add a label (needs-more-information, problem in dependency, problem in custom component) to the issue.
- `@home-assistant remove-label needs-more-information` Remove a label (needs-more-information, problem in dependency, problem in custom component) on the issue.

(message by CodeOwnersMention)


airthings_ble documentation airthings_ble source (message by IssueLinks)

JesusSanchezLopez commented 2 weeks ago

@jaydeethree: Are you seeing any issues after updating to 2024.5.1?

jaydeethree commented 1 week ago

I haven't updated to 2024.5.1 yet. I should have time to update and test this in the next few days.

bdraco commented 1 week ago
def1149 commented 1 week ago

After the update, I haven't had any instances of values becoming "unknown" or breaks in the data on the data graph in the last ~22 hours. Monitoring with DEBUG enabled continues

HA Yellow with CM4 w/ on-board BT
Wave+ 5 feet away through one US type sheet-rock wall

Note: a typical connection occurs every five minutes and takes 3.5 to 6.5 seconds, 95% about 4.5 seconds

I've seen one instance where retries recovered initial connection failure:
2024-05-03 14:30:27.730 WARNING (MainThread) [homeassistant.components.airthings_ble] Timeout getting command data.
2024-05-03 14:30:27.731 WARNING (MainThread) [homeassistant.components.airthings_ble] Wrong length data received (0) versus expected (28)
2024-05-03 14:30:30.482 DEBUG (MainThread) [homeassistant.components.airthings_ble] Disconnected from F4:60:77:XX:XX:XX
2024-05-03 14:30:30.483 DEBUG (MainThread) [homeassistant.components.airthings_ble] Finished fetching airthings_ble data in 9.563 seconds (success: True)

I've seen five instances where retries recovered from unexpected disconnect:
2024-05-03 15:46:46.272 DEBUG (MainThread) [homeassistant.components.airthings_ble] Disconnected from F4:60:77::XX:XX:XX
2024-05-03 15:46:54.486 DEBUG (MainThread) [homeassistant.components.airthings_ble] Disconnected from F4:60:77::XX:XX:XX
2024-05-03 15:46:54.488 DEBUG (MainThread) [homeassistant.components.airthings_ble] Unexpectedly disconnected from F4:60:77:79:82:82
2024-05-03 15:47:04.481 DEBUG (MainThread) [homeassistant.components.airthings_ble] Disconnected from F4:60:77::XX:XX:XX
2024-05-03 15:47:04.482 DEBUG (MainThread) [homeassistant.components.airthings_ble] Finished fetching airthings_ble data in 19.563 seconds (success: True)

I've seen three instances of the following with no apparent impact on connectivity:
2024-05-04 01:08:35.928 DEBUG (MainThread) [homeassistant.components.airthings_ble.config_flow] Discovered BT device: <habluetooth.models.BluetoothServiceInfoBleak object at 0x7f57f085c0>
2024-05-04 01:12:00.189 DEBUG (MainThread) [homeassistant.components.airthings_ble.config_flow] Discovered BT device: <habluetooth.models.BluetoothServiceInfoBleak object at 0x7f4f995c40>

def1149 commented 1 week ago

I recommend a user-settable number of retries as users are effectively doing the same thing with automation hacks. This will allow users with a lot of interference to try to compensate. A user-settable timeout may also be useful

The current default settings worked for me but other situations may require more adjustability

JesusSanchezLopez commented 1 week ago

@bdraco Which Bluetooth adapter are you using? - Raspberry Pi 4 Bluetooth adapter

Have you followed the steps for reducing interference (https://www.home-assistant.io/integrations/bluetooth/#simple-actions-that-should-improve-most-bluetooth-setups-and-common-root-causes-of-interference)? - Yes

Do you have your Bluetooth adapter on an extension cable or is it plugged directly into your device? - Integrated on Raspberry Pi 4

How far away is the Airthings device from the Bluetooth adapter? - Approx 20 feet


I caught 2 gaps at 7:56 AM and 1:30 PM (full log attached):

2024-05-04 07:56:03.493 DEBUG (MainThread) [homeassistant.components.airthings_ble] Finished fetching airthings_ble data in 66.608 seconds (success: False)
2024-05-04 13:30:06.429 DEBUG (MainThread) [homeassistant.components.airthings_ble] Finished fetching airthings_ble data in 43.543 seconds (success: False)

at2 airthing.txt
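Failed polls like the two quoted above can be pulled out of a debug log automatically. The following is a minimal sketch, assuming the log lines follow the format quoted in this thread. The function name `failed_fetches` and the regular expression are illustrative and are not part of Home Assistant.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2024-05-04 07:56:03.493 DEBUG (MainThread) [...] Finished fetching
# airthings_ble data in 66.608 seconds (success: False)
FETCH_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"Finished fetching (?P<name>\S+) data in (?P<secs>[\d.]+) seconds "
    r"\(success: (?P<ok>True|False)\)"
)


def failed_fetches(lines):
    """Return (timestamp, duration_in_seconds) for every failed fetch."""
    failures = []
    for line in lines:
        match = FETCH_RE.search(line.strip())
        if match and match["ok"] == "False":
            ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S.%f")
            failures.append((ts, float(match["secs"])))
    return failures
```

To scan a whole log file, pass the open file object, for example `failed_fetches(open("home-assistant.log"))`, where the file name is whatever your installation uses.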

JesusSanchezLopez commented 1 week ago

> I recommend a user-settable number of retries as users are effectively doing the same thing with automation hacks. This will allow users with a lot of interference to try to compensate. A user-settable timeout may also be useful
>
> The current default settings worked for me but other situations may require more adjustability

I agree, user configurable values would be great. Current settings work most of the time, but my hunch is one extra retry would make it work 99.99% of the time.

bdraco commented 1 week ago

Generally we don't add configuration options to change timeouts or retries as that is something the integration would handle. That type of change would be rejected in code review with a request to fix it in the integration instead.

Increasing to 3 retries is probably a reasonable path forward.

If we do that, we should only do one retry on startup and then do the extra retries only after successful setup, so we don't end up delaying startup.
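The idea of a small retry budget during startup and a larger one after setup can be sketched as follows. This is an illustration only, not the code from the linked pull requests. The attempt counts, the `Poller` class, and `fetch_with_retries` are assumptions made for the example, and "3 retries" is read here as three retries after the first attempt.

```python
import asyncio

STARTUP_ATTEMPTS = 2  # assumption: first try plus one retry
RUNTIME_ATTEMPTS = 4  # assumption: first try plus three retries


class FetchError(Exception):
    """Raised when every attempt to read the device has failed."""


async def fetch_with_retries(read_once, attempts, delay=0.0):
    """Call read_once() up to `attempts` times and return the first success."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return await read_once()
        except (asyncio.TimeoutError, ConnectionError) as err:
            last_error = err
            if attempt < attempts:
                await asyncio.sleep(delay)
    raise FetchError(f"gave up after {attempts} attempts") from last_error


class Poller:
    """Uses the short budget until one update has succeeded."""

    def __init__(self, read_once):
        self._read_once = read_once
        self._setup_done = False

    async def update(self):
        attempts = RUNTIME_ATTEMPTS if self._setup_done else STARTUP_ATTEMPTS
        data = await fetch_with_retries(self._read_once, attempts)
        self._setup_done = True  # later updates get the larger budget
        return data
```

Keeping the startup budget small bounds how long a slow or absent device can hold up setup, while the larger runtime budget absorbs occasional disconnects.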

bdraco commented 1 week ago

> Do you have your Bluetooth adapter on an extension cable or is it plugged directly into your device? - Integrated on Raspberry Pi 4
>
> How far away is the Airthings device from the Bluetooth adapter? - Approx 20 feet

While the drivers have improved over time, the internal adapters in RPi devices are not so great, so it's expected that you will get connection drops. Since they are close to the USB ports that generate interference, the board design doesn't help.

The problem will likely go away completely if you disable the internal adapter and replace it with an ESPHome Bluetooth proxy or an external adapter on the high performance list.

bdraco commented 1 week ago

https://github.com/Airthings/airthings-ble/pull/38

bdraco commented 1 week ago

https://github.com/home-assistant/core/pull/116805

bdraco commented 1 week ago

Since the scan interval is 300 seconds, we can even go to 5 attempts before declaring failure since the maximum connection time is 60s
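The arithmetic behind that figure is simple enough to write down. A small sketch using only the two numbers from the comment above; the function names are made up for the example, and it ignores data exchange time and any pause between attempts.

```python
SCAN_INTERVAL_S = 300    # polling interval stated in the thread
MAX_CONNECT_TIME_S = 60  # maximum connection time stated in the thread


def max_attempts_within_interval(interval_s, per_attempt_s):
    """Largest number of worst-case attempts that fit in one interval."""
    return interval_s // per_attempt_s


def worst_case_duration(attempts, per_attempt_s):
    """Total time if every attempt runs into the connection timeout."""
    return attempts * per_attempt_s


attempts = max_attempts_within_interval(SCAN_INTERVAL_S, MAX_CONNECT_TIME_S)
print(attempts)                                           # 5
print(worst_case_duration(attempts, MAX_CONNECT_TIME_S))  # 300
```

Five attempts therefore exactly fill the interval, but only in the worst case where every single connection attempt times out.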

JesusSanchezLopez commented 1 week ago

Thanks for increasing number of retry attempts! Ideally this issue is fixed for everyone with an RPi, but as a last resort I do have an esp32 set up as Bluetooth proxy waiting to be plugged in.

Thank you again for supporting us with sub-optimal setups!

JesusSanchezLopez commented 1 week ago

> Since the scan interval is 300 seconds, we can go to 5 attempts before declaring failure since the maximum connection time is 60s

Wouldn't 5 retries in 300 seconds be too much, since the next scan would happen immediately? Max 4 retries?

bdraco commented 1 week ago

That would only happen if it timed out connecting on every attempt, which isn't the problem here. If we start hitting that timeout, the device is likely truly unreachable. The data exchange timeout is much lower, so it shouldn't be a problem.

Also, it can't miss the refresh, as it always schedules the next one once the current one finishes: https://github.com/home-assistant/core/blob/65120e5789f4d5c6371499b62d4ccf442411e019/homeassistant/helpers/update_coordinator.py#L213

The risk of setting it too high is that it would take a long time to notice the device is unavailable.
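The point about never missing a refresh can be illustrated with a simplified polling loop. This is a sketch of the general pattern only, not the code in `update_coordinator.py`; `poll`, `refresh`, and the event list are names made up for the example.

```python
import asyncio


async def poll(refresh, interval_s, rounds, events):
    """Run refresh() `rounds` times, one after another.

    The wait for the next round starts only after the current refresh
    has returned, so a slow refresh delays the next one instead of
    overlapping with it or causing it to be skipped.
    """
    for n in range(rounds):
        events.append(f"start {n}")
        await refresh()
        events.append(f"end {n}")
        await asyncio.sleep(interval_s)
```

With this pattern, a refresh that takes a long time because of retries shifts the following refresh later; it never collides with it.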

def1149 commented 1 week ago

After 48 hours I've not had any data dropouts. For me, it looks like the retries have fixed the problem. Thanks for the fix.

BTW, I have other BT devices even farther away (25 ft vs 5 ft for Airthings) that never experience data dropouts which casts some doubt on the USB interference theory. If you can't measure it, it's just a WAG (Wild Ass Guess).

It may be that the Airthings BT chip or device interface code is the problem. An engineer with the appropriate BT expertise and equipment can monitor/measure and get to the root cause.

But if that never happens, hopefully the retry fix holds up over the long term. Time will tell and I'll be back if the problem reoccurs. In the meantime, I'll keep my failure detection automation that triggers when CO2 goes unknown for >60 seconds, sets a persistent notification so I know there was a data drop, and restarts the integration.

bdraco commented 1 week ago

> BTW, I have other BT devices even farther away (25 ft vs 5 ft for Airthings) that never experience data dropouts which casts some doubt on the USB interference theory. If you can't measure it, it's just a WAG (Wild Ass Guess).

I don't know which other Bluetooth devices you are referring to, but that may be because Airthings devices need a GATT connection to update. Many other Bluetooth sensors use advertisements which do not have the overhead of two-way connection establishment, which also means they usually have a much longer range.

Bluetooth advertisements are generally limited to 31 bytes of data. Hence, some vendors use GATT instead or some combination of advertisements and GATT connections to update data that changes infrequently, like battery %.

Most vendors that use GATT connections have a stated range of 20ft. I'm not sure what Airthings specifies, though.
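The 31-byte advertisement limit mentioned above can be made concrete with a short sketch. It assumes the legacy advertising layout, where the payload is a sequence of structures, each made of a length byte, a type byte, and the data. The company ID `0xFFFF` and the payload sizes are placeholders, not values used by any device in this thread.

```python
LEGACY_ADV_MAX = 31  # bytes available in a legacy advertisement payload

AD_FLAGS = 0x01
AD_COMPLETE_LOCAL_NAME = 0x09
AD_MANUFACTURER_DATA = 0xFF


def ad_structure(ad_type, data):
    """Encode one structure: length byte, type byte, then the data.

    The length byte counts the type byte plus the data bytes.
    """
    return bytes([len(data) + 1, ad_type]) + data


def manufacturer_data(company_id, payload):
    """Manufacturer data starts with a 2-byte little-endian company ID."""
    return ad_structure(
        AD_MANUFACTURER_DATA, company_id.to_bytes(2, "little") + payload
    )


def fits_in_advertisement(structures):
    """True if all structures together fit in one legacy advertisement."""
    return sum(len(s) for s in structures) <= LEGACY_ADV_MAX


flags = ad_structure(AD_FLAGS, b"\x06")
name = ad_structure(AD_COMPLETE_LOCAL_NAME, b"Sensor")
small = manufacturer_data(0xFFFF, bytes(4))   # hypothetical 4-byte reading
large = manufacturer_data(0xFFFF, bytes(28))  # hypothetical 28-byte reading

print(fits_in_advertisement([flags, name, small]))  # True
print(fits_in_advertisement([large]))               # False
```

A 4-byte reading fits alongside flags and a short name, while a 28-byte reading overflows the payload even on its own, which is the kind of case where a vendor would turn to a GATT connection.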

jaydeethree commented 1 week ago

> @jaydeethree: Are you seeing any issues after updating to 2024.5.1?

I haven't had any drop-outs since updating to 2024.5.1 and switching back to the built-in airthings_ble integration, so it seems like the recent changes fixed everything for me :) Thanks so much @bdraco !

dxmnkd316 commented 1 week ago

I'm not sure if this adds any info to the ongoing issues, but last night I got annoyed and just decided to start over. I backed up the database, deleted both Airthings devices, deleted both Airthings apps on my phone, re-added the devices in the Airthings BLE integration, and renamed the entities so they pointed back to the original entity names.

I have no idea if it was deleting the apps that helped or if it was deleting the integrations, but since then I haven't had any dropouts that weren't fixed by reloading the AT BLE integration. I'm still on 2024.5.1 and was only getting maybe one or two readings a day from one of my devices. I had an automation that would reload Airthings BLE after ten minutes of unavailable, then after another ten minutes would reload the Bluetooth integration and then the AT BLE.

I'm wondering if the apps were somehow taking priority over the HA connection or if it was something else entirely. But I'm going to wait to update to 5.2 to see if I start having dropouts after a day or two.

JesusSanchezLopez commented 1 week ago

@dxmnkd316 - I have both Airthings apps installed on my phone (iPhone), so I don't think that has any impact. 2024.5.1 fixed the majority of the dropouts for me (I only had 2-3 missed data points over 24 hours). I installed 2024.5.2 last night and have not seen any dropouts over the last 12+ hours :)

Thank you again @bdraco for all your help!

dxmnkd316 commented 1 week ago

So I might just have a bad connection in the original location. Which is odd, because there are other Bluetooth devices on the far side of that room that connect fine.

Anyways, I started to see dropouts again later in the evening. So I moved the device two feet from the bedroom to the hallway, towards the raspberry pi. I haven't had a drop since.

Perhaps more testing on my part is due. But unsurprisingly, it seems like bdraco and lastrada were onto something about the flaky connection and the timeouts. I might have found the edge of the range.

JesusSanchezLopez commented 1 week ago

@dxmnkd316 Upgrade to 2024.5.2 and see if your issue is resolved. The extra retries seem to have resolved my issue without moving either the RPi or the Airthings Wave+ (approx 20 feet apart). I've had no dropouts in the ~19 hours since I upgraded to 2024.5.2.

def1149 commented 1 week ago

Since 2024.5.1 I've had no dropouts. I'm on 2024.5.2 since release

dxmnkd316 commented 1 week ago

It's been three days since I've had any dropouts. I decided to update to 2024.5.2 as well.

Great work @bdraco & @LaStrada