rbaron / b-parasite

🌱💧 An open source DIY soil moisture sensor

Reduced signal strength with BLE nrf-connect #98

Closed jhbruhn closed 1 year ago

jhbruhn commented 1 year ago

I finally got around to testing the new nrf-connect firmware properly. Very nice work with the streamlining for different protocols¹!

Unfortunately, though, I have the feeling that the transmit power is much lower than with the "original" firmware, which leads to a lot of missed data points² in my Home Assistant instance. Is there a setting I am missing? There is an entry in prj.conf, but I couldn't find a hint about it anywhere, including in the ble.{c|h} files.


¹I am especially excited for a potential Thread/Matter variant.
²Or could it be something else, like HA invalidating the data too quickly with BTHome v2?

rbaron commented 1 year ago

I also see some small data gaps with my BTHome v2 sensors (with ESPHome proxy + Home Assistant). I haven't dug in much yet, and I don't know if the issue is on the b-parasite side, the ESPHome side or the HA side.

We didn't have BTHome v2 in the old firmware, but one parameter that notably differs is the advertising interval (not to be confused with the duration it spends advertising, which defaults to 1 s). In the old firmware we used an aggressive 30 ms, while now we use the default in the [100, 150] ms range (implicitly set by BT_LE_ADV_NCONN_IDENTITY). The first thing I would try is to match the 30 ms and see if that helps. While debugging I would also bump the duration from 1 s to 10 s or so and lower the sleeping duration.
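To put the interval difference in perspective, here is a rough sketch that counts how many advertising events fit into one advertising window. It is only an estimate: it ignores the random 0-10 ms delay BLE adds to each advertising event, and the function names are made up for illustration (they are not from the firmware). BLE expresses advertising intervals in units of 0.625 ms, so the conversion is included as well.

```python
ADV_UNIT_US = 625  # BLE advertising intervals are expressed in 0.625 ms units


def ms_to_adv_units(interval_ms: int) -> int:
    """Convert an advertising interval in ms to 0.625 ms units."""
    return interval_ms * 1000 // ADV_UNIT_US


def events_per_window(duration_s: float, interval_ms: float) -> int:
    """Approximate number of advertising events in one advertising window.

    Assumption: the random per-event delay is ignored.
    """
    return int(duration_s * 1000 // interval_ms)


for label, interval_ms in [
    ("old firmware", 30),
    ("new default (min)", 100),
    ("new default (max)", 150),
]:
    units = ms_to_adv_units(interval_ms)
    events = events_per_window(1, interval_ms)
    print(f"{label}: {interval_ms} ms = {units} units, ~{events} events in 1 s")
```

With a 1 s window, the 30 ms interval gives a receiver roughly three to five times as many chances to catch a packet as the [100, 150] ms default.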

While my sensors are pretty close to an ESP32 running a BLE proxy, sometimes I see "too many BLE events to process, dropping some", so that could also be an issue. Bumping the params above should help in this case too.

Another long shot is that HA is dropping some messages as duplicates, and our lack of an explicit counter in the BTHome packet may have something to do with it.

rbaron commented 1 year ago

I've made it a bit easier to try different advertisement intervals in #102. I'm currently running a side-by-side test with one b-parasite running the default values for BTHomeV2 and another one with the following overrides:

+CONFIG_PRST_BLE_ENCODING_BTHOME_V2=y
+CONFIG_PRST_BLE_ADV_DURATION_SEC=3
+CONFIG_PRST_BLE_MIN_ADV_INTERVAL=30
+CONFIG_PRST_BLE_MAX_ADV_INTERVAL=40

Both talking to the same ESPHome BLE proxy in an adjacent room. I will report back when I have some data.

As a side note, using 30ms/40ms advertisement intervals bumps the avg current consumption during advertisement from ~350 uA to ~800 uA:

Screen Shot 2023-01-28 at 11 40 02

I think it's still more than fine for a couple of seconds on a 5-minute cycle.
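As a back-of-the-envelope check, the advertising window's contribution to the long-term average current can be sketched as below. The assumptions are mine: a 3 s advertising window (as in the overrides above), a 5-minute cycle, and sleep current and sensor sampling left out entirely. The function name is illustrative.

```python
def avg_adv_current_ua(adv_current_ua: float, adv_duration_s: float, cycle_s: float) -> float:
    """Average current contributed by advertising alone, spread over a full cycle.

    Assumption: everything outside the advertising window is ignored.
    """
    return adv_current_ua * adv_duration_s / cycle_s


CYCLE_S = 5 * 60   # 5-minute cycle
ADV_S = 3          # assumed advertising window

for label, current_ua in [("default interval", 350), ("30/40 ms interval", 800)]:
    avg = avg_adv_current_ua(current_ua, ADV_S, CYCLE_S)
    print(f"{label}: ~{avg:.1f} uA averaged over the cycle")
```

Under these assumptions the faster interval costs a few extra microamps on average, which is consistent with it being affordable.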

rbaron commented 1 year ago

With the 30/40ms sensor, a related issue is more apparent:

image

As far as I can tell, HA is receiving a new data point as expected, at 10-minute intervals. But it marks the sensor as unavailable a few minutes after each reception. I found an issue (that coincidentally @jhbruhn created) and it seems like the timeout should be 15 minutes, but I'm seeing < 10 minutes.

So apparently there are two different issues here:

  1. The range issue you're experiencing
  2. The HA issue of marking my sensors as unavailable

For the range issue, would you give the following settings a try @jhbruhn? It should work on the current main branch:

CONFIG_PRST_BLE_ENCODING_BTHOME_V2=y
CONFIG_PRST_BLE_ADV_DURATION_SEC=3
CONFIG_PRST_BLE_MIN_ADV_INTERVAL=30
CONFIG_PRST_BLE_MAX_ADV_INTERVAL=40

rbaron commented 1 year ago

For this next range test, I placed two b-parasites barely in range of my ESPHome proxy (two rooms over in my apartment):

The one with default settings does pretty badly:

image

The one with the following settings:

CONFIG_PRST_BLE_ENCODING_BTHOME_V2=y
CONFIG_PRST_BLE_ADV_DURATION_SEC=3
CONFIG_PRST_BLE_MIN_ADV_INTERVAL=30
CONFIG_PRST_BLE_MAX_ADV_INTERVAL=40

does better:

image

Most of the gaps are likely due to HA being too aggressive in marking it as offline.

So, in short, it seems that:

jhbruhn commented 1 year ago

I just tried this with two sensors, and it did indeed fix the problem (at least looking at the past 30 minutes). I set it to an advertisement duration of 2 seconds to at least save some more battery :)

But indeed I agree that the duration after which HA declares the data as invalid is surprisingly short! IMO it should be set to at least 60 minutes. Perhaps a special case (or a yet to be added BTHome property!) could be set to indicate such a value for b-parasites?

rbaron commented 1 year ago

I ran another test after updating the ESPHome bridge and Home Assistant core (2023.2.3). I used a b-parasite with a [30, 40] ms advertising interval and a 10-minute sleep period.

Screen Shot 2023-02-09 at 08 27 27

With the update (not sure if it helped) and these settings, I see the expected behavior: we're mostly okay with 10m sleeping, but when we miss one transmission, then HA marks it as unavailable. This is in line with the 15m window mentioned in the HA issue.
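The arithmetic behind this behavior can be sketched as follows. It assumes HA marks a device unavailable once the gap since the last received advertisement reaches the 15-minute window, which is my reading of the thread rather than a statement about HA's actual implementation. The function name is illustrative.

```python
def tolerated_misses(period_s: int, window_s: int = 15 * 60) -> int:
    """Number of consecutive missed transmissions that still keep the gap
    between receptions shorter than the availability window.

    Returns -1 if even back-to-back receptions are spaced too far apart.
    Assumption: whole seconds, and the sensor is dropped once the gap
    reaches the window.
    """
    longest_ok_gap_in_periods = (window_s - 1) // period_s
    return longest_ok_gap_in_periods - 1


for minutes in (10, 5, 3):
    k = tolerated_misses(minutes * 60)
    print(f"{minutes} min period: tolerates {k} consecutive missed transmission(s)")
```

With a 10-minute period a single miss already leaves a 20-minute gap, which exceeds the window. With a 3-minute period, three misses in a row leave a 12-minute gap, which still fits.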

So if we assume HA is working as intended and live with the fact that sometimes we'll miss some transmissions (a good assumption IMO), we have two options:

  1. Tweak the HA unavailable time. For us even 1h would be okay
  2. Decrease our transmission period

Option 2 is in our hands and easy to do. I'm not too concerned about battery, even if we go as low as 3 minutes (assuming the default adv interval):

image

We can probably also lower the advertising interval to [30, 40] ms and still theoretically get multiple years of battery life:

image

While we don't really need up-to-date plant data every 3 minutes, I'm fine spending the extra power budget in the name of reliability while we can't go for option 1. And we always have the option to tune these params, but maybe the default should err on the side of "it just works".

What do you think?

jhbruhn commented 1 year ago

I think this approach is entirely valid and would be the easiest solution.

My main concern with decreasing the sleep interval is not battery life (as that is already very good!), but actually RF-pollution. With around 20 b-parasite sensors, some BLE temperature sensors, a ZigBee network, 2.4GHz epaper-pricetags and of course WiFi networks, there is a lot of traffic in the 2.4GHz band, stomping on each other's feet. Because of that, I'd actually be interested in even increasing the sleep interval to something like an hour because, as you said, who needs plant moisture information updated every 3 minutes?

Because of this, my recommendation is to change the advertisement interval to [30,40]ms to be in line with the previous firmware iteration (and its signal reliability), and in the long run propose a BTHome datapoint which encodes the sleep interval of the sensor, from which Home Assistant can then infer an invalidity interval. That is, if we even have space left for such a datapoint in the advertisement packet?

rbaron commented 1 year ago

Agreed that bumping the HA expiration should be the best solution. This can be achieved in a few ways I think:

  1. Adding a TTL field to BTHome
  2. Making HA expiration configurable in the UI (not my favorite, state is brittle)
  3. Making HA "learn" the transmission period and mark it as unavailable only if it misses > k transmissions
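To make the third idea concrete, here is a purely hypothetical sketch of such a "learning" tracker. It is not how HA works today. The class name, the use of the median gap as the learned period, the `max_missed` threshold and the 15-minute fallback are all assumptions chosen for illustration.

```python
from statistics import median


class LearnedAvailability:
    """Hypothetical tracker: learn the transmission period from arrival
    times and report unavailable only after more than `max_missed`
    consecutive misses."""

    def __init__(self, max_missed: int = 3, min_gaps: int = 3):
        self.max_missed = max_missed
        self.min_gaps = min_gaps      # gaps needed before trusting the estimate
        self.arrivals = []            # arrival timestamps in seconds

    def record(self, t: float) -> None:
        self.arrivals.append(t)

    def learned_period(self):
        gaps = [b - a for a, b in zip(self.arrivals, self.arrivals[1:])]
        if len(gaps) < self.min_gaps:
            return None
        # The median stays close to the true period even if a few
        # transmissions were missed (those show up as double-length gaps).
        return median(gaps)

    def is_available(self, now: float, fallback_window_s: float = 900.0) -> bool:
        if not self.arrivals:
            return False
        age = now - self.arrivals[-1]
        period = self.learned_period()
        if period is None:
            # Not enough history yet: fall back to a fixed window.
            return age <= fallback_window_s
        return age <= (self.max_missed + 1) * period


tracker = LearnedAvailability(max_missed=3)
for t in (0, 600, 1800, 2400, 3000):   # one transmission (at 1200) was missed
    tracker.record(t)
print(tracker.learned_period())        # learned period in seconds
print(tracker.is_available(now=5000))  # still within 4 learned periods
```

One design consequence: a sensor that really dies is only flagged after several periods, so `max_missed` trades responsiveness against false "unavailable" states.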

RF-pollution

Fully agree that less is more, but we're still orders of magnitude less polluting than most BLE devices on the market. Except for beacons, most BLE devices are connectable, so they must constantly be advertising. Our radio duty cycle for 3m sleep would be essentially 1s / 180s (right now it's 1/600).
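For a rough sense of scale, the sketch below turns those duty cycles into percentages and estimates how many sensors in a 20-sensor fleet (the number mentioned earlier in the thread) would be inside their advertising window at any given moment. It assumes wake-up times are uncorrelated and treats the whole 1 s window as "on air", which overstates the actual airtime since each advertising event is only a short burst. Function names are illustrative.

```python
def adv_duty_cycle(adv_s: float, period_s: float) -> float:
    """Fraction of time a sensor spends inside its advertising window."""
    return adv_s / period_s


def expected_concurrent(n_sensors: int, adv_s: float, period_s: float) -> float:
    """Expected number of sensors advertising at a random instant,
    assuming uncorrelated wake-up times."""
    return n_sensors * adv_duty_cycle(adv_s, period_s)


for period_s in (180, 600):
    duty = adv_duty_cycle(1, period_s)
    fleet = expected_concurrent(20, 1, period_s)
    print(f"{period_s} s period: duty cycle {duty:.3%}, "
          f"~{fleet:.2f} of 20 sensors advertising at once")
```

Even with the shorter period, on average well under one sensor out of 20 is advertising at any instant.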

rbaron commented 1 year ago

It seems like the update of HA core did help - there was a bug related to expiration of non-connectable devices home-assistant/core#85701. Fixed last month by a b-parasite user - pretty sweet!