Danielhiversen / pyAirthings

Python Airthings
MIT License

Add the possibility to ignore devices (rate limit issue) #4

Open LaStrada opened 1 year ago

LaStrada commented 1 year ago

There should be a way of not fetching all devices, either by adding an ignore_devices parameter or by splitting update_devices into two steps: one that fetches the device list and one that fetches the data for selected devices.

Home Assistant (or other implementations) could then fetch the device list and decide which devices should be fetched.

The reason for this is that the rate limit is set to 120 requests per hour. If the user has 11 devices, each update costs 12 requests (the device list plus one request per device), and an update every 6 minutes adds up to 120 requests per hour. The timing is never exact, so this will most likely hit the rate limit at some point. In practice, this means that 10 devices is the maximum a user can have. Not many users are complaining about this (disclaimer: I work for Airthings), but some users of this integration do have issues with the rate limit.
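The arithmetic above can be checked with a short sketch. It assumes the cost model described in this comment (one device-list request plus one request per device on every poll); the function name is illustrative and not part of pyAirthings.

```python
RATE_LIMIT_PER_HOUR = 120  # limit quoted in this issue


def combined_requests_per_hour(num_devices: int, poll_interval_minutes: float) -> float:
    """Requests per hour when every poll fetches the device list plus each device."""
    polls_per_hour = 60 / poll_interval_minutes
    requests_per_poll = 1 + num_devices  # one list request + one request per device
    return requests_per_poll * polls_per_hour


for n in (10, 11):
    used = combined_requests_per_hour(n, poll_interval_minutes=6)
    status = "below the limit" if used < RATE_LIMIT_PER_HOUR else "at or above the limit"
    print(f"{n} devices: {used:.0f} requests/hour ({status})")
```

With a 6-minute interval, 11 devices consume the whole budget and leave no headroom for restarts or timing drift, which is why 10 devices is the practical ceiling.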

Not all users need to fetch all devices. Right now it will fetch everything, even if the user disables a device in Home Assistant. This will not fix the rate limit itself, but it gives the user a way to ignore one or more devices if they hit the rate limit.
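A minimal sketch of what an ignore list could look like on the caller's side. The helper `select_devices` and the device IDs are hypothetical and only illustrate the idea; this is not the library's API.

```python
from typing import Iterable, List


def select_devices(device_ids: Iterable[str], ignore_devices: Iterable[str] = ()) -> List[str]:
    """Return the device IDs that should still be fetched (hypothetical helper)."""
    ignored = set(ignore_devices)
    return [device_id for device_id in device_ids if device_id not in ignored]


# Made-up IDs: the second device is disabled by the user, so it costs no request.
all_devices = ["living-room", "garage", "bedroom"]
to_fetch = select_devices(all_devices, ignore_devices=["garage"])
print(to_fetch)  # ['living-room', 'bedroom']
```

Each ignored device saves one request per poll, so the saving scales with the polling frequency.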

stevefal commented 7 months ago

I have hit this issue as I have 15 devices (Wave Plus). In my case 'ignore_device' would not be helpful because I don't intend to ignore devices. Otherwise why did I buy them?

I am using Home Assistant, and I have alleviated this problem for now by rolling my own HA integration and lengthening the polling interval. I also agree that some optimizations may be done to make the query flow more efficient, but ultimately I don't think that solves the root problem.

Not a code issue per se, but I think the core rate limit issue is Airthings' capping requests/hour on a client basis versus a device basis. It is not fair that a customer who paid X for one device can update information once per minute, whereas I who paid 15X for 15 devices still receive 1X worth of cloud API service, and can therefore update only once per 7.5 minutes.

This is not to mention that when integrating, it's not unusual to restart a system a few times, which has the effect of accelerating the count and causing query failures when they are most unwelcome - while troubleshooting a new integration.

I like the suggestion to split update_devices into separate methods - enumeration and sensor data. By definition, sensor data is highly dynamic and deserves the most resources for timely reporting. On the other hand, the device namespace is almost completely static, only to change when the customer buys or reconfigures devices. After final configuration, enumeration results do not change.

A practical implementation of this split in HA would be to have a separate polling interval for enumeration, say 1 hour. In this case, the one-device customer could poll sensors almost every 30 seconds at the same rate limit. Or Airthings could reduce the limit to 61/hour and reduce load by almost 50% without any loss of functionality to customers. And remember, an impatient HA user could restart HA to force re-enumeration before the next scheduled time.
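One way to picture the separate enumeration interval is the sketch below. `SplitPoller` and its injected callables are assumptions made for illustration; the real library is asynchronous and has a different interface, and this synchronous version only shows the scheduling idea.

```python
import time
from typing import Callable, Dict, List, Optional


class SplitPoller:
    """Hypothetical poller: enumerate rarely, fetch sensor data on every poll."""

    def __init__(
        self,
        list_devices: Callable[[], List[str]],
        fetch_device: Callable[[str], dict],
        enumeration_interval: float = 3600.0,  # seconds, "say 1 hour"
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._list_devices = list_devices
        self._fetch_device = fetch_device
        self._enumeration_interval = enumeration_interval
        self._clock = clock
        self._device_ids: List[str] = []
        self._last_enumeration: Optional[float] = None

    def poll(self) -> Dict[str, dict]:
        now = self._clock()
        due = (
            self._last_enumeration is None
            or now - self._last_enumeration >= self._enumeration_interval
        )
        if due:
            # One list request, at most once per enumeration interval.
            self._device_ids = list(self._list_devices())
            self._last_enumeration = now
        # One request per device on every poll.
        return {device_id: self._fetch_device(device_id) for device_id in self._device_ids}
```

Restarting the process creates a fresh poller, which re-enumerates on its first poll, matching the "restart HA to force re-enumeration" remark.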

In my case, combining the split methods with a per-device limit of, say, 70/hour, I might poll my 15 devices once per 5 minutes, generating 12 queries/hour per device and 181 queries per hour in total (15 × 12 plus one enumeration). A customer with 2 devices and the same regime would generate only 25/hour total (2 × 12 plus one enumeration), versus 36/hour using the combined update_devices.
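The split-versus-combined comparison can be written as two small formulas. This is a sketch under the assumptions stated in this comment (one enumeration per hour in the split regime, one list request per poll in the combined regime); the function names are illustrative.

```python
def split_requests_per_hour(
    num_devices: int, poll_interval_minutes: float, enumerations_per_hour: int = 1
) -> float:
    """Sensor requests for every device on each poll, plus rare enumeration."""
    polls_per_hour = 60 / poll_interval_minutes
    return num_devices * polls_per_hour + enumerations_per_hour


def combined_requests_per_hour(num_devices: int, poll_interval_minutes: float) -> float:
    """Device list plus every device on each poll."""
    polls_per_hour = 60 / poll_interval_minutes
    return (1 + num_devices) * polls_per_hour


print(split_requests_per_hour(15, 5))     # 181.0
print(combined_requests_per_hour(15, 5))  # 192.0
```

The saving from the split is one request per poll, so it matters most for customers with few devices and short polling intervals.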

I realize all these suggestions imply code work in the library and HA integration, or new versions of one or both. But I feel that there is value in advancing Airthings in home automation scenarios. In my case, the sensors are the basis of a robust air quality solution involving sensing and energy-optimized control of temperature, radon, CO2, and humidity via automation of heat, dehumidification and active and passive ventilation.

stevefal commented 7 months ago

I think splitting device enumeration from getting sensor data may have been a red herring relative to the Home Assistant scenario. On further investigation, update_devices does not fetch the device list unless _devices is empty. The HA integration I'm using (a modified version of Daniel's cloud-based integration) persists _devices, which means that not only should it not request the device list from the cloud on every update, but you also have to manually reload the integration to force a new device list. I'm not sure whether this might also happen automatically upon some other HA event.
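The behavior described here can be modelled with a toy class. This is not the pyAirthings source; it borrows the names `update_devices` and `_devices` from the comment only to illustrate "enumerate while the cache is empty", and the injected callables and `reload` method are assumptions.

```python
from typing import Callable, Dict, List


class CachedDeviceClient:
    """Toy model: the device list is fetched only while the cache is empty."""

    def __init__(
        self,
        list_devices: Callable[[], List[str]],
        fetch_device: Callable[[str], dict],
    ) -> None:
        self._list_devices = list_devices
        self._fetch_device = fetch_device
        self._devices: List[str] = []

    def update_devices(self) -> Dict[str, dict]:
        if not self._devices:
            # Enumeration request happens only on an empty cache.
            self._devices = list(self._list_devices())
        return {device_id: self._fetch_device(device_id) for device_id in self._devices}

    def reload(self) -> None:
        """Stand-in for reloading the integration: drop the cached device list."""
        self._devices = []
```

In this model a steady-state update costs one request per device, and a new device only appears after `reload()` is called.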

I think this is ultimately good news because, if correct, the flow is already more efficient than assumed in the HA case.

My issue with the per-client versus per-device rate limit still stands, though. I hope Airthings will multiply whatever the limit constant is by num_devices on the back end.

An extreme case to drive the point home: if I bought 130 Wave+ and hubs, costing over 30.000 EUR and making me a super-mega customer, today I would not be able to read them all even once within an hour.

(related to https://github.com/Danielhiversen/pyAirthings/pull/5)