netmanchris / pyawair

python wrapper for Awair Air Quality Sensor
Apache License 2.0
16 stars 3 forks source link

Add a poller class #9

Closed danielsjf closed 6 years ago

danielsjf commented 6 years ago

I'm almost finished with a first implementation of the home assistant component. In the current implementation, the API calls will be a serious limitation though.

The problem is in the way how it is implemented in home assistant. I have to create a class for each monitored condition. Currently the API is implemented as functions. Each of the sensors will call the API once. Since you can monitor up to 6 conditions (score and the 5 sensors), this means that just an hourly poll will mean 24 * 6 = 144 calls/day.

The solution would be to move the calls to a poller. This poller would have an argument for the polling frequency. I would give the same object to all the sensor classes, but the polling logic would be in the polling class. It would also have a get_state function with arguments like: temp, voc, dust, ... . This would always give the most recent polled state.

What do you think of the idea? If you are willing to add it to your code, I will try to implement something. Would you accept pull requests?

netmanchris commented 6 years ago

@danielsjf Nice! I've started to look into Asyncio, but this is new ground for me so I don't expect this to be quick.

For the functions, would it be possible to have a single call to grab the 6 conditions and capture that as a single VAR which could then be parsed for the 6 sensors? What if I was to provide a class function that grabbed the data with a single API call and then just automatically exposed each data point as a specific attribute of the Class object? Or is that what you're suggesting

pseudocode

X = PollerClass(auth) x.score = 83 x.temp = 23 x.humidity = 49 etc....

I'm totally open to taking PR's if they make sense. This sound like it makes sense and it's just the implementation details.

Q: Seems we designing to work around the number of calls permited by the Awair API? Maybe this is a feature request for @deanlyoung ?

What's the goal you're looking at here, to be useful in HomeAssistant with automations, I don't think we want to go beyond a 15 minute ( probably 5 or 10 would be desirable ) polling period. If we want the automation system to be able to react ( which is my primary Smart Home goal ) we should design for a smaller polling interval.

Best case ( 5 min ) 24 6 12 = 1728 calls/day

10 Min 24 6 6 = 864 calls/day

15 min 24 6 4 = 576 calls/day

Seems like we need to either optimize the API to grab the data in a single call or ask Awair nicely to raise the number of calls for the free tier.

danielsjf commented 6 years ago

@netmanchris I think we are on the same page. The goal of this class is that we would only need one call for all the sensors, instead of 6. However, in the way how this is implemented, it would still work for HA. Therefore you can divide the numbers above by 6.

netmanchris commented 6 years ago

@danielsjf Makes sense. What do you think of the following frmo a design standpoint.

AwairDevice ( device_name,. auth ) device_name would be equal to the device_name of the device auth is the authentication key

self.refresh    - This will refresh all the attributes of the object. 
self.rawdata - This attribute will be captured upon the initial instantiation of a new object of this class. Thinking this would run the get_raw_data() function and capture the entire output in this attribute

self.temp - parse out the temperature from the raw_data and put it in this attribute self.humidity - parse out the humidity from the raw_data and put it in this attribute self.AQI - use the Awair score parsed from rawdata self.dust - use the dust score from rawdata self.co2 - use the co2 score from raw data

If that works I can put Class together for you to take it for a spin to see if it will work for you.

danielsjf commented 6 years ago

Yeah seems good. The links below might also be interesting as a source. MiFlora is a plant sensor that also works with HA.

Code from Python package: https://github.com/open-homeautomation/miflora/blob/master/miflora/miflora_poller.py

Code for interaction with HA: https://github.com/home-assistant/home-assistant/blob/dev/homeassistant/components/sensor/miflora.py

Would you rather develop it yourself or do you prefer that I do the work?

A few more comments/suggestions:

netmanchris commented 6 years ago

@danielsjf

Pip update to the 0.0.6 version and there's a new objects module with a AwairDev class

You should be able to create a new object using the following

x = AwairDev(auth, device_name="Awair_Dev_Name")

Then you can check the attribs to see if you've got the right data and the refresh function should refresh the objects data, including the timestamp so you can verify that you're dealing with the newest data.

Play around with it and let me know how it feels. There are no doc strings on it yet, but we'll fill that and the tests in once I get a thumbs up on the basic implementation.

I decided on using the get_current_air_data call as the single call which then moves all the data out to the different attributes on the Class object.

Feedback is welcome as are PRs. :)

netmanchris commented 6 years ago

@danielsjf LOL I actually have 6 of the Miflora sensors already and more on the way. :)

I'll take a look and see. At least we have a scaffold right now of a class to start with. If we need to throw it away, I'm totally fine with that.

deanlyoung commented 6 years ago

I’ve been working on something similar for Homebridge, and am currently trying to figure out how I will structure the calls. @danielsjf is right that returning sensor values should be combined into as few calls as possible (ideally 1).

I believe, technically, calling once for 1000 timestamps counts as 1 call, so it really depends on your use case. “Real-time” for home automation only needs "?limit=1", but backing up your data / doing a data dump of historical data would be implemented a little differently: fewer daily calls with more timestamps.

netmanchris commented 6 years ago

@deanlyoung @danielsjf For the homeautomation "What is the value now?" call. I believe this can be best accomplished using the get_current_air_data endpoint which does give us all the required data points in a single call.

I built the first attempt at the class using this call and split out the data into attributes of the class object which can be addressed directly rather than having to munge through the data. The full call response is available as well in the event that they want to view the raw response.

Does this structure make sense to you guys?

The calls that return 1000 entries are more useful for the analytics/data science use-cases I think.,

netmanchris commented 6 years ago

@danielsjf I was looking more at the Miflora driver and I'm having trouble seeing howto leverage for this usecase. The Miflora drivers are communicating directly to the device using Bluetooth whereas we are looking at the device's cloudbased REST API.

Because the Awair API is exposing the data through a single REST call, we don't need to deal with all of the low level hardware/protocol stuff that the Miflora driver is using to communicate with the Miflora flower sensors directly.

Does that make sense? I haven't done any driver development for HassIO so I'm definitely not convinced of my own opinion here. :)

danielsjf commented 6 years ago

@netmanchris I've added some additional functionality in my own fork here: https://github.com/danielsjf/pyawair

I've added some logic to only refresh values requested by get_state if the cache period has expired. The problem is that I don't know how often HA will request the value. In this way we can make sure that the API is not used too often.

This is also the reason why I referred to the miflora sensor. You are right and the low level functions are not very interesting. The lock is also not necessary I think. However the cache is quite useful.

I would still like to clean up the variables. Preferably, there is only one data field with a list instead of many variables. I might change it in a future refactoring.

Have a look and let me know if you like it. Don't hesitate to create a pull request.

danielsjf commented 6 years ago

Comitted some more changes to my pyawair repo. Also added an initial version of the component here: https://github.com/danielsjf/awair_component

Be aware that I have no control on how HA uses the update function. With the current pyawair version, this might be much more frequent than every 15 minutes. Therefore we need at least some form of caching (feel free to change or improve things compared to my repo).

netmanchris commented 6 years ago

@danielsjf Are you mostly making changes on the objects.py content? If so, make the changes you need to best make your usecase and I’ll accept any PR. The object class wasn’t really something that I have a strong need for with my specific use-case ( more of a data science application ) so I’m more than fine with you driving the class to best suit your HASSIO application.

If there’s anything specific you want me to look at, comment on, or something specific in the other folders that you need modified, just let me know. Once you get the class object done, I’ll also add in the tests for it so once I add the Jenkins piece, we can make sure we don’t accidentally break it. :)

danielsjf commented 6 years ago

@netmanchris yes indeed, I changed only the object class. At the moment I have all the functionality present in the version that you've merged now. If you want you can push it to pypi.

netmanchris commented 6 years ago

Sounds good. I’ll update the version and push to Pypi tomorrow.

Reminds me I need to create a version function too. :)

netmanchris commented 6 years ago

@danielsjf Should be available on pypi as version 0.0.7.

deanlyoung commented 6 years ago

Nice. 007...

danielsjf commented 6 years ago

@netmanchris @deanlyoung in the last version on my repository, awair syncing with HA is working successfully. I'm not integrating it by default in HA until the API is final. Any idea when this might be?

My two cents on the API, the 100 calls per day in the lowest tier is fine as long as it is 100 calls per day per device. I see no reason to punish people for buying multiple devices :-)

On top of this, I would also exclude the device queries from the limit or set a separate high limit on those. I would make it lower than the data limit to stop people from using it in every data call, but high enough so that occasionally rebooting your HA hub a dozen times won't hurt your limits.

deanlyoung commented 6 years ago

@danielsjf the 100 calls limit is currently per endpoint. I’m not sure however if that is per air-data endpoint or across all air-data endpoints. Will have to check.

That’s definitely good feedback Re: per endpoint per device. Will have to think about that, best way to approach this.

As for the endpoints themselves. I don’t foresee these changing from what you see already. The structure will remain the same. We will definitely be expanding upon them though. Any particular changes that might break things?

netmanchris commented 6 years ago

@danielsjf Which API are you refering to, the python API or the Awair API. If it's the Python API of the pyawair library, you are the primary user of the class object which means if you're happy with it, I'm happy with it. :)

If you want to call it done, I have no problem locking it down. might need to add some additional docstrings and tests etc.. but if you're happy with the class codebase itself, we can document that specific class as "finished". Thoughts?

To be clear, there's other parts of the library that I still need to add, as well as adding the Oauth portion for some of the enterprise apps so that part would still be in a bit of a state of flux. Are you ok with that?

Chris

danielsjf commented 6 years ago

@netmanchris Yes this is good for me. I might add some stuff in the future, but the current structure is good.

@deanlyoung as you can see below, the data for the first 24 hours worked successfully for a 15 minute poll time. I didn't interacted with the endpoints directly since I only used the Python code. However, I have no direct requests. So far everything works fine other than correcting it to 100 calls/day/device.

In that case I would come back on my earlier suggestion. Maybe you can limit the device endpoint to 20 calls/account and each data endpoint (current, 5 min avg, 10 min avg,...) to 100 calls per device. This is easy as the devices are in the call. Don't know for the other future additions, but I guess it will always fit in one of the two previous categories.

deanlyoung commented 6 years ago

So, technically each of the "air-data" endpoints currently has individual limits of 100 calls/day. It’s looking like we’re going to limit the lowest tier to "latest" and "15min-avg" w/ 100 calls each, at least for now. Definitely taking note of the "100 calls per day per device" request. 👌🏻 Will assess demand and effort needed to implement that.

danielsjf commented 6 years ago

Alright, I wasn't aware of that. I assumed it was a general limit. No further comments then! Thanks for making such a great product ;-)

netmanchris commented 6 years ago

@danielsjf @deanlyoung So I can close this issue? We agree that we're going to look this down for HassIO support? Ongoing changes be restricted to docstrings and test. Input and Output of the API should remain unchanged.

If you guys can please give me a thumbs up, I'll close this issue out.

danielsjf commented 6 years ago

I am considering adding another argument for current, 15 min or 5 min, but that is an addition. The existing functionality remains unchanged. You can close this.

danielsjf commented 6 years ago

@netmanchris this seems to be finished. I'll close this, but feel free to reopen if anything is still to be done here.