petergridge / openweathermaphistory

A Home Assistant sensor that uses the OpenWeatherMap API to get forecast, current observation and history data
57 stars 5 forks

V3 api, more flexible formulas, persist storage #14

Open tsbernar opened 1 year ago

tsbernar commented 1 year ago

Add support for custom formulas and multiple factors. Make multiple sensors instead of extra state attributes. Make sensors SensorStateClass.MEASUREMENT for graphing and statistics support. Persist the rolling window of data to save API calls after restarts and allow for a larger lookback window.

tsbernar commented 1 year ago

Rebased with main repo and added a few more features:

* Implement new sensor type, total_rain.
  - Takes a start and end offset and returns total rainfall between them.
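
As a rough illustration of what a total_rain style calculation could look like, here is a minimal sketch. It assumes the history is held as a mapping from observation time to hourly rainfall and that both offsets are expressed in hours back from the current time; the function name, argument names and units are hypothetical and not taken from the PR.

```python
from datetime import datetime, timedelta


def total_rain(hourly_rain, now, start_offset_hours, end_offset_hours):
    """Sum rainfall for observations between two offsets back from `now`.

    hourly_rain: dict mapping observation datetime -> rainfall for that hour.
    Offsets are hours before `now` (assumption); start is further back than end.
    """
    window_start = now - timedelta(hours=start_offset_hours)
    window_end = now - timedelta(hours=end_offset_hours)
    return sum(
        rain
        for observed_at, rain in hourly_rain.items()
        if window_start <= observed_at < window_end
    )


# Hypothetical data: 48 hourly samples of 0.5 each.
now = datetime(2023, 4, 23, 12, 0)
history = {now - timedelta(hours=h): 0.5 for h in range(1, 49)}
rain_last_24h = total_rain(history, now, 24, 0)
```

With the sample data, `rain_last_24h` covers only the 24 most recent samples, so older observations in the rolling window do not contribute.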

petergridge commented 1 year ago

With the new repository I get

2023-04-23 02:53:41.114 ERROR (MainThread) [homeassistant.components.sensor] Error while setting up openweathermaphistory platform for sensor
Traceback (most recent call last):
  File "/workspaces/core/homeassistant/helpers/entity_platform.py", line 304, in _async_setup_platform
    await asyncio.shield(task)
  File "/workspaces/core/config/custom_components/openweathermaphistory/sensor.py", line 172, in async_setup_platform
    await _async_setup_v3_entities(add_entities, hass, config, units)
  File "/workspaces/core/config/custom_components/openweathermaphistory/sensor.py", line 233, in _async_setup_v3_entities
    await sensor_registry.async_load()
  File "/workspaces/core/config/custom_components/openweathermaphistory/sensor.py", line 403, in async_load
    await self._weather_history.async_load()
  File "/workspaces/core/config/custom_components/openweathermaphistory/weatherhistory.py", line 103, in async_load
    if data["hour_rolling_window"]:
KeyError: 'hour_rolling_window'

I guess your JSON structure has changed. Any hints on how to clear the persisted data?

tsbernar commented 1 year ago

Dang. I’ll fix that, but I’m away from my computer right now.

For now, you should be able to delete the file under .storage/openweathermaphistory.history (STORAGE_KEY in the const file)
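
One way to make the load tolerant of older persisted files, so that a missing key does not raise, is sketched below. It assumes the persisted data is a dict and that the rolling window is stored as a list under the `hour_rolling_window` key seen in the traceback; the helper name is hypothetical and the real fix in the PR may differ.

```python
def load_hour_window(data):
    """Return the persisted rolling window, or an empty list if it is absent.

    Assumption: `data` is the dict read back from the storage file and the
    window, when present, is a list of hourly samples.
    """
    if not isinstance(data, dict):
        return []
    window = data.get("hour_rolling_window")
    return window if isinstance(window, list) else []


# A file written by an older version (hypothetical key) loads as empty
# instead of raising KeyError.
old_format = {"some_other_key": []}
window = load_hour_window(old_format)
```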

petergridge commented 1 year ago

That helped, moving on to the next issue :) I love testing other people's code; it sure beats people finding bugs in mine.

2023-04-23 03:16:30.565 ERROR (MainThread) [homeassistant] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/workspaces/core/config/custom_components/openweathermaphistory/weatherhistory.py", line 156, in backfill_chunk
    await self._async_update_for_datetime(end_dt)
  File "/workspaces/core/config/custom_components/openweathermaphistory/weatherhistory.py", line 231, in _async_update_for_datetime
    return self.add_observation(data)
  File "/workspaces/core/config/custom_components/openweathermaphistory/weatherhistory.py", line 239, in add_observation
    rain = json_data["rain"]["1h"] if "rain" in json_data else 0
KeyError: '1h'

Interesting that the data returned from the API has 'rain': {'3h': 1}, not 'rain': {'1h': 1}:

{'dt': 1681714800, 'sunrise': 1681720747, 'sunset': 1681761041, 'temp': 18.48, 'feels_like': 18.08, 'pressure': 1013, 'humidity': 65, 'dew_point': 11.79, 'clouds': 34, 'wind_speed': 5.33, 'wind_deg': 338, 'wind_gust': 5.73, 'weather': [{'id': 500, 'main': 'Rain', 'description': 'light rain', 'icon': '10n'}], 'rain': {'3h': 1}}

Another question: what is the behaviour if I set up two sensors with different locations? How will the persistent storage and API counts work?

petergridge commented 1 year ago

If it helps, this is the URL/location I am running for:

url: https://api.openweathermap.org/data/3.0/onecall/timemachine?lat=-33.8715&lon=-33.8715&dt=1681714800&appid={API_KEY}&units=metric

weatherhistory.py line 68: you have both lat and lon using latitude.
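
For reference, a minimal sketch of building the timemachine URL with separate latitude and longitude values. The endpoint and query parameters are the ones visible in the URL above; the function name is hypothetical, and the coordinates and key are placeholder example values.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openweathermap.org/data/3.0/onecall/timemachine"


def build_timemachine_url(latitude, longitude, dt, api_key, units="metric"):
    """Build the history request URL, keeping lat and lon distinct."""
    query = urlencode(
        {
            "lat": latitude,
            "lon": longitude,  # must come from longitude, not latitude
            "dt": dt,
            "appid": api_key,
            "units": units,
        }
    )
    return f"{BASE_URL}?{query}"


# Placeholder coordinates and key, for illustration only.
url = build_timemachine_url(-33.8302547, 151.1516128, 1681714800, "API_KEY")
```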

petergridge commented 1 year ago

https://openweathermap.org/history tells me that 3h is the rainfall for the last 3 hrs, so we need to subtract the previous 2 hours rainfall to get this hours rainfall. Why would they do this to us?!

tsbernar commented 1 year ago

Thanks! This is all really helpful debugging info.

Very annoying that they have the '3h' rain samples; I hadn't come across that yet in my location and didn't see it in the v3 docs. It looks like they mix '3h' and '1h' and then also report the same number 3 times in a row for the '3h'

image

I'm not quite sure what to make of this; what do you think the "correct" total rain is in this period?

image

Maybe 1.0 + 1.13 + 1.0 + 1.31 ?

tsbernar commented 1 year ago

For your other comments:

sensor: 
  - platform: openweathermaphistory
    api_key: 'key'
    v3_api: True
    max_api_calls_per_hour: 30
    max_api_calls_per_day: 200
    lookback_days: 30
    resources:
      - name: rainfactor_default_location
        type: default_factor
        data:
          watertarget: 0.5
      - name: rainfactor_with_custom
        type: custom
        data:
          formula: 'max( (0.5 - day0rain - day1rain/2 - day2rain/4 - day3rain/8 - day4rain/16) / 0.5, 0)'
      - name: 48hr_rain
        type: custom
        data:
          formula: day0rain + day1rain
  - platform: openweathermaphistory
    api_key: 'key'
    v3_api: True
    max_api_calls_per_hour: 30
    max_api_calls_per_day: 200
    lookback_days: 6
    latitude:  -33.8302547
    longitude: 151.1516128
    resources:
      - name: rainfactor_aus_location
        type: default_factor
        data:
          watertarget: 0.5
      - name: rainfactor_with_custom_aus
        type: custom
        data:
          formula: 'max( (0.5 - day0rain - day1rain/2 - day2rain/4 - day3rain/8 - day4rain/16) / 0.5, 0)'
      - name: 24hr_rain_aus
        type: custom
        data:
          formula: day0rain

The persistence will now store in a file with the location included in the name.
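
To make the custom rain factor formula in the config above concrete, here is a small worked sketch in plain Python. It mirrors the formula's arithmetic (each older day's rain counts half as much as the day after it) but does not show how the integration evaluates the formula string; the function name and the sample rainfall values are hypothetical.

```python
def rain_factor(daily_rain, water_target=0.5):
    """Compute max((target - weighted rain) / target, 0).

    daily_rain[0] is day0rain, daily_rain[1] is day1rain, and so on.
    Only the first five days are used, matching the formula in the config.
    """
    weighted = sum(rain / (2 ** age) for age, rain in enumerate(daily_rain[:5]))
    return max((water_target - weighted) / water_target, 0)


# Hypothetical values: a little rain today and yesterday, none before that.
factor = rain_factor([0.1, 0.2, 0.0, 0.0, 0.0])

# Heavy rain today drives the factor to its floor of 0.
floor = rain_factor([1.0, 0.0, 0.0, 0.0, 0.0])
```

A factor near 1 means almost no recent rain counted against the target, and 0 means the target has already been met by rainfall.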

I'm open to suggestions on how to handle multiple locations better; I only have 1 location for mine. Maybe you could override the location at each sensor instead of setting up a new platform like this?

sensor: 
  - platform: openweathermaphistory
    api_key: 'key'
    v3_api: True
    max_api_calls_per_hour: 60
    max_api_calls_per_day: 400
    lookback_days: 30
    resources:
      - name: rainfactor_default_location
        type: default_factor
        data:
          watertarget: 0.5
      - name: rainfactor_with_custom
        type: custom
        data:
          formula: 'max( (0.5 - day0rain - day1rain/2 - day2rain/4 - day3rain/8 - day4rain/16) / 0.5, 0)'
      - name: 48hr_rain
        type: custom
        data:
          formula: day0rain + day1rain
      - name: rainfactor_aus_location
        type: default_factor
        latitude:  -33.8302547
        longitude: 151.1516128
        data:
          watertarget: 0.5
      - name: rainfactor_with_custom_aus
        type: custom
        latitude:  -33.8302547
        longitude: 151.1516128
        data:
          formula: 'max( (0.5 - day0rain - day1rain/2 - day2rain/4 - day3rain/8 - day4rain/16) / 0.5, 0)'
      - name: 24hr_rain_aus
        latitude:  -33.8302547
        longitude: 151.1516128
        type: custom
        data:
          formula: day0rain

Maybe we should also lower the default API rate limit settings so that 2-3 locations can be supported without having to mess with including rate limits in the config?

petergridge commented 1 year ago

For the '3h' issue I would simply divide the value by 3. That should be accurate enough, and given they provide 3 periods with the same data, it logically makes sense.

Adding the forecast as a new sensor makes sense. I was not planning to use it in my factor calculation, but it opens up a lot of opportunities for the future.

I think the first option for multiple sensors is best as it matches the way HA supports sensors.

petergridge commented 1 year ago

I can see that there is a lot happening and the calls are made regularly, but no sensor is created in HA. Are you seeing the same at your end?

tsbernar commented 1 year ago

The sensors are working on my end. Could you share the config you’re using?

petergridge commented 1 year ago

I copied your Git repository; I can try downloading again.

I'm using the docker dev container and Visual Studio Code as my environment.

tsbernar commented 1 year ago

Oh, I was talking about the config for your sensor, so I can try to replicate on my end.

petergridge commented 1 year ago

Ah, sorry, here is the YAML:

sensor:
  - platform: openweathermaphistory
    name: 'rainfactor new'
    api_key: 6e5dd5b87a55018adee10ab2c7ed6f96
    v3_api: True
    lookback_days: 5

tsbernar commented 1 year ago

Got it, so you’ll need to add individual sensors under the resources list. (Borrowed the config naming from https://www.home-assistant.io/integrations/systemmonitor/)

The way it works now is you have one “platform” per lat/lon location, and then each “platform” can have multiple sensors under its “resources” list. Maybe we should just add the default sensor if none are specified to shrink down the minimal config?

Something like this should work to just give you the default sensor on your default location:

sensor:
  - platform: openweathermaphistory
    api_key: 6e5dd5b87a55018adee10ab2c7ed6f96
    lookback_days: 5
    resources:
      - name: new_rainfactor_sensor
        type: default_factor

Here’s a full example with 2 locations and multiple sensors at each

sensor: 
  - platform: openweathermaphistory
    api_key: 'key'
    max_api_calls_per_hour: 30
    max_api_calls_per_day: 200
    lookback_days: 30
    resources:
      - name: rainfactor_default_location
        type: default_factor
        data:
          watertarget: 0.5
      - name: rainfactor_with_custom
        type: custom
        data:
          formula: 'max( (0.5 - day0rain - day1rain/2 - day2rain/4 - day3rain/8 - day4rain/16) / 0.5, 0)'
      - name: 48hr_rain
        type: custom
        data:
          formula: day0rain + day1rain
  - platform: openweathermaphistory
    api_key: 'key'
    max_api_calls_per_hour: 30
    max_api_calls_per_day: 200
    lookback_days: 6
    latitude:  -33.8302547
    longitude: 151.1516128
    resources:
      - name: rainfactor_aus_location
        type: default_factor
        data:
          watertarget: 0.5
      - name: rainfactor_with_custom_aus
        type: custom
        data:
          formula: 'max( (0.5 - day0rain - day1rain/2 - day2rain/4 - day3rain/8 - day4rain/16) / 0.5, 0)'
      - name: 24hr_rain_aus
        type: custom
        data:
          formula: day0rain

tsbernar commented 1 year ago

The reason for splitting it this way is to allow all the sensors at the same location to share the same set of data / API calls. Though we could also just achieve that on the backend if you think it's more desirable to just have one “platform” configured and specify different locations on the sensor level in the resources list.

tsbernar commented 1 year ago

Responding to a few other comments:

For the '3h' issue I would simply divide the value by 3. That should be accurate enough, and given they provide 3 periods with the same data, it logically makes sense.

Makes sense to me; I've added this as well as a warning log message if we see anything else unexpected in there. Hopefully, "1h" and "3h" is all we'll see.

My preference is defaulting to 5 days of data to support the UI and calculation model, limiting the start-up load to only 120 calls for each sensor, and then letting it build up naturally to a longer 30-day limit. Provide a service to download additional days of history so advanced users can get data faster if required. The end user can then be responsible for not overdoing the calls.

I've made a change to the default API rate limits that should roughly accomplish this, though without a separate service. The default lookback is still 30 days, which is the maximum amount of data that we will keep in the rolling window and persistent store, but we will only backfill the first 5 shortly after startup. The way the backfilling works now is:

Every 30s SCAN_INTERVAL:

1) We check if our (30-day default) lookback window is full. If it's not full, we check if we have available API limits for the current hour and the current day; if we do, we will send off a background task to backfill up to 10 hours (or less if constrained by the API limits).
2) We check if we need to do a live update for the current hour.

Step 1 always reserves enough limits so that we will be able to do the live updates once per hour.

The current limits are set to allow a backfill of 5 days in the first hour after a restart. In practice, this happens in the first 6 mins of the hour at a rate of 10 hours backfilled every 30s interval, then no backfilling for the rest of the hour until our initial requests roll off. The remaining 25 days of the full lookback window will then be slowly filled in over the next couple of days as daily and hourly limits permit. The default limits allow for up to 3 locations at a time without getting into paid API requests, assuming 0 persisted data at the start and all need a full backfill. If you already have a location configured, adding more should be okay, as the existing locations will only be using 24 requests per day once the backfills are complete.
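
The backfill decision described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: one API call is assumed to fetch one hour of history, the function and parameter names are hypothetical, and the limit values in the simulation are made up rather than the actual defaults.

```python
def hours_to_backfill(
    missing_hours,
    calls_left_this_hour,
    calls_left_today,
    chunk_hours=10,
    reserved_this_hour=1,
    reserved_today=24,
):
    """Return how many hourly samples to request in one scan interval.

    Assumption: some calls are held back (reserved_*) so the hourly live
    update can always run.
    """
    budget = min(
        calls_left_this_hour - reserved_this_hour,
        calls_left_today - reserved_today,
    )
    return max(0, min(missing_hours, chunk_hours, budget))


# Simulate backfilling 5 days (120 hours) with made-up limits.
missing, hour_budget, day_budget, intervals = 120, 121, 1000, 0
while missing > 0:
    chunk = hours_to_backfill(missing, hour_budget, day_budget)
    if chunk == 0:
        break  # out of budget until older requests roll off
    missing -= chunk
    hour_budget -= chunk
    day_budget -= chunk
    intervals += 1
```

With these made-up limits the loop needs 12 intervals of 10 hours each; at a 30s scan interval that is the 6 minutes mentioned above.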

Another option could be to have 2 lookback windows configured, a backfill window and a lookback window. The backfill window could be set to 5 days in your example, and the lookback set to 30. In this case we would only backfill the 5 days on startup (as permitted by the limits), but we will keep up to 30 days of history as time passes and we naturally add more samples from live requests.

petergridge commented 1 year ago

You have been busy, I like what you have done and I am learning something new from your coding, I still think in COBOL :)

Maybe we should just add the default sensor if none are specified to shrink down the minimal config?

That makes sense. I believe that having a default resource will make it more user friendly: less YAML = fewer mistakes, and most users just run with default settings. We also need to consider the complexity that needs to be built into the config flow. If you are looking for an example config flow (albeit overly complex), the irrigation custom component in my repository has one.

I also like this option, if a user requests more than 5 days your existing rules will kick in.

Another option could be to have 2 lookback windows configured, a backfill window and a lookback window. The backfill window could be set to 5 days in your example, and the lookback set to 30. In this case we would only backfill the 5 days on startup (as permitted by the limits), but we will keep up to 30 days of history as time passes and we naturally add more samples from live requests.

I would consider getting the numeric value from the key and using it as the denominator, just to future proof it.

I've added this as well as a warning log message if we see anything else unexpected in there. Hopefully, "1h" and "3h" is all we'll see.

What are your plans to use the 30 days of data?

tsbernar commented 1 year ago

Nice, we're both learning here! This is the first HA integration I've worked on, and it's been much easier to see how it all works starting with an integration that already works than starting from scratch. (Just bought my first house, and have been a bit too excited about all the home automation things)

That makes sense. I believe that having a default resource will make it more user friendly: less YAML = fewer mistakes, and most users just run with default settings. We also need to consider the complexity that needs to be built into the config flow. If you are looking for an example config flow (albeit overly complex), the irrigation custom component in my repository has one.

Agreed on the default. I was just starting to struggle with the config flow today, so will take a look at the irrigation component. I've been meaning to take a look at that anyway as irrigation automation is next up for me after getting this rain data. Do you have any other tips for irrigation generally? Using moisture sensors or anything else like that?

I would consider getting the numeric value from the key and using it as the denominator, just to future-proof it.

Makes sense to me. I was just worried that the division by the numeric value might not always work. I'm used to dealing with software where if something unexpected happens, you probably want to know about it right away and stop.. probably not our ideal behavior in this case, and there are other users to think about.
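
A minimal sketch of the denominator idea discussed here: take the number from the rain key and divide by it, while logging and ignoring anything unexpected instead of crashing. The function name is hypothetical, and treating an unparseable key as 0 rain is an assumption about the desired fallback.

```python
import logging
import re

_LOGGER = logging.getLogger(__name__)


def hourly_rain(observation):
    """Return rainfall per hour from an observation's 'rain' entry.

    Handles keys such as '1h' and '3h' by dividing by the number of hours
    in the key. Unexpected keys are logged and skipped (assumption).
    """
    rain = observation.get("rain")
    if not rain:
        return 0.0
    if "1h" in rain:
        return rain["1h"]
    for key, amount in rain.items():
        match = re.fullmatch(r"(\d+)h", key)
        if match and int(match.group(1)) > 0:
            return amount / int(match.group(1))
        _LOGGER.warning("Unexpected rain key %r in observation", key)
    return 0.0


three_hour = hourly_rain({"rain": {"3h": 1}})
one_hour = hourly_rain({"rain": {"1h": 1.13}})
no_rain = hourly_rain({"temp": 18.48})
```

The explicit check for a positive number avoids a division by zero if a key such as '0h' ever appeared.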

What are your plans to use the 30 days of data?

Mainly for UI, I have a vague idea of what I want a custom card to look like for displaying both irrigation time and rainfall over time, but I have yet to dig into the weeds of how hard that will be to make. I was thinking of using hourly data for recent days and a monthly view.

I have a template sensor that updates every 24 hours at midnight to capture that day's details, so the information is captured in HA History and I can present a graph. I can see this as one of the sensor types: the max temp, min temp, total rain and snow, and average humidity value for a calendar day. This is a feature a lot of users have been looking for.

I think this should be straightforward with a custom type sensor after we expose humidity and temp as inputs to the formula, but also a good idea for a new sensor type with easier config.
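
A rough sketch of what such a daily summary could compute from hourly observations. The field names ('dt', 'temp', 'humidity', 'rain') follow the API response shown earlier in the thread; the function name is hypothetical, grouping by UTC date is a simplifying assumption (a real sensor would likely use the local timezone), and only '1h' rain values are summed here.

```python
from datetime import date, datetime, timezone
from statistics import mean


def daily_summary(observations, day):
    """Summarise hourly observations whose UTC date equals `day`."""
    todays = [
        obs
        for obs in observations
        if datetime.fromtimestamp(obs["dt"], tz=timezone.utc).date() == day
    ]
    if not todays:
        return None
    return {
        "max_temp": max(obs["temp"] for obs in todays),
        "min_temp": min(obs["temp"] for obs in todays),
        "avg_humidity": mean(obs["humidity"] for obs in todays),
        "total_rain": sum(obs.get("rain", {}).get("1h", 0) for obs in todays),
    }


# Hypothetical samples: two on 2023-04-17 (UTC) and one on the next day.
samples = [
    {"dt": 1681689600, "temp": 15.0, "humidity": 60, "rain": {"1h": 1.0}},
    {"dt": 1681693200, "temp": 18.0, "humidity": 70},
    {"dt": 1681776000, "temp": 30.0, "humidity": 10},
]
summary = daily_summary(samples, date(2023, 4, 17))
```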

Agreed on the stats, the HA history is great.. I just don't have enough history yet

image

I think a custom card that uses the internal state rather than HA history would give a lot of flexibility and allow us to display backdated data.

petergridge commented 1 year ago

I was just worried that the division by the numeric value might not always work.

As long as an error is handled and the control does not crash it should be fine; if it is not a valid value, ignore it. On that note, I purposely exceeded the call limit to see how my version handles it, and I was thinking an error sensor type that provides the error details would be great; I can put it on the dashboard with a condition to show only when it is active. The other sensors would also benefit from a default value when they are in error so the irrigation system still gets a value. I could/should handle it at that end as well.

Do you have any other tips for irrigation generally? Using moisture sensors or anything else like that?

The irrigation control has had pretty good take-up since I put it on HACS; it only took me 5 years before I got around to publishing it. I always get a rush of requests as the northern hemisphere watering season kicks in, and every time I think I have all the bases covered someone has a good idea. I built it to be simple to configure and to provide a functional, non-technical UI; since then I have built a card as well, again functional rather than fancy. But this weather map history control has been downloaded over 600 times in the last couple of weeks since it was published.

I built 'rainfactor' because I got sick of fiddling with rain and moisture sensors. I built my own ESP-based irrigation controller (the box it is in was more expensive than the components) with inputs to support sensors. I had issues with the sensor being in a rain shadow when the wind was blowing, and it did not really help to determine how much rain there was. It could rain in the morning and my program runs in the afternoon... my list of grievances is endless :). Even with moisture sensors it depends on where you place them: one in the lawn, one in pot plants, the list goes on. It was just fiddly, so I went with a more global method that does not need hardware. That is what the internet is for after all, and it has worked well for me.

I found it more reliable to use weather data to reduce watering based on rainfall, if a zone does not water it will check the next day and run if the conditions are met. If you have configured to run every 3 days and it does not water because of the 'rainfactor' it will still check every day until it does need to water, it does not wait another 3 days.

With the additional information and your formulas I can also increase the watering if the temperature is high or stop/reduce it if the temperature is low.

The other usage for your model is to build a template that I can use to alter the frequency of watering or even enable a second program to run if there is an extended or forecast period of hot weather, this will only need a small tweak to the irrigation control.

The work you have done will make this a much better partner for the irrigation control.

I think a custom card that uses the internal state rather than HA history would give a lot of flexibility and allow us to display backdated data.

From what I have seen (not that it is definitive), you can't access the backend data directly from the card; you need to go through the sensor and its attributes. To go back 30 days you would need to expose a lot of attributes, and this is not the way HA is heading. I exposed them this way as I was too lazy to create many sensors for people to get the information for their own calculations, but you have exposed the formula capability and multiple sensors, so I think my attributes are no longer required.

Here is the graph I have now and the config I use; the mean option smooths out the graph to look more appealing.

image

I only have 30 days of history kept to keep the database snappy; I don't want to stress out the Pi. Having said that, I have a small SSD attached via USB3 and it is very good. I have an automation that runs weekly to clean up and compress the history.

Waiting 30 days was a bit of a pain when I started tracking, but it was kinda nice to see the graph fill out over time. For me it was aesthetic rather than functional anyway; 5 days was plenty for my purpose.

tsbernar commented 1 year ago

I still need to clean up my config flow code some more before I publish here, but it took me a while to get going so I wanted to give an update on how I think the config flow could work. Here's a screen record demo:

https://user-images.githubusercontent.com/11330651/236645872-1e88bf34-f400-409d-8e73-466b79a76983.mov

petergridge commented 1 year ago

The config flow is looking good, couple of things to consider:

I have taken your code (mostly) and put it into my latest checked in source. I have reworked:

tsbernar commented 1 year ago

Thanks. I've just pushed the code with the config flow.

- The same Lat Lon can only be used once across all instances of the integration.
- Sensor names are unique across an instance.
- Validation added for each step. We validate that the API key is valid and that we can call the API, and that the inputs are valid for each sensor type.
- I did not add the removal of the data file yet, it's quite small even with a longer lookback window, and it seemed more valuable for saving API calls from having the data persisted if you remove and re-add the integration, at least for testing this has been useful.

petergridge commented 1 year ago

Hi Trevor,

I just pushed out version 2 of the component. I think it covers most of your requirements; after all, I stole a whole lot of your work, thanks.

If you have time let me know what you think and we can add improvements from there.

Cheers Pete