jdejaegh / irm-kmi-ha

Home Assistant weather provider using data from Belgian IRM KMI 🇧🇪 🇱🇺 🇳🇱
MIT License

Confidence intervals get_forecasts_radar #37

Closed TimFranken closed 4 months ago

TimFranken commented 4 months ago

First of all, thanks for the excellent integration!

Describe the solution you'd like

Would it be possible to include confidence intervals in the service irm_kmi.get_forecasts_radar? From the API data, it seems they could be derived from positionLower and positionHigher (in combination with position and value). But as I don't have any info on the API I'm not really sure.

jdejaegh commented 4 months ago

Thanks for the feedback!

> But as I don't have any info on the API I'm not really sure.

I don't have more info than you do, I just spent quite some time looking at the data it returns and how it is shown in their app.

However, what you suggest makes sense to me. For each timestamp returned by the service irm_kmi.get_forecasts_radar, the API provides the following data:

{
  "time": "2024-05-30T18:00:00+02:00",
  "uri": "https://somewhere.tld/image.png",
  "value": 0.89,
  "position": 0.533,
  "positionLower": 0.299,
  "positionHigher": 0.671
}

I only use time and value (in mm/10min) to build the service response. The values for position, positionLower and positionHigher don't have a unit: their Android app uses them to draw the animation.

I could use a cross-multiplication to compute the confidence interval as follows:

ratio = value / position
upper_bound = positionHigher * ratio
lower_bound = positionLower * ratio

This would give a confidence interval of 0.499 - 1.120 (mm/10min) for the example above.
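The cross-multiplication above can be checked with a short Python sketch. The function name `confidence_interval` is hypothetical and not part of the integration; the input is the example entry from this comment, and the sketch assumes position is non-zero.

```python
def confidence_interval(entry: dict) -> tuple[float, float]:
    """Scale the unitless positions to mm/10min using value / position."""
    # Assumes entry["position"] is non-zero; a dry forecast would need
    # separate handling.
    ratio = entry["value"] / entry["position"]
    lower = entry["positionLower"] * ratio
    upper = entry["positionHigher"] * ratio
    return lower, upper


example = {
    "time": "2024-05-30T18:00:00+02:00",
    "value": 0.89,
    "position": 0.533,
    "positionLower": 0.299,
    "positionHigher": 0.671,
}

lower, upper = confidence_interval(example)
print(f"{lower:.3f} - {upper:.3f} mm/10min")  # 0.499 - 1.120 mm/10min
```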

Does that make sense to you?

A few things to keep in mind when implementing that:

TimFranken commented 4 months ago

Thanks for the quick reply. When I checked this morning it seemed that the ratio (value / position) is constant for all elements in a sequence:

{
    "time": "2024-05-30T07:00:00+02:00",
    "value": 0.18,
    "position": 0.6, // ratio = 0.3
    "positionLower": 0.6,
    "positionHigher": 0.6
},
{
    "time": "2024-05-30T07:10:00+02:00",
    "value": 0.11,
    "position": 0.367, // ratio = 0.3
    "positionLower": 0.367,
    "positionHigher": 0.367
},
{
    "time": "2024-05-30T07:20:00+02:00",
    "value": 0.03,
    "position": 0.1, // ratio = 0.3
    "positionLower": 0.1,
    "positionHigher": 0.1
},

So as long as there is at least one timestamp with rainfall (= position and value) this conversion should work perfectly.
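As a rough illustration of that idea, the ratio could be derived once per sequence from the entries that do have rainfall. This is only a sketch: `estimate_ratio` is a hypothetical helper, and taking the median is an assumption made here because the values in the response appear to be rounded, so the individual ratios differ slightly.

```python
from __future__ import annotations

from statistics import median


def estimate_ratio(sequence: list[dict]) -> float | None:
    """Return the value / position scale, or None if it cannot be derived."""
    ratios = [
        entry["value"] / entry["position"]
        for entry in sequence
        if entry["value"] > 0 and entry["position"] > 0
    ]
    # No entry with rainfall means there is nothing to derive the scale from.
    return median(ratios) if ratios else None


sequence = [
    {"value": 0.18, "position": 0.6, "positionLower": 0.6, "positionHigher": 0.6},
    {"value": 0.11, "position": 0.367, "positionLower": 0.367, "positionHigher": 0.367},
    {"value": 0.03, "position": 0.1, "positionLower": 0.1, "positionHigher": 0.1},
]

ratio = estimate_ratio(sequence)
print(round(ratio, 2))  # 0.3
```

A sequence without any rainfall returns None, which is exactly the tricky case discussed next.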

But indeed it becomes tricky when there is no rainfall in the forecasts but positionHigher is positive for one timestamp. Personally, I still find the information we can derive valuable (there is a chance of rain, although we don't know exactly how much). You could use a default value for the ratio and add a warning, but that is certainly not ideal and might lead to confusion.

jdejaegh commented 4 months ago

> When I checked this morning it seemed that the ratio (value / position) is constant for all elements in a sequence

I came to the same conclusion when writing my comment. In my data, the ratio was 1.67.

> But indeed it becomes tricky when there is no rainfall in the forecasts but positionHigher is positive for one timestamp.

Having a boolean that is true whenever positionHigher > 0 fits the use-case you describe (we just know it might rain) and does not involve adding a magic number.

Another option is to set the upper forecast bound to None (instead of zero or something else) to carry that meaning: it's not zero, but we don't know how much it is.
I prefer the explicit boolean over None, which may look like a bug.
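To make the boolean option concrete, here is a minimal sketch. It is not the released implementation: the helper `describe_entry`, the output keys and the `might_rain` name are hypothetical, and the dry entry below is made-up illustrative data. When no ratio is available, the sketch keeps the point forecast as both bounds (an assumption) and relies on the flag to signal possible rain.

```python
from __future__ import annotations


def describe_entry(entry: dict, ratio: float | None) -> dict:
    """Build one response item with bounds and a 'might rain' flag."""
    # True whenever the upper position is positive, even without a scale.
    might_rain = entry["positionHigher"] > 0
    if ratio is None:
        # Assumption for this sketch: no scale, so no interval can be
        # computed; fall back to the point forecast for both bounds.
        lower = upper = entry["value"]
    else:
        lower = entry["positionLower"] * ratio
        upper = entry["positionHigher"] * ratio
    return {
        "time": entry["time"],
        "value": entry["value"],
        "lower": lower,
        "upper": upper,
        "might_rain": might_rain,
    }


# Hypothetical dry entry: no rainfall forecast, but a positive upper position.
dry_entry = {
    "time": "2024-05-30T08:00:00+02:00",
    "value": 0.0,
    "position": 0.0,
    "positionLower": 0.0,
    "positionHigher": 0.2,
}

result = describe_entry(dry_entry, ratio=None)
print(result["upper"], result["might_rain"])  # 0.0 True
```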

I'll look into adding this feature

TimFranken commented 4 months ago

Yes, the boolean might be the cleanest option that fits the use-case. It would be more interesting to have the actual rainfall values, but I also don't see how to do that properly so that it always works with the data currently available. I would avoid None, as it might get tricky in further processing. An unrealistic value, e.g. -1, might be a better option, but that is not perfect either and might lead to more questions.

jdejaegh commented 4 months ago

This is implemented and released in version 0.2.15. Other changes that require Home Assistant 2024.6 were released in the same version, so it will only work with Home Assistant 2024.6 or newer.