jdemaeyer / brightsky

JSON API for DWD's open weather data.
https://brightsky.dev/
MIT License

Provide daily aggregates (min/max temperature, total sunshine, ...) #117

Open MarcelCoding opened 3 years ago

MarcelCoding commented 3 years ago

Add the ability to define the resolution for the /weather endpoint.

e.g. values HOURLY, DAILY, MONTHLY, YEARLY

jdemaeyer commented 3 years ago

Heyho @MarcelCoding, thank you for the feedback!

I have trouble imagining your use case - could you elaborate what response you would expect when supplying a resolution other than hourly? This is a weather record as the /weather endpoint currently returns it:

    {
      "timestamp": "2020-04-21T10:00:00+00:00",
      "source_id": 7003,
      "precipitation": 0,
      "pressure_msl": 1021.5,
      "sunshine": 60,
      "temperature": 14.6,
      "wind_direction": 80,
      "wind_speed": 21.6,
      "cloud_cover": 0,
      "dew_point": -0.7,
      "relative_humidity": 35,
      "visibility": 40000,
      "wind_gust_direction": 80,
      "wind_gust_speed": 49.3,
      "condition": "dry",
      "icon": "clear-day"
    }

Most of these weather parameters are momentary measurements: at exactly 10:00 UTC, the air temperature was 14.6 °C, the mean sea level pressure was 1021.5 mbar, etc. (The exceptions being the precipitation, wind, and sunshine parameters, which are either averages, totals, or extrema of the preceding hour.)

If the user specifies a daily resolution, what temperature value would you expect? The temperature at 00:00 UTC? Or maybe the temperature field should be split into min_temperature, max_temperature, avg_temperature?

MarcelCoding commented 3 years ago

I see. It would be difficult to do an efficient average, min and max? My initial idea was that the API would provide the average over the selected period (e.g. over each day, if you select daily).

The background is that if you query data for a larger time span, you get a huge data set and the API is always slow. If I select a big time span, I also don't want hourly data, because the diagram (or whatever displays it) can't show that much data anyway.

jdemaeyer commented 3 years ago

> I see. It would be difficult to do an efficient average, min and max?

It wouldn't be difficult per se, and it may even come with the latency improvement you're hoping for if we do the aggregation in SQL, but it seems tough to accommodate the different use cases into the existing response format. Maybe a second endpoint with a different response structure (containing min_temperature, avg_temperature, etc.) would be a better fit.
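To make the "aggregation in SQL" idea concrete, here is a minimal sketch using an in-memory SQLite database. The table name, columns, and sample rows are hypothetical and only loosely modelled on the record above; Bright Sky's actual schema and queries are not shown in this thread and may look quite different.

```python
import sqlite3

# Hypothetical table, loosely based on the fields of a /weather record.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE weather ("
    "timestamp TEXT, source_id INTEGER, temperature REAL, sunshine REAL)"
)
conn.executemany(
    "INSERT INTO weather VALUES (?, ?, ?, ?)",
    [
        ("2020-04-21T10:00:00+00:00", 7003, 14.6, 60),
        ("2020-04-21T11:00:00+00:00", 7003, 16.0, 60),
        ("2020-04-22T10:00:00+00:00", 7003, 12.0, 30),
    ],
)

# One row per UTC calendar day instead of one row per hour.
# The first ten characters of an ISO 8601 timestamp are the date.
DAILY_SQL = """
    SELECT substr(timestamp, 1, 10) AS day,
           MIN(temperature) AS min_temperature,
           MAX(temperature) AS max_temperature,
           AVG(temperature) AS avg_temperature,
           SUM(sunshine)    AS total_sunshine
    FROM weather
    WHERE source_id = ?
    GROUP BY substr(timestamp, 1, 10)
    ORDER BY day
"""
daily = conn.execute(DAILY_SQL, (7003,)).fetchall()

for row in daily:
    print(row)
```

Grouping by the plain calendar date is a simplification: totals such as sunshine refer to the preceding hour, so a stricter version would count the 00:00 record towards the previous day.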

Alternatively we could go the Dark Sky route and return a daily summary alongside the hourly data (again containing fields like min_temperature and avg_temperature). That won't help with the amount of data returned by the API but it would free the users from having to do their own aggregations for the "diagram" use case you describe.

That being said, a summary feature doesn't strike me as particularly much-wanted, and users should be able to implement basic aggregations like maximum and average without too much trouble, so I'd like to gauge some user interest before investing development time into it. (I.e. my own development time, I am of course very open to pull requests! :))
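For anyone who wants to do these aggregations on the client side in the meantime, a minimal sketch could look like the following. It assumes a list of hourly records shaped like the example record above; the function name `daily_aggregates` and the output field names are illustrative only and not part of the Bright Sky API.

```python
from collections import defaultdict
from datetime import datetime


def daily_aggregates(records):
    """Group hourly records by calendar date and aggregate a few fields.

    The date is taken in the timestamp's own UTC offset. Missing (None)
    values are skipped; a day without any values yields None.
    """
    by_day = defaultdict(list)
    for record in records:
        day = datetime.fromisoformat(record["timestamp"]).date()
        by_day[day].append(record)

    result = {}
    for day, recs in sorted(by_day.items()):
        temps = [r["temperature"] for r in recs if r.get("temperature") is not None]
        sunshine = [r["sunshine"] for r in recs if r.get("sunshine") is not None]
        result[day.isoformat()] = {
            "min_temperature": min(temps) if temps else None,
            "max_temperature": max(temps) if temps else None,
            "avg_temperature": sum(temps) / len(temps) if temps else None,
            "total_sunshine": sum(sunshine) if sunshine else None,
        }
    return result


hourly = [
    {"timestamp": "2020-04-21T10:00:00+00:00", "temperature": 14.6, "sunshine": 60},
    {"timestamp": "2020-04-21T11:00:00+00:00", "temperature": 10.0, "sunshine": 30},
    {"timestamp": "2020-04-22T10:00:00+00:00", "temperature": None, "sunshine": 15},
]
summary = daily_aggregates(hourly)
print(summary)
```

This does not reduce the amount of data transferred, so it only helps with the "diagram" use case, not with latency.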

MarcelCoding commented 3 years ago

That sounds great. By the efficiency problem I meant that you would have to calculate the min and max every time, e.g. in an SQL query.

Unfortunately I am not able to provide a pull request because I can't write Python code.

tawissus commented 2 years ago

Hello,

I have a similar request. I want my weather station to display the best icon for the day. The API does not yet provide an icon for a whole day, only for the individual hours, and I don't want to just use a fixed time like 12:00.

Currently I am trying to count the weather icons across all 24 hours of the day, so that the most frequent one can be displayed. But this causes problems if, e.g., two icons occur for the same number of hours.

e.g.

a) Good example

    10h clear-day   -> best icon
     8h clear-night
     6h rain

b) Example with problem

    10h clear-day   -> same
    10h clear-night -> same
     4h rain

Do you have an idea? Or is something already in preparation? Min/max for one day, for example, I can easily calculate myself.

Greetings

For example, my next days:

    Tag0 Array ( [clear-night] => 4 [clear-day] => 11 [partly-cloudy-day] => 4 [partly-cloudy-night] => 5 )
    Tag1 Array ( [partly-cloudy-night] => 9 [partly-cloudy-day] => 14 [cloudy] => 1 )
    Tag2 Array ( [partly-cloudy-night] => 9 [partly-cloudy-day] => 15 )
    Tag3 Array ( [partly-cloudy-night] => 9 [partly-cloudy-day] => 15 )
    Tag4 Array ( [partly-cloudy-night] => 9 [partly-cloudy-day] => 15 )
    Tag5 Array ( [partly-cloudy-night] => 9 [partly-cloudy-day] => 15 )
    Tag6 Array ( [partly-cloudy-night] => 9 [partly-cloudy-day] => 15 )
    Tag7 Array ( [partly-cloudy-night] => 9 [partly-cloudy-day] => 15 )

jdemaeyer commented 2 years ago

Hi @tawissus,

I fear the question for a daily icon has much more to do with personal preference than with any "proper" way to determine it.

My personal first step would be to get rid of the -night/-day duplication, either by

Next I would prioritize bad-weather icons. A day that has four hours of rain and is cloudy-but-dry otherwise definitely deserves a rain icon and not a cloudy icon, in my opinion. This is probably easiest to do with some prioritized thresholds (e.g. "if there's two or more thunderstorm icons, use thunderstorm, otherwise if there's four or more rain icons, use rain, otherwise use the most-occurring icon during daytime hours"). The way Bright Sky currently determines the hourly icons is already quite opinionated in this fashion.
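A rough sketch of such prioritized thresholds might look like this. The thresholds are the ones from the example sentence above; stripping the suffix is just one possible way to remove the -night/-day duplication, and the 06:00 to 18:00 "daytime" window, the function names, and the input shape are assumptions made for illustration. None of this is how Bright Sky itself determines icons.

```python
from collections import Counter


def strip_day_night(icon):
    """Collapse e.g. 'clear-day' and 'clear-night' into 'clear'."""
    for suffix in ("-day", "-night"):
        if icon.endswith(suffix):
            return icon[: -len(suffix)]
    return icon


def daily_icon(hourly, day_start=6, day_end=18):
    """Pick one icon for a day from a list of (hour, icon) tuples.

    Bad-weather icons win once they pass a threshold; otherwise the
    most frequent icon during the assumed daytime window is used.
    """
    if not hourly:
        raise ValueError("need at least one hourly icon")
    counts = Counter(strip_day_night(icon) for _, icon in hourly)
    if counts["thunderstorm"] >= 2:
        return "thunderstorm"
    if counts["rain"] >= 4:
        return "rain"
    daytime = Counter(
        strip_day_night(icon)
        for hour, icon in hourly
        if day_start <= hour < day_end
    )
    # Fall back to the whole day if no daytime hours are present.
    pool = daytime or counts
    return pool.most_common(1)[0][0]


rainy = [(h, "cloudy") for h in range(20)] + [(h, "rain") for h in range(20, 24)]
sunny = (
    [(h, "clear-night") for h in range(6)]
    + [(h, "clear-day") for h in range(6, 18)]
    + [(h, "partly-cloudy-night") for h in range(18, 24)]
)
print(daily_icon(rainy), daily_icon(sunny))
```

The result has no -day/-night suffix, so a caller that needs an icon name like clear-day would have to append the suffix again. Ties between equally frequent icons are resolved by whichever was seen first, which is again a matter of preference.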

That being said, adding a daily summary (which if added would definitely include the icon field) is currently not the highest item on my priority list, so there's nothing in preparation at the moment. :(