Pirate-Weather / pirateweather

Code and documentation for the Pirate Weather API
Apache License 2.0

Improve text summaries and descriptions #48

Open emilyboda opened 1 year ago

emilyboda commented 1 year ago

Hey, awesome project!

I loved Dark Sky because of their "summary" field. The summaries were usually pretty descriptive, something like "Rain (with a chance of 2-4 in. of snow) starting in the afternoon." Looking at the next seven days forecast, the summaries are more like something I can get from other APIs. The one for the next few days in my area just says "Clear", "Cloudy", or "Rain". Is there a plan to make this summary more descriptive?

Please ignore if the summary fields are identical and I've just not seen any descriptive ones yet!

oisact commented 1 year ago

For the minutely summary it's even more limited. Basically it appears to say just "Clear" if there is no chance of precipitation. It can be foggy, cloudy, totally overcast, etc, and the term "Clear" is being used to simply indicate it is not precipitating. So in other words it's not useful at all.

I have switched over to WeatherKit, the "formal" replacement for Dark Sky (after paying $25 to PirateWeather to evaluate it), as I get 500k calls a month with my existing developer account. However, I am seeing the exact same thing there in the conditionCode field, although at least they don't refer to that field as a "summary". It is just an enumeration of about 20 different values indicating the current condition, but at least it includes cloud cover and the like.

cloneofghosts commented 1 year ago

The summaries are next up on the roadmap, and according to the dev on Reddit they should be done in a couple of months.

> For the minutely summary it's even more limited. Basically it appears to say just "Clear" if there is no chance of precipitation. It can be foggy, cloudy, totally overcast, etc, and the term "Clear" is being used to simply indicate it is not precipitating. So in other words it's not useful at all.

It's been a while since I used the DarkSky API, so I can't recall if the minutely summary was just used for precipitation or if it displayed conditions at all. I'm also pretty sure it didn't display an icon either, but again I cannot remember. If PW is just using it for precipitation, the text should change from "Clear" to "No Precipitation" to make things clearer. @alexander0042

oisact commented 1 year ago

Minutely was extremely descriptive. It would say things like "Cloudy for the hour", "Light rain beginning in 10 minutes, lasting for 30 minutes", "Snow for the next 40 minutes". I totally relied on the Minutely and current day forecast's summary to show a text representation of the next 60 minutes, and alternating with the day's forecast. The text was always written in such a manner that it, totally by itself, was descriptive of both the time frame and condition. IE it did not require any prompt on my behalf for the user to understand that text and all I had to do was simply alternate my UI back and forth between the two text summaries (again, minutely summary and the current day's forecast summary) and it was totally clear what the next 60 minutes would be, and the rest of the day.

Currently I'm manually appending "for the hour" and "for the day" (changing text after 7 PM to "night"), but it's apparent I'm going to have to write my own minutely data to text routine that can produce a similar text summary output. The problem with that is trying to distill it down to just one or two "events" over the next hour so it isn't some ridiculously long play-by-play text. IE defining relative thresholds for precipitation levels that warrant being mentioned in the text (a bad example, "heavy rain possibly starting in 10 minutes, lasting for 20 minutes, possibly changing to light rain for 15 minutes, then intermittent drizzle for the rest of the hour"). Somehow it needs to be distilled down to something more like "possible heavy rain starting in 10 minutes, followed by intermittent rain and drizzle for the rest of the hour".
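One way to approach the distillation described here is to bucket each minute into a coarse intensity label, run-length encode the labels, and fold short blips into the neighbouring run before writing any text. The sketch below does that in Python. The thresholds, the `min_len` cutoff, the label names, and the function names are all assumptions chosen for illustration, not anything Dark Sky or Pirate Weather is known to use, and the input is assumed to be a list of per-minute intensities in mm/h.

```python
from itertools import groupby


def classify(intensity_mm_h):
    """Map an intensity to a coarse label (thresholds are illustrative guesses)."""
    if intensity_mm_h < 0.1:
        return "none"
    if intensity_mm_h < 2.5:
        return "light rain"
    if intensity_mm_h < 10.0:
        return "rain"
    return "heavy rain"


def segments(intensities, min_len=5):
    """Run-length encode the labels, folding runs shorter than min_len into the previous run."""
    runs = []
    for label, group in groupby(classify(i) for i in intensities):
        length = len(list(group))
        if runs and length < min_len:
            runs[-1][1] += length  # absorb a short blip into the preceding run
        elif runs and runs[-1][0] == label:
            runs[-1][1] += length  # re-join neighbours that share a label after a blip
        else:
            runs.append([label, length])
    return [(label, length) for label, length in runs]


def summarize(intensities, min_len=5):
    """Describe at most the first two changes in the series."""
    runs = segments(intensities, min_len)
    if not runs:
        return "No data"
    first_label, first_len = runs[0]
    if len(runs) == 1:
        if first_label == "none":
            return "No precipitation for the hour"
        return f"{first_label.capitalize()} for the hour"
    second_label = runs[1][0]
    if first_label == "none":
        text = f"{second_label.capitalize()} starting in {first_len} minutes"
        if len(runs) > 2 and runs[2][0] != "none":
            text += f", followed by {runs[2][0]}"
        return text
    if second_label == "none":
        return f"{first_label.capitalize()} stopping in {first_len} minutes"
    return f"{first_label.capitalize()} for {first_len} minutes, then {second_label}"


# Ten dry minutes, twenty minutes of heavy rain, then light rain.
print(summarize([0] * 10 + [12] * 20 + [1] * 30))
```

Capping the text at the first two changes is what keeps it from turning into a play-by-play; a real routine would probably also need precipitation type and probability, which this sketch ignores.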

Anyway I'm rambling on, but what is needed is a good open source algorithm / code that can take a minutely data set and generate a summary text. You've already linked, I believe, to DarkSky's repo that has translations of the verbiage that can be used. But what is missing is the routine that converts the numeric data into the data sets, IE:

> The data passed from the Dark Sky API to this translation module is a simple, structured format reminiscent of s-expressions, consisting only of numbers, strings, and arrays. Some examples produced by the API are below:
>
> "heavy-rain"
> ["and", "light-wind", "light-clouds"]
> ["starting-in", "very-light-rain", ["minutes", 15]]

https://github.com/darkskyapp/translations
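To make the structured format concrete, here is a minimal Python sketch of how expressions like the ones quoted could be turned into text by recursive template substitution. It is not the code from the translations repository: the `TEMPLATES` table, the `$1`/`$2` slot convention, and the `render` function are assumptions made for illustration, and the English wording is a guess.

```python
# Hypothetical English templates keyed by term; "$1" and "$2" mark argument slots.
TEMPLATES = {
    "heavy-rain": "heavy rain",
    "very-light-rain": "drizzle",
    "light-wind": "breezy",
    "light-clouds": "partly cloudy",
    "and": "$1 and $2",
    "starting-in": "$1 starting in $2",
    "minutes": "$1 min.",
}


def render(expr):
    """Render a nested expression made of numbers, strings, and lists."""
    if isinstance(expr, (int, float)):
        return str(expr)
    if isinstance(expr, str):
        return TEMPLATES[expr]
    # A list is [term, arg1, arg2, ...]: render each argument, then fill the slots.
    head, *args = expr
    text = TEMPLATES[head]
    for index, arg in enumerate(args, start=1):
        text = text.replace(f"${index}", render(arg))
    return text


print(render("heavy-rain"))
print(render(["and", "light-wind", "light-clouds"]))
print(render(["starting-in", "very-light-rain", ["minutes", 15]]))
```

The missing piece discussed in this thread is the other direction: the routine that looks at the numeric forecast and decides which expression to emit.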

cloneofghosts commented 1 year ago

Yes, I am aware that the minutely summary included precipitation since DarkSky would show it on their website when precipitation was forecasted in the next 60 minutes. Outside of that I never paid too much attention to the minutely summary but from what you said it included non-precipitation descriptions as well.

The repo that I linked to is what the dev plans to use to get the text summaries working. I'm not sure if the code to determine what summary is used will be included in that repository or not.

Maybe getting the minutely summaries working would be a good starting point? I'll tag @alexander0042 again in case the last one didn't work.

psagat commented 1 year ago

Right now the hourly is pretty much useless, as it will say clear but MerrySky will show cloudy. I'm not sure why MerrySky is accurate but the API isn't.

cloneofghosts commented 1 year ago

@psagat Approximately where are you located? Be aware that MerrySky caches their forecasts (#30), so the data that the API is returning is the most up-to-date data. There have been quite a few issues created about the accuracy of the forecasts/current conditions, and improving it is on the roadmap.

psagat commented 1 year ago

Zip code 53066. So, perfect example: it's currently raining area-wide at 11:29 CST, yet the API says cloudy for current conditions, while MerrySky says raining and the radar shows rain. Next hour shows clear... when it's clearly going to rain and storm for several hours. Not sure how to reconcile that.

cloneofghosts commented 1 year ago

@psagat When I look this morning I see that the hourly data is showing precipitation but the minutely and currently are not. In the API documentation the dev states:

> For currently and minutely forecast blocks, the HRRR "Precipitation Rate" variable is used where available, otherwise averaged GEFS data is returned. For hourly and daily forecast blocks, GEFS is always used. This is done so that the precipIntensityProbablity variable is aligned with the intensity.

PirateWeather does not integrate any radar or satellite data in its forecasts, though there is a suggestion (#10) to possibly add some. There are plans to add a different weather model, which should help with the precipitation issues you are seeing, but there is no ETA on when it will be added.

What you're seeing on MerrySky is probably the hourly data being used as current conditions which is why there is a difference between the API and the website.
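The rule quoted from the documentation, and the kind of mismatch it can produce between blocks, can be sketched in Python as follows. The function names and the `in_hrrr_domain` flag are hypothetical; the response layout (`currently`, `minutely.data`, `hourly.data`, `precipIntensity`) is assumed to follow the Dark Sky-style JSON that the API returns.

```python
def precipitation_model(block, in_hrrr_domain):
    """Return which model the quoted documentation says feeds a block's precipitation."""
    if block in ("currently", "minutely") and in_hrrr_domain:
        return "HRRR"
    return "GEFS"


def first_intensity(forecast, block):
    """Read the first precipIntensity value from a block of a parsed response."""
    data = forecast.get(block, {})
    if "data" in data:
        # minutely/hourly/daily wrap their points in a "data" list
        data = data["data"][0] if data["data"] else {}
    return data.get("precipIntensity", 0) or 0


def blocks_disagree(forecast):
    """True when some blocks report precipitation right now and others report none."""
    values = {b: first_intensity(forecast, b) for b in ("currently", "minutely", "hourly")}
    wet = {name for name, value in values.items() if value > 0}
    return bool(wet) and len(wet) < len(values)


# Hand-written sample mirroring the situation described: hourly is wet, the rest are dry.
sample = {
    "currently": {"precipIntensity": 0},
    "minutely": {"data": [{"precipIntensity": 0}]},
    "hourly": {"data": [{"precipIntensity": 0.02}]},
}
print(precipitation_model("currently", True), precipitation_model("hourly", True))
print(blocks_disagree(sample))
```

A client that shows the first hourly point as "current conditions" would report rain for this sample, while one that reads `currently` would not.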

psagat commented 1 year ago

Yeah, even right now it's doing it again: the API call shows cloudy for currently and rain in the next hour, but it's clearly raining and has been for a while now. I go over to OpenWeather and it shows currently rain with a description of light rain, which is accurate. Something isn't right with the data. I love that this was a drop-in for Dark Sky, but I haven't found the data to be very accurate at all in regards to anything but cloudy or clear.

cloneofghosts commented 1 year ago

There could be one of two different things going on:

  1. HRRR (the weather model used in the US) isn't forecasting any precipitation.
  2. There is precipitation but not enough to show in the API. There seems to be a bug where precipitation intensities under 0.44 mm/h (0.01 in/h) don't show in the minutely or currently sections in the HRRR domain but show fine elsewhere.

The fix for this is to add more sources of data but it isn't the highest priority currently.

From looking at the models using Pivotal Weather I think it's the first one but I don't have access to the back-end to confirm. I'd recommend creating a new issue if you'd like to continue the discussion since we're getting off-topic from the main concern of this issue.

github-actions[bot] commented 1 year ago

There has not been any activity on this issue in the last ninety days and it will automatically close in seven days. Comment on this issue to prevent this issue from closing automatically.

cloneofghosts commented 1 year ago

This is still second in the priority list, correct? The roadmap has it listed second, behind keeping the forecast initialization data, but I'm not sure if that has changed at all.

There are also similar issues in the HA repository https://github.com/alexander0042/pirate-weather-ha/issues/94 and https://github.com/alexander0042/pirate-weather-ha/issues/100 and I'm not sure we need three separate issues open. Maybe rename this one to encompass both of those issues?

alexander0042 commented 1 year ago

Still priority 2! I've tagged it accordingly, and I like the idea of renaming this one to clarify it

cloneofghosts commented 1 year ago

I know you mentioned possibly using AI for weather descriptions, and I found out that AerisWeather has an AI summary when you view their forecasts: https://www.aerisweather.com/weather/local/ca/on/ottawa

Not sure if it's output in the API or just runs in the browser.

alexander0042 commented 9 months ago

Oh that's a cool catch, and a pretty cool feature. I bet they're generating it ahead of time for a discrete list of points, since it wouldn't be that crazy to do.

As much as it's a cooler approach, the real perk to using a more structured approach is that I can use the Dark Sky translations, which would be super handy

cloneofghosts commented 9 months ago

> Oh that's a cool catch, and a pretty cool feature. I bet they're generating it ahead of time for a discrete list of points, since it wouldn't be that crazy to do.

Yeah, they're probably feeding it specific data points and then saving the output somewhere as you get the same summary when you refresh the page.

> As much as it's a cooler approach, the real perk to using a more structured approach is that I can use the Dark Sky translations, which would be super handy

Plus, the fact that the translation library already has existing translations is pretty nice. Though I was thinking of some ideas that would be cool to implement in that library, I'm not sure what is possible. I know some sites use phrases such as "Risk of a thunderstorm this evening", "Risk of Freezing Rain", or "Snow transitioning to Rain this afternoon". I bet some of the sites (Accu/The Weather Channel/Environment Canada) have actual meteorologists to write those kinds of forecasts.