earthobservations / wetterdienst

Open weather data for humans.
https://wetterdienst.readthedocs.io/
MIT License

Feat: Add integrated surface database #871

Open marvingabler opened 1 year ago

marvingabler commented 1 year ago

I was wondering whether there are plans to integrate the ISD data into Wetterdienst. It is freely accessible and, like GHCN, provides data on a global scale, but at hourly resolution. The data is also hosted on S3, in both raw format and CSV. If the CSVs were converted correctly (I haven't checked yet), this could be a quick integration. For the raw format, there are some parsers (e.g. ish_parser) available.

Dataset: https://www.ncei.noaa.gov/products/land-based-station/integrated-surface-database

Extent: Global

Temporal resolution: Hourly

Publish delay: 1-2 days
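
As a rough illustration of what consuming the CSV variant could look like, here is a minimal sketch using only the standard library. The column names (STATION, DATE, TMP), the "value,quality" packing of the TMP field, the scaling factor of 10, and the +9999 missing-value sentinel are assumptions about the CSV layout and should be checked against the NOAA format documentation. The sample rows are made up.

```python
import csv
import io
from typing import List, Optional, Tuple

# Made-up sample rows mimicking the assumed layout of an ISD CSV file.
SAMPLE_CSV = '''"STATION","DATE","TMP"
"01001099999","2023-01-01T00:00:00","-0070,1"
"01001099999","2023-01-01T01:00:00","+9999,9"
'''


def decode_tmp(field: str) -> Optional[float]:
    """Decode an assumed 'value,quality' temperature field into degrees Celsius."""
    raw, _quality = field.split(",")
    if raw == "+9999":  # assumed missing-value sentinel
        return None
    return int(raw) / 10.0  # assumed scaling factor of 10


def read_temperatures(text: str) -> List[Tuple[str, str, Optional[float]]]:
    """Return (station, date, temperature) tuples from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["STATION"], row["DATE"], decode_tmp(row["TMP"])) for row in reader]


records = read_temperatures(SAMPLE_CSV)
```

If the real files follow this layout, most of the integration effort would go into decoding the packed fields and quality codes rather than into reading the files.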

Checking in after some time, it is great to see how fast this package is evolving!

gutzbenj commented 1 year ago

Dear @marvingabler ,

thanks for the hint (and also the kind words)! I'll check out the data on the weekend and see if we can get it into wetterdienst quickly.

gutzbenj commented 1 year ago

I just looked at the ish_parser and I think we'll have to add some adaptations there to get it nicely done. Things to consider:

marvingabler commented 1 year ago

Good points! We will probably need the ISD data in a few weeks, so I'm happy to pick that up if it's still open then.

amotl commented 1 year ago

Dear Marvin,

thanks for your suggestion.

For the raw format, there are some parsers (e.g. ish_parser) available.

@gutzbenj said:

Currently in the issues it was discussed to make the parser faster but ideas weren't yet applied.

The issue @gutzbenj is referring to is https://github.com/haydenth/ish_parser/issues/20 by @vtoupet. On the matter of vectorized processing, I discovered pyisd by @gadomski, which is based on pandas -- cheers! I have not evaluated it yet, but one of us should do so and report back.
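
For orientation, parsing the raw format boils down to fixed-width slicing. The following is a minimal plain-Python sketch, not how ish_parser or pyisd actually work. The character offsets (USAF at 5-10, WBAN at 11-15, date/time at 16-27, air temperature at 88-92, scaled by 10, with +9999 meaning missing) are my reading of the ISD format document and should be verified against it. The sample record is synthetic, with filler in the fields not decoded here.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class Observation(NamedTuple):
    usaf: str
    wban: str
    timestamp: datetime
    air_temperature_c: Optional[float]


def parse_record(record: str) -> Observation:
    """Slice a few fixed-width fields out of one raw ISD record (assumed offsets)."""
    usaf = record[4:10]        # positions 5-10
    wban = record[10:15]       # positions 11-15
    timestamp = datetime.strptime(record[15:27], "%Y%m%d%H%M")  # positions 16-27
    raw_temp = record[87:92]   # positions 88-92, signed, tenths of a degree
    temperature = None if raw_temp == "+9999" else int(raw_temp) / 10.0
    return Observation(usaf, wban, timestamp, temperature)


# Synthetic record: real header fields, 60 filler characters, then temperature.
SAMPLE = "0000" + "010010" + "99999" + "20230101" + "0600" + " " * 60 + "-0070" + "1"
obs = parse_record(SAMPLE)
```

Because every field sits at a fixed offset, the same slices could in principle be applied column-wise to a whole file at once, which is what makes a pandas-based approach attractive.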

Not sure if isd-s3 helps, I believe the acquisition part will eventually be implemented by Wetterdienst.

With kind regards, Andreas.

amotl commented 1 year ago

On the other hand, maybe @gadomski knows of any public sources which make ISD data available in other formats or through modern technologies like STAC or Zarr?

gadomski commented 1 year ago

Tl;dr: no, sorry

We were working on including it in the Planetary Computer but that work never made it over the finish line. I could build it into partitioned geoparquet (you can see some WIP notebook scribbles here: https://github.com/gadomski/chalkboard/blob/main/notebooks/isd-demo.ipynb), which worked ok, but keeping those partitions up-to-date with new data proved to be a bigger lift than it was worth, at the time.
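
To illustrate why keeping partitions current is fiddly, here is a small sketch that works out which partitions a batch of newly arrived observations would touch. The year/month layout and the function names are hypothetical; the notebook may partition differently.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple


def partition_key(timestamp: datetime) -> str:
    """Hive-style key for a hypothetical year/month partition layout."""
    return f"year={timestamp.year}/month={timestamp.month:02d}"


def partitions_to_rewrite(new_rows: Iterable[Tuple[str, datetime]]) -> Dict[str, int]:
    """Count new rows per partition; every key listed would need a rewrite."""
    touched: Dict[str, int] = defaultdict(int)
    for _station, timestamp in new_rows:
        touched[partition_key(timestamp)] += 1
    return dict(touched)


# Made-up batch: two current observations plus one late arrival.
batch = [
    ("010010", datetime(2023, 1, 1, 6, 0)),
    ("010010", datetime(2023, 1, 1, 7, 0)),
    ("010020", datetime(2022, 12, 31, 23, 0)),
]
touched = partitions_to_rewrite(batch)
```

Late-arriving or corrected observations land in older partitions, so an update job cannot simply append to the newest one.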

marvingabler commented 1 year ago

That's a good point! Are you aware of any sources that provide open weather observations in such an easy-to-consume format (like meteostat, but with STAC)? We recently had a chat at Jua about how awesome it would be if all open weather data were available in STAC. There are plenty of data sources that are not yet easily accessible via EE/Planetary Computer.

gadomski commented 1 year ago

https://stacindex.org is a decent reference for available STAC catalogs and tools, so you could look there. The STAC Gitter can also be a good place to ask. I personally don't know of much, but I'm not especially tuned in to the open weather community -- I'm more from the software+STAC world.

This is what's currently in the weather+climate tag on the Planetary Computer: https://planetarycomputer.microsoft.com/catalog#Climate/Weather

amotl commented 1 year ago

@marvingabler: You mean like Open-Meteo's Historical Weather API (blog article), but actually based on API/format standards, and not limited to ERA5?

-- https://github.com/open-meteo

gutzbenj commented 9 months ago

Dear @gadomski ,

I'm currently considering integrating NOAA ISD into wetterdienst using your excellent work on pyisd. To that end, would you be open to taking PRs that polish the library a bit, using tools like poetry etc., and that reconsider/restructure parts of the library?

Sincerely, Benjamin

gadomski commented 9 months ago

To that end, would you be open to taking PRs that polish the library a bit, using tools like poetry etc., and that reconsider/restructure parts of the library?

PRs are always welcome. With respect to switching to poetry specifically, I'd be interested to see a justification -- I generally don't use poetry as I don't find that I need it. But any bug fixes or feature improvements would be quite welcome.

gutzbenj commented 6 months ago

Dear @marvingabler ,

ISD finally made it into wetterdienst in the form of NOAA GHCN-h / GHCN hourly [1], which was released in a first version earlier this year. It assembles the exact same data but makes it far more conveniently accessible. Please give it a try! I only went through it once to get a fast release out, but everything should be working for now :)

[1] https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=gov.noaa.ncdc:C01688

gutzbenj commented 6 months ago

Just figured out that it currently only covers US stations -.- and that there's an issue with the date parsing.