meteostat / meteostat-python

Access and analyze historical weather and climate data with Python.
https://dev.meteostat.net/python/
MIT License

Non-model data is not up-to-date #117

Open jragbeer opened 1 year ago

jragbeer commented 1 year ago

Hello! This project is amazing and very convenient to use! Thanks for the great work!

I'm trying to integrate this into our project, but the timing seems to be off.

As an example, the data from Environment Canada doesn't seem to be as recent as it could be. Looking specifically at Toronto City Centre (https://meteostat.net/en/station/2XUGG?t=2022-12-11/2022-12-13), I see that non-model data is present only up to the start of 12/12/2022, while data from the source is available up to the beginning of 12/13/2022.

When do sources get updated? Would it be possible for meteostat to be updated more frequently? I use the Python client and would much prefer using this library to going back to Environment Canada's API.
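
To put a number on the gap described above, one option is to compare the newest observation timestamp with the newest record the source is known to have. The sketch below uses only the standard library and the dates from this comment. `observation_lag` is a hypothetical helper, not part of meteostat. With the Python client, the timestamps would presumably come from the index of an hourly query with model data disabled (assuming the `model=False` option of `Hourly` in the 1.x client).

```python
from datetime import datetime, timedelta


def observation_lag(observed: list[datetime], source_latest: datetime) -> timedelta:
    """Return how far the newest observation trails the source's newest record."""
    if not observed:
        raise ValueError("no observations to compare")
    return source_latest - max(observed)


# Dates taken from the example above: observations end at the start of
# 2022-12-12, while the source has data up to the start of 2022-12-13.
observed = [datetime(2022, 12, 11, h) for h in range(24)] + [datetime(2022, 12, 12, 0)]
lag = observation_lag(observed, source_latest=datetime(2022, 12, 13, 0))
print(lag)  # 1 day, 0:00:00
```

A lag of roughly one day would be consistent with a once-daily import, though that is only a guess about how the import is scheduled.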

clampr commented 1 year ago

Hi @jragbeer,

Yes - it's possible to update data more frequently. I think we need to check when exactly our data sources are updating their datasets and then import all data at once.

I'll add it to my to-do list, but I can't give you an ETA.

clampr commented 1 year ago

Hello everyone,

Meteostat just received an email from Environment Canada:

> Please note that one of your clients, with the IP address XX.XXX.XXX.XXX, has been hitting our site frequently and heavily. The traffic amounts to 327,568 requests during the first week of December.
>
> Please let your client to know that the Historical Climate Data Website is not to be used as a real-time data source, hence requests to these services ought only be done daily (at max). For more frequent requests, we can propose to client alternate sites, depending of your needs.

This is a setback for our data coverage in Canada and, as far as I know, there is no public bulk data interface provided by Environment Canada.

Can anyone help find a workaround?
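
One way to stay within the "daily (at max)" limit from the email would be to throttle requests per station on the client side. The following is a minimal sketch, not how Meteostat's importer actually works: `DailyThrottle` and the station key are hypothetical names, and an injectable clock is used so the behaviour can be checked without waiting a day.

```python
import time


class DailyThrottle:
    """Allow each key (e.g. a station URL) to be fetched at most once per interval."""

    def __init__(self, min_interval_s: float = 24 * 60 * 60, clock=time.time):
        self.min_interval_s = min_interval_s
        self.clock = clock
        self._last_fetch: dict[str, float] = {}

    def allow(self, key: str) -> bool:
        """Return True and record the fetch if the key is due, else False."""
        now = self.clock()
        last = self._last_fetch.get(key)
        if last is not None and now - last < self.min_interval_s:
            return False
        self._last_fetch[key] = now
        return True


# Hypothetical station key; a fake clock keeps the demo deterministic.
fake_now = [0.0]
throttle = DailyThrottle(clock=lambda: fake_now[0])

first = throttle.allow("station-A")     # first request of the day goes through
repeat = throttle.allow("station-A")    # repeat within 24 h is skipped
fake_now[0] += 24 * 60 * 60             # advance the clock by one day
next_day = throttle.allow("station-A")  # due again
print(first, repeat, next_day)  # True False True
```

A caller would check `allow(...)` before each request and skip the download when it returns False. This limits how often each station is requested, not the total number of stations requested per day.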

jragbeer commented 1 year ago

~47k requests per day is a lot, especially given the issue above where data isn't up to date. Is it possible to spread the data gathering across other instances/IPs (e.g., cheap VMs from Linode/DO that request from Environment Canada)?
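
For reference, the ~47k figure follows from the number quoted in the email, assuming "the first week of December" means seven full days:

```python
requests_first_week = 327_568  # figure quoted in the email
days = 7                       # assumption: "first week" = 7 full days

per_day = requests_first_week / days
per_minute = per_day / (24 * 60)

print(round(per_day))        # 46795
print(round(per_minute, 1))  # 32.5
```

That works out to roughly one request every two seconds, around the clock.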

clampr commented 1 year ago

I was hoping they might have an interface which allows data to be queried in bulk. Maybe this is better than what we're using atm?

https://catalogue.ec.gc.ca/geonetwork/srv/eng/catalog.search#/metadata/e50a9544-eee2-460c-a8b1-1a92a487d060

Edit: https://dd.weather.gc.ca/