earthobservations / wetterdienst

Open weather data for humans.
https://wetterdienst.readthedocs.io/
MIT License

MetaFileNotFound with documentation example #678

Closed guidocioni closed 1 year ago

guidocioni commented 2 years ago

Hey guys, after not using wetterdienst for a while, I'm struggling as usual to adapt code that used to work to the latest version. However, I cannot even run the example from the docs, e.g.

request = DwdObservationRequest(
    parameter=[DwdObservationDataset.CLIMATE_SUMMARY],
    resolution=DwdObservationResolution.DAILY,
    start_date="1990-01-01",
    end_date="2020-01-01",
).filter_by_station_id(station_id=[3, 1048])

MetaFileNotFound: No meta file was found amongst the files at https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/daily/kl/historical/.

Is this a temporary error or am I doing something wrong?

gutzbenj commented 2 years ago

Dear @guidocioni ,

this seems to be related to the cache... When I run the code snippet I get the expected result - a dataframe with two stations. However I wonder what could cause the file index cache to return no data instead of old data.

Cheers Benjamin

guidocioni commented 2 years ago

I seem to get the cache issues over and over again :D Ok, let me try again.

tenitz commented 1 year ago

I get the same error, what was the solution?

guidocioni commented 1 year ago

I don't know for sure as it disappeared but I'm pretty sure it was related to the cache.

If you're on a Mac, try deleting ~/Library/Caches/wetterdienst/ and restarting Python.
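
For anyone who prefers to do this from Python, here is a minimal sketch. It assumes the macOS cache location mentioned above; `clear_wetterdienst_cache` is a made-up helper name, not part of wetterdienst, and the path may differ on other platforms or versions.

```python
import shutil
from pathlib import Path

# Assumed default: the macOS cache location mentioned in this thread.
DEFAULT_CACHE_DIR = Path.home() / "Library" / "Caches" / "wetterdienst"


def clear_wetterdienst_cache(cache_dir: Path = DEFAULT_CACHE_DIR) -> bool:
    """Delete the cache directory; return True if something was removed."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        # Nothing to delete, e.g. the cache was never created
        return False
    shutil.rmtree(cache_dir)
    return True
```

Restart the Python session afterwards so that no in-memory state from the old cache is reused.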

guidocioni commented 1 year ago

Hey @gutzbenj, I started having the same issue again today :( but the cache is disabled... any clue what could be causing it?

gutzbenj commented 1 year ago

Dear @guidocioni ,

I will have a look into this in the evening. Hopefully it can be resolved quickly.

gutzbenj commented 1 year ago

Dear @guidocioni ,

I have tried your code with the following snippet and it didn't throw any error

from wetterdienst import Settings
from wetterdienst.provider.dwd.observation import DwdObservationRequest, DwdObservationDataset, DwdObservationResolution

Settings.cache_disable = True

request = DwdObservationRequest(
    parameter=[DwdObservationDataset.CLIMATE_SUMMARY],
    resolution=DwdObservationResolution.DAILY,
    start_date="1990-01-01",
    end_date="2020-01-01",
).filter_by_station_id(station_id=[3, 1048])

print(request.df)

print(request.values.all().df)

Could you try once again?

Cheers Benjamin

guidocioni commented 1 year ago

Unfortunately the problem is still there. Here is the code that I'm using

from wetterdienst.provider.dwd.observation import DwdObservationRequest
from wetterdienst import Settings
Settings.tidy = False
Settings.humanize = True
Settings.si_units = True  

observations = DwdObservationRequest(
    parameter=["climate_summary"],
    resolution="daily",
)

df = observations.filter_by_station_id(station_id=1975).values.all().df
df = df.set_index('date')

I'm wondering if it has to do with the fact that I'm under a proxy. Does wetterdienst use requests under the hood to download files and does it honor the http_proxy env. variables?

gutzbenj commented 1 year ago

Dear @guidocioni ,

we use fsspec for file acquisition, which relies on aiohttp specifically. However, fsspec swallows errors raised by aiohttp, which makes this hard to debug.

In #524 a user had a similar problem with a proxy. We added a settings object, FSSPEC_CLIENT_KWARGS, to pass arguments to the underlying library. aiohttp reads HTTP_PROXY, HTTPS_PROXY, WS_PROXY and WSS_PROXY [1]. Something like the following should be possible:

import os
from wetterdienst.util.cache import FSSPEC_CLIENT_KWARGS

# Set proxy
os.environ["HTTP_PROXY"] = "http://proxy.com"

# Allow fsspec to use environment variables such as the one defined above
FSSPEC_CLIENT_KWARGS["trust_env"] = True

# your code here
request = ...

Could you try this please and report back?

Cheers Benjamin

[1] https://docs.aiohttp.org/en/stable/client_advanced.html?highlight=proxy

gutzbenj commented 1 year ago

Note to self: the setting should be moved to our global Settings object.

guidocioni commented 1 year ago

That didn't fix it, unfortunately. I'm trying to uninstall and re-install it. I don't have this problem on another machine.

gutzbenj commented 1 year ago

Dear @guidocioni , how did it go?

guidocioni commented 1 year ago

Sorry for the late reply. I tried everything, including changing the proxy and reinstalling everything, but somehow couldn't get it to work on that machine. I just abandoned it 😃

gutzbenj commented 1 year ago

Closing for now. If it happens again, feel free to reopen!

guidocioni commented 1 year ago

Sorry to open this again, but I just cannot get wetterdienst to work on this machine and it is driving me crazy :)

MetaFileNotFound: No meta file was found amongst the files at https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/solar/.

I can run wget on https://opendata.dwd.de/climate_environment/CDC/observations_germany/climate/hourly/solar/ so it shouldn't be a proxy/certificate issue as in https://github.com/earthobservations/wetterdienst/issues/827.

Tried with or without cache (even though ~/Library/Caches/Wetterdienst never gets created), with the newest version of wetterdienst and with all the options already suggested here.

gutzbenj commented 1 year ago

I have the strong feeling we need a one-on-one session with both of us staring at your screen and waiting for your PC to ask for forgiveness!

But wget works independently of Python, so there could still be an issue with the linkage between Python and your certificates, right?

Why does ~/Library/Caches/Wetterdienst never get created?

gutzbenj commented 1 year ago

Any updates on this @guidocioni ?

guidocioni commented 1 year ago

Hey @gutzbenj. I really don't know what to try next. The problem only surfaces on my work laptop (which has some weird network settings because of corporate policy....) but it doesn't really depend on the network I'm connected to. Still, it would be good to make it work because every time I want to make a small analysis I have to wait to be home again... What do you think I could try next?

amotl commented 1 year ago

Hi there,

maybe it would work by using the corporate HTTP proxy?

With kind regards, Andreas.

guidocioni commented 1 year ago

I think we explored this already. The proxy address/authentication is defined in the HTTP_PROXY, HTTPS_PROXY, http_proxy and https_proxy variables and is thus automatically used by all the libraries that make requests.

I can successfully use requests from Python to reach the external world.

amotl commented 1 year ago

Oh all right. I probably forgot that we tested all that already.

gutzbenj commented 1 year ago

You may try

settings = Settings(fsspec_client_kwargs={"trust_env": True})

for your request to accept any proxy you're sitting behind.

guidocioni commented 1 year ago

Yeah, I just realized this is the way to pass the proxy vars to aiohttp and we already explored this in the past... Let me try again. Right now I'm fighting against conda to install something newer than 0.49....

guidocioni commented 1 year ago

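Guys, I'm going crazy... I tried to use the HTTPFileSystem component alone and it's still returning an empty list.

from fsspec.implementations.http import HTTPFileSystem
from wetterdienst import Settings
from wetterdienst.util.cache import CacheExpiry

fs = HTTPFileSystem(
    use_listings_cache=True,
    # "not True and ..." evaluates to False
    listings_expiry_time=not True and CacheExpiry.METAINDEX.value,
    listings_cache_type="filedircache",
    listings_cache_location=Settings.default().cache_dir,
    client_kwargs={"trust_env": True},
)

# url is the DWD directory being listed
fs.find(url)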

I'm still not sure whether aiohttp is then REALLY using the proxy. Any way to check?
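
One rough first check, as a sketch: with trust_env=True, aiohttp takes its proxy settings from the environment, so it helps to print the proxy variables that the Python process actually sees. `visible_proxy_vars` is a hypothetical helper, not an aiohttp or wetterdienst function. It masks credentials so the output can be pasted into an issue. Note that this only shows what is set, not whether a request really went through the proxy.

```python
import os
from urllib.parse import urlsplit

PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "WS_PROXY", "WSS_PROXY")


def visible_proxy_vars(env=None):
    """Return the proxy variables (upper- and lowercase) set in env, credentials masked."""
    env = os.environ if env is None else env
    found = {}
    for name in PROXY_VARS:
        for key in (name, name.lower()):
            value = env.get(key)
            if not value:
                continue
            parts = urlsplit(value)
            if parts.username or parts.password:
                # Hide "user:password" before the value is printed
                host = parts.netloc.rsplit("@", 1)[-1]
                value = value.replace(parts.netloc, "***@" + host, 1)
            found[key] = value
    return found


print(visible_proxy_vars())
```

An empty dict here would mean the Python process does not see any proxy configuration at all, regardless of what the shell or other tools use.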

guidocioni commented 1 year ago

Moving forward... I was able to reproduce the error by using aiohttp directly

import aiohttp

# Top-level "async with" needs an async-aware shell such as IPython or Jupyter
async with aiohttp.ClientSession(trust_env=True) as session:
    async with session.get("http://python.org") as resp:
        print(resp.status)

>> SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:1002)

So it is certificate validation after all... Indeed this

session.get("http://python.org", verify_ssl=False)

works!

Unfortunately we only pass client_kwargs to aiohttp.ClientSession(), so there's no way to pass something to the session.get call... any idea on how to disable SSL validation altogether?

guidocioni commented 1 year ago

Oh, my God. I think (fingers crossed) I have solved it.

Instead of disabling SSL verification, I used the environment variable SSL_CERT_FILE to specify the location of my certificate, and this seems to have worked. According to many articles it shouldn't have an effect in aiohttp, but at least for me that seems to have solved it.

This means I don't have to specify fsspec_client_kwargs or anything new in my code; I just need to set SSL_CERT_FILE.
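
A small sketch of that setup, for anyone landing here later. The assumptions are that the variable is set before the first HTTPS request and that Python's default SSL settings pick it up through OpenSSL. `use_ca_bundle` is a hypothetical helper, and the path in the usage note is a placeholder for whatever certificate bundle your company provides.

```python
import os
import ssl
from pathlib import Path


def use_ca_bundle(path):
    """Point SSL_CERT_FILE at a CA bundle and return the CA file Python resolves."""
    bundle = Path(path).expanduser()
    if not bundle.is_file():
        raise FileNotFoundError(f"CA bundle not found: {bundle}")
    os.environ["SSL_CERT_FILE"] = str(bundle)
    # get_default_verify_paths() reports the CA file resolved from the environment
    return ssl.get_default_verify_paths().cafile
```

Usage would be something like `use_ca_bundle("~/certs/corporate-ca.pem")` (placeholder path) at the top of the script, before any wetterdienst request. Exporting SSL_CERT_FILE in the shell profile should have the same effect.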

Korbenga commented 5 months ago

Guys, I think I have the same issue on a mac and I'm lacking some experience or knowledge to implement @guidocioni's solution. I added SSL_CERT_FILE environment variable but what value/path should I choose (to which certificate file)? I also tried to go to settings and select "Accept non-trusted certificates automatically", it didn't help.

guidocioni commented 5 months ago

If you're in a corporate network, you should have received a certificate from your company for making SSL requests. SSL_CERT_FILE should be set to the path of that certificate file.

saijithendr commented 4 months ago

The following results.query() is not returning any data

    results = DwdRadarValues(
        parameter=DwdRadarParameter.RADOLAN_CDC,
        resolution=DwdRadarResolution.HOURLY,
        period=DwdRadarPeriod.HISTORICAL,
        start_date="2018-01-01",
        end_date="2020-01-01",
    )

Is there any other way I can get the historical RADOLAN data?

Thank you

gutzbenj commented 4 months ago

Thanks for reporting, will look into this tomorrow.