earthobservations / wetterdienst

Open weather data for humans.
https://wetterdienst.readthedocs.io/
MIT License

Requesting data by "nearest stations" fails with resolution "10_minutes" #236

Closed amotl closed 3 years ago

amotl commented 3 years ago

Describe the bug

When trying to request data using the "nearest stations" search, Wetterdienst croaks when using the "10_minutes" resolution. Apparently, other resolutions work. Thanks for reporting this, @wetterfrosch!

To reproduce

wetterdienst dwd readings \
    --parameter=precipitation --resolution=10_minutes --period=recent \
    --latitude=50 --longitude=10 --distance=100

Full traceback

Traceback (most recent call last):
  File "/Users/amo/Library/Caches/pypoetry/virtualenvs/wetterdienst-EkOFQaO8-py3.8/bin/wetterdienst", line 33, in <module>
    sys.exit(load_entry_point('wetterdienst', 'console_scripts', 'wetterdienst')())
  File "/Users/amo/dev/earthobservations/wetterdienst/wetterdienst/cli.py", line 234, in run
    df = get_nearby(options)
  File "/Users/amo/dev/earthobservations/wetterdienst/wetterdienst/cli.py", line 341, in get_nearby
    nearby_stations = DWDObservationSites(
  File "/Users/amo/dev/earthobservations/wetterdienst/wetterdienst/core/sites.py", line 156, in nearby_radius
    all_nearby_stations = self.nearby_number(latitude, longitude, metadata.shape[0])
  File "/Users/amo/dev/earthobservations/wetterdienst/wetterdienst/core/sites.py", line 93, in nearby_number
    raise ValueError("'num_stations_nearby' has to be at least 1.")
ValueError: 'num_stations_nearby' has to be at least 1.

gutzbenj commented 3 years ago

Dear @amotl, as you may have seen, for the nearby_radius function, I first call nearby_number with the number of all available stations (given by the shape of the metadata dataframe). This was done to use exactly the same method to get an already ordered dataframe with distances, which then only has to be filtered by radius. In the case you are describing, it seems that the metadata dataframe is empty, so nearby_number is called with 0 and raises the ValueError. I'm wondering why this is so, as with my local version I could not replicate this bug. Can you check the output of self.all() for your case?
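To illustrate the failure mode described above, here is a minimal sketch. It is not wetterdienst's actual code: plain lists of dicts stand in for the metadata dataframe, the haversine helper and the guarded variant are assumptions made for illustration, and only the function names nearby_number and nearby_radius and the error message are taken from the traceback.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (illustrative helper, an assumption)."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearby_number(stations, latitude, longitude, num_stations_nearby):
    """Return the closest stations, ordered by distance."""
    if num_stations_nearby < 1:
        raise ValueError("'num_stations_nearby' has to be at least 1.")
    ranked = sorted(
        (
            {**s, "distance_to_location": haversine_km(latitude, longitude, s["lat"], s["lon"])}
            for s in stations
        ),
        key=lambda s: s["distance_to_location"],
    )
    return ranked[:num_stations_nearby]


def nearby_radius(stations, latitude, longitude, max_distance_km):
    """Rank *all* stations first, then filter by radius.

    With an empty station list, len(stations) is 0 and nearby_number raises.
    """
    all_nearby = nearby_number(stations, latitude, longitude, len(stations))
    return [s for s in all_nearby if s["distance_to_location"] <= max_distance_km]


def nearby_radius_guarded(stations, latitude, longitude, max_distance_km):
    """Hypothetical variant: an empty station list yields an empty result."""
    if not stations:
        return []
    return nearby_radius(stations, latitude, longitude, max_distance_km)


# Coordinates of station 191 as they appear in the JSON output in this thread.
stations = [{"station_id": 191, "lat": 49.9694, "lon": 9.9114}]
print(nearby_radius(stations, 50.0, 10.0, 15.0))
print(nearby_radius_guarded([], 50.0, 10.0, 100.0))
```

Calling nearby_radius([], 50.0, 10.0, 100.0) in this sketch raises the same ValueError as in the traceback, which is consistent with an empty metadata dataframe being the trigger.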

Maybe it is related to some modifications that were made to the original metadata dataframe? In that case, the method should rather return a copy of the dataframe, so that modifications by the caller won't change what the function returns later.

amotl commented 3 years ago

Dear Benjamin,

I haven't looked into the details yet; I just tried to file everything reported by @wetterfrosch as issues.

as for my local version I could not replicate this bug

Doesn't this also croak on your end? For me, it does, i.e. I have been able to confirm the behavior.

wetterdienst dwd readings \
    --parameter=precipitation --resolution=10_minutes --period=recent \
    --latitude=50 --longitude=10 --distance=100

Can you check the output of self.all() for your case? Maybe it is related to some modifications that were done to the original metadata dataframe?

I will do so when looking into the details, thanks. Otherwise, any help from your side is also appreciated.

With kind regards, Andreas.

gutzbenj commented 3 years ago

No croaks.

This:

wetterdienst dwd stations \
    --parameter=precipitation --resolution=10_minutes --period=recent \
    --latitude=50 --longitude=10 --distance=15

returns:

[
  {
    "station_id": 191,
    "from_date": "2004-11-04T00:00:00.000Z",
    "to_date": "2020-11-21T00:00:00.000Z",
    "station_height": 217.0,
    "lat": 49.9694,
    "lon": 9.9114,
    "station_name": "Arnstein-M\u00fcdesheim",
    "state": "Bayern",
    "distance_to_location": 10.4228978349
  }
]

And at this point, no more nearby methods are called.

amotl commented 3 years ago

Thanks for confirming that it works on your end. Hm. After clearing the cache using rm -r ~/Library/Caches/wetterdienst, it also works on my machine. Apparently, we have some flaws in this area we should investigate and address soonish.

wetterfrosch commented 3 years ago

Thanks for looking into this.

wetterdienst dwd stations --parameter=precipitation --resolution=10_minutes --period=recent --latitude=50 --longitude=10 --distance=15

Hmm, would you mind trying this for the now period, too, please? With recent I get lucky [edit: and with now, too, for the other parameters solar, wind and air_temperature], but when I try precipitation with now, I get:

$ wetterdienst dwd observations readings --parameter=precipitation --resolution=10_minutes --period=now --latitude=50 --longitude=10 --distance=1000 --target="influxdb://localhost:8086/?database=dwd&table=weather"
Traceback (most recent call last):
  File "/usr/bin/wetterdienst", line 10, in <module>
    sys.exit(run())
  File "/home/wtf/.local/lib/python3.9/site-packages/wetterdienst/cli.py", line 216, in run
    df = get_stations(options)
  File "/home/wtf/.local/lib/python3.9/site-packages/wetterdienst/cli.py", line 351, in get_stations
    stations = stations.nearby_radius(
  File "/home/wtf/.local/lib/python3.9/site-packages/wetterdienst/core/sites.py", line 156, in nearby_radius
    all_nearby_stations = self.nearby_number(latitude, longitude, metadata.shape[0])
  File "/home/wtf/.local/lib/python3.9/site-packages/wetterdienst/core/sites.py", line 93, in nearby_number
    raise ValueError("'num_stations_nearby' has to be at least 1.")
ValueError: 'num_stations_nearby' has to be at least 1.

The error changes slightly when I use the maximum-number-of-results-limit:

$ wetterdienst dwd observations readings --parameter=precipitation --resolution=10_minutes --period=now --latitude=50 --longitude=10 --distance=1000 --num=1000 --target="influxdb://localhost:8086/?database=dwd&table=weather"
2020-12-22 17:37:27,908 [wetterdienst.core.sites       ] WARNING: No weather stations were found for coordinate 50.0°N and 10.0°E 
2020-12-22 17:37:27,910 [wetterdienst.cli              ] ERROR  : No data available for given constraints
Traceback (most recent call last):
  File "/home/wtf/.local/lib/python3.9/site-packages/wetterdienst/cli.py", line 259, in run
    df = readings.collect_safe()
  File "/home/wtf/.local/lib/python3.9/site-packages/wetterdienst/dwd/observations/api.py", line 466, in collect_safe
    raise ValueError("No data available for given constraints")
ValueError: No data available for given constraints

Thanks!

$ wetterdienst --version
wetterdienst 0.11.1

ps.: Clearing the cache doesn't help.

wetterfrosch commented 3 years ago

ps.: as you might suspect from the very wide parameters, the attempt is to collect "all the dataz". So I could alternatively articulate the need for an "all"-the-stationz-option. But this won't help this nice nearest-neighbour-search :)

gutzbenj commented 3 years ago

Dear @wetterfrosch, in #293 I have just removed a DWIM date filter that we had applied in the CLI. Now everything should work as expected!

Cheers, Benjamin