Closed bram-tv closed 1 year ago
When using `Hourly` to fetch data for today, it warns that it cannot load data for the year 2024.

Example
(For a random weather station)

    from datetime import datetime
    from meteostat import Hourly

    start = datetime(2023, 9, 20, 0, 0, 0)
    end = datetime(2023, 9, 20, 1, 0, 0)
    Hourly('01001', start=start, end=end).fetch()

Running the code:

    $ rm -rf ~/.meteostat/cache/hourly/
    $ python3 fetch.py
    Warning: Cannot load hourly/2024/01001.csv.gz from https://bulk.meteostat.net/v2/

It's attempting to load data for 2024, which obviously isn't available yet.
Root cause
Relevant code: https://github.com/meteostat/meteostat-python/blob/051cd235eff2fd9f2c85e3a887e2e27b32b2144d/meteostat/interface/hourly.py#L131
It's using `range(end.year - start.year + 2)`, which is what causes the issue.
Looking into why the `+ 2` was added: it was changed from `+ 1` to `+ 2` in commit ceb9277faf6aab39a584cc99d37a7ca9cb661a50 for issue #106.
The commit message and the issue don't immediately reveal why, but digging a bit deeper: the `+ 2` is needed when a leap year is involved.
Reproducing it with dates from #106 and the original code:

    >>> from datetime import datetime, timedelta
    >>> start = datetime(2018, 1, 1)
    >>> end = datetime(2021, 6, 6)
    >>> [(start + timedelta(days=365 * i)).year for i in range(end.year - start.year + 1)]
    [2018, 2019, 2020, 2020]

The result contains a duplicate 2020, and the last item is 2020 where it should be 2021.
With the changed code:

    >>> from datetime import datetime, timedelta
    >>> start = datetime(2018, 1, 1)
    >>> end = datetime(2021, 6, 6)
    >>> [(start + timedelta(days=365 * i)).year for i in range(end.year - start.year + 2)]
    [2018, 2019, 2020, 2020, 2021]

The last item is now 2021, but the year 2020 is still duplicated.
Running the changed code for today:

    >>> from datetime import datetime, timedelta
    >>> start = datetime(2023, 9, 20)
    >>> end = datetime(2023, 9, 20)
    >>> [(start + timedelta(days=365 * i)).year for i in range(end.year - start.year + 2)]
    [2023, 2024]

The last item is 2024, which isn't what was asked for.
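To make the drift visible, here is a minimal sketch that prints the full dates instead of only the years. It uses the 2018 start date from #106; the variable name `stepped` is only illustrative and does not appear in the library.

```python
from datetime import datetime, timedelta

start = datetime(2018, 1, 1)

# Step forward in fixed 365-day increments, the same arithmetic as the
# list comprehensions above (illustrative names, not library code).
stepped = [start + timedelta(days=365 * i) for i in range(5)]

for i, moment in enumerate(stepped):
    print(i, moment.date())
# 0 2018-01-01
# 1 2019-01-01
# 2 2020-01-01
# 3 2020-12-31
# 4 2021-12-31
```

Because 2020 has 366 days, the fourth step lands on 31 December 2020 rather than 1 January 2021, and every later step stays one day behind.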
TLDR
The code uses `days=365 * i`, which is incorrect for leap years.

Possible fix
Use `start.year + i` (and `+ 1` instead of `+ 2`), i.e.:

    >>> from datetime import datetime
    >>> start = datetime(2018, 1, 1)
    >>> end = datetime(2021, 6, 6)
    >>> [start.year + i for i in range(end.year - start.year + 1)]
    [2018, 2019, 2020, 2021]
    >>> start = datetime(2023, 9, 20)
    >>> end = datetime(2023, 9, 20)
    >>> [start.year + i for i in range(end.year - start.year + 1)]
    [2023]

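The same idea can be wrapped in a small helper, which makes the inclusive year range explicit. This is only a sketch: `years_covered` is a hypothetical name, not a function in meteostat, and the shipped fix may look different.

```python
from datetime import datetime
from typing import List


def years_covered(start: datetime, end: datetime) -> List[int]:
    """Return every calendar year from start.year to end.year, inclusive."""
    # Pure integer arithmetic on the year, so leap years cannot shift the result.
    return list(range(start.year, end.year + 1))


print(years_covered(datetime(2018, 1, 1), datetime(2021, 6, 6)))
print(years_covered(datetime(2023, 9, 20), datetime(2023, 9, 20)))
```

Working on the year numbers directly avoids any date arithmetic, so a start date of 29 February needs no special handling.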
An alternative fix could be something like `start.replace(year=start.year+1).year`, but that will fail if `start` is 29 February.
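A short sketch of that failure mode, using an arbitrary leap day as `start` (the names `leap_day` and `replace_failed` are illustrative only):

```python
from datetime import datetime

leap_day = datetime(2020, 2, 29)

try:
    # 2021 has no 29 February, so replace() cannot build the new date.
    leap_day.replace(year=leap_day.year + 1)
    replace_failed = False
except ValueError as error:
    replace_failed = True
    print(f"replace() failed: {error}")

# Plain integer arithmetic on the year has no such problem.
next_year = leap_day.year + 1
print(next_year)
```

`datetime.replace` raises `ValueError` here, whereas `start.year + i` is unaffected.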
Thank you! Your fix was shipped in version 1.6.6.