Open sdementen opened 3 days ago
To clarify, the year_limited decorator tries to enforce the _start and _end timestamps on the result frame. However, other side effects can happen:
To reproduce the bug
import pandas as pd
from entsoe import EntsoePandasClient

client = EntsoePandasClient(api_key="<YOUR_API_KEY>")  # replace with your own ENTSO-E API key
client.session.verify = False  # disables TLS certificate verification for this session
df = client.query_installed_generation_capacity(
    "FR",
    start=pd.Timestamp("2017-01-01", tz="Europe/Paris"),
    end=pd.Timestamp("2023-01-01", tz="Europe/Paris"),
)
print(df.index)
outputs
DatetimeIndex(['2017-01-01 00:00:00+01:00'], dtype='datetime64[ns, Europe/Paris]', freq=None)
This happens because, for the first block, the API returns a single row dated '2017-01-01 00:00:00+01:00', and the decorator doesn't filter out this value since it belongs to the first frame.
For the following blocks, the API returns the first timestamp of each year ('2018-01-01 00:00:00+01:00', '2019-01-01 00:00:00+01:00', …), but those values are filtered out by the condition frame.index > _start.
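The effect can be illustrated without calling the API. This is a minimal sketch with made-up frames: `block_start` stands in for the decorator's `_start`, the `capacity` column is invented, and the loop only mimics the condition described above, not the actual decorator code.

```python
import pandas as pd

tz = "Europe/Paris"
kept = []
for i, year in enumerate([2017, 2018, 2019]):
    # Each simulated block returns one row, dated at the start of the block.
    block_start = pd.Timestamp(f"{year}-01-01", tz=tz)
    frame = pd.DataFrame(
        {"capacity": [100.0 + i]},
        index=pd.DatetimeIndex([block_start]),
    )
    if i > 0:
        # Strict inequality: a row dated exactly at block_start is dropped.
        frame = frame.loc[frame.index > block_start]
    kept.append(frame)

result = pd.concat(kept)
print(result.index)
```

Only the 2017 row survives, because every later block contains nothing but the row sitting exactly on its own start timestamp.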
Your solution handles that well too.
To reproduce the bug
outputs
where we are missing 2022-12-31 00:00:00+01:00
This is due to the filter at https://github.com/EnergieID/entsoe-py/blob/master/entsoe/decorators.py#L139, which may skip the first row of the second and subsequent queries. It could be replaced by the following, which explicitly filters out index values that would overlap with the previous frame:
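One way such an overlap filter could look, as a minimal sketch rather than the library's actual code: `concat_without_overlap` is a hypothetical helper name, and the frames and the `capacity` column are made-up data standing in for the yearly blocks.

```python
import pandas as pd


def concat_without_overlap(frames):
    """Concatenate frames in order, skipping rows whose index was already seen."""
    kept = []
    seen = None  # index values collected from earlier frames
    for frame in frames:
        if seen is not None:
            # Drop only the rows that overlap with earlier frames.
            frame = frame.loc[~frame.index.isin(seen)]
        kept.append(frame)
        seen = frame.index if seen is None else seen.union(frame.index)
    return pd.concat(kept) if kept else pd.DataFrame()


tz = "Europe/Paris"
# Made-up yearly blocks: consecutive blocks share their boundary timestamp.
blocks = [
    pd.DataFrame(
        {"capacity": [1.0, 2.0]},
        index=pd.DatetimeIndex(["2017-01-01", "2018-01-01"], tz=tz),
    ),
    pd.DataFrame(
        {"capacity": [2.0, 3.0]},
        index=pd.DatetimeIndex(["2018-01-01", "2019-01-01"], tz=tz),
    ),
]
result = concat_without_overlap(blocks)
print(result)
```

Filtering on actual overlap, instead of on a strict comparison with the block start, keeps a row that sits exactly on a block boundary when no earlier block has already provided it.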
entsoe.version == "0.6.16"