Closed jarek closed 6 years ago
Thanks a lot jarek for your question. We haven't finished the specifications, but maybe your input could help, and your questions definitely do. I'm currently changing the ENTSOE.py code, so it shouldn't be used as a reference.
In general, since there are many parsers and only one function to launch them all (let's call it launch_parser), as much of the logic as possible should live in launch_parser so that the parsers can be as simple as possible.
Ideally, the parser should return as many values as possible (while keeping it simple: only one query) starting from the target_datetime. If the parser can get data for a whole day (or three days, 5 hours, ...) with a single query, it should return the data for that whole period.
If the parser only fetches a single datapoint, it should return a single datapoint.
We will keep a hard-coded record of the timespan each parser can fetch. The idea is that if we're missing a week of data, we'll launch the parser once for every day if it can return data for a whole day, or once for every hour if it can only handle a single datapoint.
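As a rough sketch of that backfill idea: nothing below exists in the code yet. PARSER_TIMESPANS, the parser names in it and backfill_targets are names made up for illustration, and the one-hour step for single-datapoint parsers is an assumption.

```python
import datetime

# Hypothetical hard-coded record: how much data one call of each parser covers.
PARSER_TIMESPANS = {
    "daily_parser": datetime.timedelta(days=1),
    "single_point_parser": datetime.timedelta(hours=1),
}


def backfill_targets(parser_name, start, end):
    """Return the target_datetime values needed to cover [start, end)."""
    step = PARSER_TIMESPANS[parser_name]
    targets = []
    current = start
    while current < end:
        targets.append(current)
        current += step
    return targets


week_start = datetime.datetime(2018, 1, 1)
week_end = week_start + datetime.timedelta(days=7)
print(len(backfill_targets("daily_parser", week_start, week_end)))         # 7
print(len(backfill_targets("single_point_parser", week_start, week_end)))  # 168
```

The point is only that the launcher, not the parser, decides how many calls are needed.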
If the parser cannot fetch data for the required datetime (whatever the reason), it should return None.
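To illustrate the parser side of this (return everything a single query yields, or None when nothing can be fetched), here is a toy sketch. FAKE_SOURCE and fetch_values are invented for the example; a real parser would query its data source instead, and its signature may differ.

```python
import datetime

# Stand-in for a remote source that holds hourly values for one day (hypothetical).
FAKE_SOURCE = {
    datetime.datetime(2018, 3, 1, hour): 100 + hour for hour in range(24)
}


def fetch_values(target_datetime):
    """Return every value one "query" yields for the target day, or None."""
    day = target_datetime.date()
    points = [
        {"datetime": moment, "value": value}
        for moment, value in sorted(FAKE_SOURCE.items())
        if moment.date() == day
    ]
    # An empty result means the source has nothing for that day.
    return points or None


print(len(fetch_values(datetime.datetime(2018, 3, 1, 15, 0))))  # 24
print(fetch_values(datetime.datetime(2017, 1, 1)))              # None
```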
So, regarding your questions more specifically:
[...] target_datetime (whatever the time difference between the two, launch_parser will throw away values that are too far from the requested datetime). If two datapoints are equally close, return either or both. When returning the datapoint, the datetime value should correspond to the datapoint's datetime, not the requested datetime.
[...] target_datetime if you can only get data for the last 24 hours. In that case, always returning data for the last 24 hours, whatever the (non-None) target_datetime, is perfectly valid (launch_parser will throw away the values too far from the target_datetime).
This may not be super clear; don't hesitate to ask if something's unclear or if you feel we can do something easier / better.
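A minimal sketch of the "throw away values too far from the requested datetime" step, assuming datapoints are dicts with a "datetime" key. The function name drop_far_values and the two-hour tolerance are made up for illustration; launch_parser may end up doing this differently.

```python
import datetime


def drop_far_values(datapoints, target_datetime,
                    tolerance=datetime.timedelta(hours=2)):
    """Keep only datapoints whose own datetime is within `tolerance` of the target."""
    if datapoints is None:  # the parser could not fetch anything
        return []
    return [
        point for point in datapoints
        if abs(point["datetime"] - target_datetime) <= tolerance
    ]


target = datetime.datetime(2018, 3, 1, 12, 0)
returned = [
    {"datetime": datetime.datetime(2018, 3, 1, 11, 30), "value": 10},
    {"datetime": datetime.datetime(2018, 2, 28, 12, 0), "value": 7},
]
kept = drop_far_values(returned, target)
print(len(kept))  # 1
```

Note that the kept datapoint still carries its own datetime (11:30), not the requested one.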
@corradio I believe it's what we talked about, don't hesitate to react if it's not or if something wasn't clear
I think that pretty much sums it up. Thanks @maxbellec ! I might add that in this iteration we're trying to stay as agile as possible and so we're optimising for simplicity rather than future scalability. With that in mind, we might want to add more information to parsers themselves in the future to optimise things further - but for now, we're keeping it simple.
Okay, thanks!
To try to summarize:
[...] target_datetime, with the guideline being the amount of data returned in one HTTP request by the source API
I think that makes sense - certainly it does for now.
We talked about it again with @corradio. @jarek I'll steal your summary and add:
target_datetime means the datetime of the latest data the parser will return. So if the parser returns 24 hours of data, it should return data from 24 hours before target_datetime until target_datetime. The idea is that live data can be handled by simply passing target_datetime=datetime.datetime.now()
[...] target_datetime, with the guideline being the amount of data returned in one HTTP request by the source API
I'll adapt example.py accordingly.
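Put as a sketch, the window a parser is expected to cover under this definition would be computed like this. expected_window is a hypothetical helper, not something in example.py, and the 24-hour default is just the case discussed above.

```python
import datetime


def expected_window(target_datetime=None,
                    timespan=datetime.timedelta(hours=24)):
    """Return (start, end) of the data a parser covering `timespan` should return."""
    if target_datetime is None:
        # Live data: treat "now" as the target.
        target_datetime = datetime.datetime.now()
    return target_datetime - timespan, target_datetime


start, end = expected_window(datetime.datetime(2018, 3, 2, 0, 0))
print(start, end)  # 2018-03-01 00:00:00 2018-03-02 00:00:00
```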
This looks fine now after #1237, I'll close it. Thanks!
Hi,
I was thinking about implementing target_datetime in some of the parsers, and came up with the following questions. If you let me know what you prefer, I can update comments in example parser and README to match :+1:
[...] target_datetime doesn't exactly match available data?
[...] closest_in_time_key logic that would do the former - should that be the guideline?
[...] target_datetime parameter. Is this required, or is it fine to return a list normally but a dict when called with target_datetime?