Closed: tlestang closed this issue 1 year ago
If we want, we could cache at the HTTP layer where we call the API. This seems quite easy (see #30).
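For illustration only, HTTP-layer caching could look like the following, using the third-party `requests-cache` package. Whether cats queries the API through `requests`, the cache name, and the endpoint shown are all assumptions here, not taken from the codebase:

```python
# Illustrative sketch: transparent HTTP-layer caching with requests-cache
# (pip install requests-cache). Assumes the API is queried via requests;
# the cache name and endpoint below are assumptions, not cats's actual code.
import requests_cache

# Cached responses expire after 30 minutes, matching the half-hourly
# resolution of the carbonintensity.org.uk data.
session = requests_cache.CachedSession("cats_http_cache", expire_after=30 * 60)

# Repeated calls within the same half hour hit the local cache, not the API.
response = session.get("https://api.carbonintensity.org.uk/intensity")
print(response.json())
```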
Yes - I actually like your approach a lot better. It maybe doesn't allow for as much data reuse, but it's much simpler. And I guess it covers most of the caching use case (i.e. several requests in the same first or second half of the hour).
Merged the requests caching stuff. I think this is resolved.
Currently a new request to carbonintensity.org.uk is made each time `cats` is run, in `cats/__init__.py`. Although the carbon intensity data obtained from the API is written to disk, this is not taken advantage of: if the relevant carbon intensity data is already on disk, we'd like to reuse it rather than make a new request each time.
The local carbon intensity forecast data is reusable if the last forecast datetime is beyond the expected finish datetime of the application, i.e. `forecast_end > now() + runtime`.
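Spelled out with concrete types, that condition might look like the check below. The names `forecast_end` and `runtime` come from the condition above; the function name is invented for illustration:

```python
from datetime import datetime, timedelta


def forecast_covers_run(forecast_end: datetime, runtime: timedelta) -> bool:
    """True if the cached forecast extends past the job's expected finish.

    `forecast_end` is the last datetime in the local forecast data and
    `runtime` the expected duration of the application.
    """
    return forecast_end > datetime.now() + runtime
```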
A possible approach is to reshuffle the responsibilities of the two top-level functions `api_query.get_tuple` and `parsedata.writecsv`.
`get_tuple` could be responsible for ensuring that the right data is present on disk, downloading it if not. `writecsv` would only care about computing the best job start time, assuming correct intensity data is available. For instance, the two functions then become something like the sketch below.
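A hedged sketch of that split, reusing `forecast_covers_run` from the check above. `get_tuple` and `writecsv` are the function names from this issue; the cache path, the CSV layout, and the download helper are assumptions made for illustration:

```python
import csv
from datetime import datetime, timedelta
from pathlib import Path

CACHE_FILE = Path("carbon_intensity.csv")  # assumed cache location


def _forecast_end(path: Path) -> datetime:
    """Last forecast datetime in the cached CSV (assuming ISO dates in column 0)."""
    with path.open() as f:
        last_row = list(csv.reader(f))[-1]
    return datetime.fromisoformat(last_row[0])


def _download_intensity_data(path: Path) -> None:
    """Hypothetical stand-in for the API query and the CSV writing."""
    raise NotImplementedError


def get_tuple(runtime: timedelta) -> None:
    """Ensure the right intensity data is on disk, downloading it if not."""
    if CACHE_FILE.exists() and forecast_covers_run(_forecast_end(CACHE_FILE), runtime):
        return  # the local forecast covers the whole run: reuse it
    _download_intensity_data(CACHE_FILE)


def writecsv(runtime: timedelta) -> datetime:
    """Compute the best job start time, assuming correct data is on disk.

    Reads CACHE_FILE only: no API calls, no writes.
    """
    ...
```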
This approach has the benefit of maintaining a good separation between talking to the API (and caching intensity data) and the calculation of the start time. We currently almost have this, except that the function returning the start time is also responsible for writing the intensity data to disk.
Another possible approach is to push the API query and data caching down to the current `writecsv` function:
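Reusing the hypothetical helpers from the sketch above, that alternative might look like this; again an illustration, not cats's actual code:

```python
def writecsv(runtime: timedelta) -> datetime:
    """Query and cache the intensity data itself, then compute the start time.

    Reuses CACHE_FILE, _forecast_end, _download_intensity_data and
    forecast_covers_run from the sketches above.
    """
    if not (CACHE_FILE.exists()
            and forecast_covers_run(_forecast_end(CACHE_FILE), runtime)):
        _download_intensity_data(CACHE_FILE)
    ...  # compute and return the best start time from CACHE_FILE
```

The trade-off is that `writecsv` then mixes API calls and disk I/O with the start-time calculation, which is exactly the separation the first approach preserves.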