GreenScheduler / cats

CATS: the Climate-Aware Task Scheduler :cat2: :tiger2: :leopard:
https://greenscheduler.github.io/cats/
MIT License

Caching of the HTTP call #30

Closed andreww closed 1 year ago

andreww commented 1 year ago

This is a simple way to cache the API request that would be usable on a multi-user system. There are two parts. First, we make the request for the 'top' of the half hour (so hh:00:00 or hh:30:00) by rounding the current time down before calling the API. Second, we make the call through requests_cache rather than requests. Because the time (and postcode) is part of the request, every call within the same half hour produces an identical request, so the cached response is reused and the HTTP call is avoided.
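A minimal sketch of the two parts, for illustration only. The endpoint URL and the names `floor_to_half_hour`, `forecast_url` and `fetch_forecast` are assumptions made up for this example and are not taken from the PR; only `requests_cache.CachedSession` is a real library call.

```python
from datetime import datetime


def floor_to_half_hour(now: datetime) -> datetime:
    """Round a time down to hh:00:00 or hh:30:00."""
    return now.replace(minute=30 * (now.minute // 30), second=0, microsecond=0)


def forecast_url(now: datetime, postcode: str) -> str:
    """Build a URL that is identical for every call in the same half hour."""
    # Hypothetical endpoint, used here only to show the idea.
    start = floor_to_half_hour(now).strftime("%Y-%m-%dT%H:%MZ")
    return f"https://api.example.org/forecast/{start}/postcode/{postcode}"


def fetch_forecast(now: datetime, postcode: str):
    """Fetch the forecast through a cached session instead of plain requests."""
    # Imported lazily so the helpers above work without the third-party package.
    import requests_cache

    # With the default SQLite backend this stores responses in cats_cache.sqlite
    # in the current working directory.
    session = requests_cache.CachedSession("cats_cache")
    return session.get(forecast_url(now, postcode))
```

Since the URL is the cache key, two calls at 17:16 and 17:29 map to the same entry, while a call at 17:30 produces a new URL and therefore a new HTTP request.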

Currently the results are just stored in the file cats_cache.sqlite in the current working directory, but we could put this in a hidden file in the user's home directory, in a global temporary directory, or somewhere else.

At this stage, think of this as a proof of concept. There are bits that still need thinking about (where to put the cache, cleaning up the cache, installing the right modules, etc.).

abhidg commented 1 year ago

Thanks @andreww, this seems like a good approach. We could put the cache in a temporary folder perhaps, then it can get cleared by the OS. If there is a way to expire the cache after a set time within requests_cache, that could be used.
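Both suggestions could look roughly like the sketch below. requests_cache does accept an `expire_after` argument on `CachedSession`; the helper names `cache_path` and `make_session`, the cache name and the 30-minute default are assumptions chosen for this example, not decisions made in this thread.

```python
import tempfile
from datetime import timedelta
from pathlib import Path


def cache_path(name: str = "cats_cache") -> Path:
    """Place the cache under the system temporary directory."""
    return Path(tempfile.gettempdir()) / name


def make_session(expire_after: timedelta = timedelta(minutes=30)):
    """Create a cached session whose entries expire after a set time."""
    # Imported lazily so cache_path() works without the third-party package.
    import requests_cache

    # expire_after accepts a timedelta (or a number of seconds); expired
    # entries are refetched on the next request instead of being reused.
    return requests_cache.CachedSession(str(cache_path()), expire_after=expire_after)
```

On a multi-user system a single shared file in the temporary directory would probably need suitable permissions or a per-user name, which this sketch does not handle.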

tlestang commented 1 year ago

Am I right in understanding that this caches the forecast data for a given 30min time window? For instance, if I make a first request at a time $t_0$, then any future request between $t_0$ and $t_0 + 30min$ will be served from the cache. But any other request at $t_1 > t_0 + 30min$ will trigger a new API request.

This makes perfect sense to me and is, I think, actually simpler than what I originally had in mind as described in #25.

andreww commented 1 year ago

> Am I right in understanding that this caches the forecast data for a given 30min time window? For instance, if I make a first request at a time $t_0$, then any future request between $t_0$ and $t_0 + 30min$ will be served from the cache. But any other request at $t_1 > t_0 + 30min$ will trigger a new API request.

Yes, that's the idea, except that the window goes back to the start of the half-hour period containing the first call rather than starting at the call itself. So if the first call is at 17:16, requests will hit the cache until 17:30. This is because it turns out that the API always starts with data from the previous time period.
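The 17:16 example can be checked with a small sketch. It assumes the first call has populated the cache and that no expiry is configured; `cache_window` and `hits_cache` are hypothetical helpers, not part of cats or requests_cache.

```python
from datetime import datetime, timedelta

HALF_HOUR = timedelta(minutes=30)


def cache_window(first_call: datetime) -> tuple[datetime, datetime]:
    """Return the [start, end) half-hour window that contains first_call."""
    start = first_call.replace(
        minute=30 * (first_call.minute // 30), second=0, microsecond=0
    )
    return start, start + HALF_HOUR


def hits_cache(first_call: datetime, later_call: datetime) -> bool:
    """True if later_call rounds down to the same half hour as first_call."""
    start, end = cache_window(first_call)
    return start <= later_call < end


first = datetime(2023, 5, 1, 17, 16)
print(cache_window(first))                             # window is 17:00 to 17:30
print(hits_cache(first, datetime(2023, 5, 1, 17, 29)))  # same window
print(hits_cache(first, datetime(2023, 5, 1, 17, 30)))  # next window, new request
```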