cmu-delphi / delphi-epidata

An open API for epidemiological data.
https://cmu-delphi.github.io/delphi-epidata/
MIT License
Permit caching of COVIDcast signals for a few hours #159

Open krivard opened 4 years ago

krivard commented 4 years ago

None of the signals update more than once a day, so we could get a substantial performance boost in the map if we allowed caches to stay good for a few hours.

sgratzl commented 4 years ago

This could be done in .htaccess or by setting the header in the PHP file.

e.g. https://www.askapache.com/hacking/speed-site-caching-cache-control/

capnrefsmmat commented 4 years ago

Just added that to #171. It should also improve responsiveness when switching back and forth between signals on the map.

capnrefsmmat commented 4 years ago

I wasn't able to enable caching yet; I think ExpiresByType requires AllowOverride to be set in the Apache config. Someone needs to look into what settings are required and test them out on staging.

sgratzl commented 3 years ago

What is the status here? If you don't have access to the web server, setting the value via PHP would be an option, as in

header('Cache-Control: public, max-age=86400');

or so

capnrefsmmat commented 3 years ago

I think that's feasible. To do it through Apache would require coordinating some configuration changes with Brian, testing those on staging, and so on, but header() is much easier.

I guess the question is how long we'd like to cache responses. Probably no more than a few hours, since signals update daily and someone who comes just before the update shouldn't have to wait 24 hours to get it?

sgratzl commented 3 years ago

You roughly know when new data goes in each day, so you could just set the Expires header to that time: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Expires

capnrefsmmat commented 3 years ago

Each signal pipeline delivers data on a different schedule, though, so we'd have to build that into the code -- some kind of configuration file specifying expected delivery times for each pipeline. And then we'd have to think about what happens when a pipeline is late and how the headers should work.
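To make the schedule idea concrete, here is a minimal sketch of computing an Expires value from a per-pipeline table. The delivery times, the `other-source` name, the one-hour fallback, and the function names are all assumptions for illustration; none of them come from the real pipelines.

```python
from datetime import datetime, time, timedelta, timezone
from email.utils import format_datetime

# Hypothetical expected delivery times (UTC) per data source.
# The real pipeline schedules are not stated in this thread.
EXPECTED_DELIVERY_UTC = {
    "jhu-csse": time(8, 0),
    "other-source": time(14, 30),
}
# Assumed fallback for sources with no known schedule.
FALLBACK_TTL = timedelta(hours=1)


def next_expiry(source, now):
    """Return the UTC datetime at which a cached response for `source` expires."""
    delivery = EXPECTED_DELIVERY_UTC.get(source)
    if delivery is None:
        return now + FALLBACK_TTL
    candidate = datetime.combine(now.date(), delivery, tzinfo=timezone.utc)
    if candidate <= now:
        # Today's delivery time has passed; expire at tomorrow's delivery.
        candidate += timedelta(days=1)
    return candidate


def expires_header(source, now=None):
    """Format the expiry as an HTTP date for use in an Expires header."""
    now = now or datetime.now(timezone.utc)
    return format_datetime(next_expiry(source, now), usegmt=True)
```

This sketch does not handle a late pipeline: a response cached just after the expected delivery time would stay cached until the next day even if the new data had not landed yet.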

I think a simple first pass would just use a default short expiry, and we can go from there.

krivard commented 3 years ago

...yeah I think we just set the expiry to the max expected visit length. I don't think we expect someone to be continuously browsing the map for more than an hour. It's okay if the first request of their next visit takes a touch longer to load.
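A rough sketch of what a fixed short expiry could look like as a header value. The one-hour default follows the visit-length estimate above; the function name and the `no-store` behaviour for a non-positive lifetime are assumptions, not anything already in the codebase.

```python
from datetime import timedelta

# Assumption: one hour, matching the "max expected visit length" estimate.
DEFAULT_CACHE_TTL = timedelta(hours=1)


def cache_control_value(ttl=DEFAULT_CACHE_TTL, public=True):
    """Build a Cache-Control header value such as 'public, max-age=3600'."""
    seconds = int(ttl.total_seconds())
    if seconds <= 0:
        # A non-positive lifetime is treated as "do not cache" in this sketch.
        return "no-store"
    scope = "public" if public else "private"
    return f"{scope}, max-age={seconds}"
```

With the default this yields `public, max-age=3600`; passing `timedelta(hours=24)` reproduces the `max-age=86400` value suggested earlier in the thread.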

krivard commented 1 year ago

There's probably a way to do this in Flask, but it looks like we don't yet? Here's what I get:

$ curl -sLI "https://api.covidcast.cmu.edu/epidata/covidcast/?signal=jhu-csse:confirmed-incidence-num&geo_type=nation&geo_value=us&time_type=day&time_value=20230101"
HTTP/2 200 
date: Thu, 02 Feb 2023 18:26:48 GMT
content-type: application/json
content-length: 92
set-cookie: AWSALBTG=qacugPKQMWrsVqjUA8+5ECCJYZAGov2eBXdEFfLrsS9tVe74n2H2gu0UVvp6MX9YTUWz+707UXh6v4txX4efQ5yh/OvgOTaq51vKy0QHVoY+7qoVjk2BhVXpsdJaDody+4ay5bZgxS/L+U98Iha0RtGISny+LDZhpKMObub+2TnVTu+B8H8=; Expires=Thu, 09 Feb 2023 18:26:47 GMT; Path=/
set-cookie: AWSALBTGCORS=qacugPKQMWrsVqjUA8+5ECCJYZAGov2eBXdEFfLrsS9tVe74n2H2gu0UVvp6MX9YTUWz+707UXh6v4txX4efQ5yh/OvgOTaq51vKy0QHVoY+7qoVjk2BhVXpsdJaDody+4ay5bZgxS/L+U98Iha0RtGISny+LDZhpKMObub+2TnVTu+B8H8=; Expires=Thu, 09 Feb 2023 18:26:47 GMT; Path=/; SameSite=None; Secure
server: nginx/1.22.1
vary: Accept-Encoding
access-control-allow-origin: *
access-control-allow-methods: GET, POST, OPTIONS
access-control-allow-headers: DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range
access-control-expose-headers: Content-Length,Content-Range

i.e., we say it's okay for a request to include a Cache-Control header, but we don't send one in the response.
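One possible way to add the missing response header, sketched as plain WSGI middleware so it does not depend on how the Flask app is structured. The class name, the one-hour default, and the choice to touch only 200 responses to GET/HEAD requests are assumptions for illustration, not the project's actual implementation.

```python
class CacheControlMiddleware:
    """WSGI middleware that adds Cache-Control to 200 GET/HEAD responses lacking one."""

    def __init__(self, app, max_age=3600):
        self.app = app
        self.value = f"public, max-age={max_age}"

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            already_set = any(name.lower() == "cache-control" for name, _ in headers)
            cacheable = (
                status.startswith("200")
                and environ.get("REQUEST_METHOD") in ("GET", "HEAD")
            )
            if cacheable and not already_set:
                # Leave any header the endpoint set itself untouched.
                headers = list(headers) + [("Cache-Control", self.value)]
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)


# Usage with a Flask application object (here called `app`, hypothetical):
#     app.wsgi_app = CacheControlMiddleware(app.wsgi_app, max_age=3600)
```

After deploying something like this, rerunning the curl command above should show a `cache-control` line in the response headers.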

melange396 commented 1 year ago

related: caching headers for metadata

melange396 commented 12 months ago

related: discussion of caching options and pros/cons thereof