Open jlebon opened 5 years ago
I think what we want is the ability to invalidate the CloudFront cache as part of our release process?
@puiterwijk, does that sound reasonable? If so, I can submit a releng request to get the creds to do this.
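For illustration, the invalidation step could look roughly like the sketch below. This is not the actual release tooling. The helper names, the `release-<timestamp>` caller reference, and the example paths are assumptions. The only real API used is boto3's CloudFront `create_invalidation` call, imported lazily so the payload builder can be exercised without AWS credentials.

```python
import time


def build_invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch payload expected by CloudFront.

    Hypothetical helper: normalizes paths to start with "/" and
    de-duplicates them before counting.
    """
    normalized = sorted({p if p.startswith("/") else "/" + p for p in paths})
    if not normalized:
        raise ValueError("at least one path is required")
    if caller_reference is None:
        # CloudFront requires a unique reference per invalidation request;
        # a timestamp-based value is an assumption made for this sketch.
        caller_reference = f"release-{int(time.time())}"
    return {
        "Paths": {"Quantity": len(normalized), "Items": normalized},
        "CallerReference": caller_reference,
    }


def invalidate(distribution_id, paths):
    """Submit the invalidation; needs boto3 and suitable credentials."""
    import boto3  # imported here so the builder above has no AWS dependency

    client = boto3.client("cloudfront")
    return client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )
```

A release job would call something like `invalidate(dist_id, ["streams/testing.json"])` right after uploading the new metadata, where `dist_id` is whatever distribution ID fronts the bucket.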
I think it's the more normal pattern to have any "mutable" objects correctly set their caching headers, and have the mutable bits (usually metadata) refer to immutable URLs that can be cached forever.
That's what I did for the RHCOS pipeline anyways.
Link to RHCOS pipeline code: https://url.corp.redhat.com/1459bbd
Right, I'm specifically talking about the mutable bits here. We can definitely just settle on some small interval greater than 0, but it would be nice if we could do even better. IIRC, Flathub does something similar for its summary file? (@ramcq, does that sound correct?)
Hmm, this actually also intersects with Cincinnati. We were discussing having rollouts controlled through files in the bucket, so starting or pausing a rollout would require editing a file. If we want Cincinnati to pick up those changes quickly, then we'll have to either use e.g. max-age=0, point it at the bucket directly, or explicitly invalidate the cache.
> I think it's the more normal pattern to have any "mutable" objects correctly set their caching headers, and have the mutable bits (usually metadata) refer to immutable URLs that can be cached forever.
I don't want to lose this bit though. I think we could make buildupload set better default cache headers when uploading, as appropriate for each file. Will file something for this.
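As a rough sketch of what "better default cache headers per file" could mean: pick a `Cache-Control` value from the object key, with a short lifetime for metadata that is updated in place and a long one for everything else. The patterns, the 60-second and one-year lifetimes, and the function name are all assumptions for illustration, not what buildupload actually does.

```python
import re

# Assumed values: a small non-zero interval for mutable metadata and a
# one-year lifetime for objects whose URLs never change content.
MUTABLE = "max-age=60"
IMMUTABLE = "max-age=31536000, immutable"

# Hypothetical patterns for metadata that is rewritten in place.
MUTABLE_PATTERNS = [
    re.compile(r"(^|/)streams/[^/]+\.json$"),
    re.compile(r"(^|/)releases\.json$"),
]


def cache_control_for(key):
    """Return the Cache-Control header value to upload `key` with."""
    if any(pattern.search(key) for pattern in MUTABLE_PATTERNS):
        return MUTABLE
    return IMMUTABLE
```

Defaulting to the long lifetime is only safe if every object outside the listed patterns really is immutable; a more cautious variant would default to the short lifetime and list the immutable patterns instead.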
OK, opened https://github.com/coreos/coreos-assembler/pull/680 and https://github.com/coreos/mantle/pull/1038, so at least we'll have more sensible caching for now.
Sorry for delay; just to provide the context here. We set TTLs as follows - https://github.com/flathub/ansible-playbook/blob/master/roles/repo-manager/templates/nginx-default.d-repo.conf.j2. For the mutable parts of the repo - the summary and its signature, and the refs, we set a shorter timeout with a longer "stale if error" retention period in case of temporary origin wobbles.
We used to have a very short TTL on the summary file until we realised that about 20-30% of the origin traffic was refreshing the summary on every edge node every minute. So we moved to a 1hr TTL and an explicit cache PURGE when the summary was updated. https://github.com/flathub/ansible-playbook/blob/master/roles/repo-manager/templates/post-publish.sh.j2#L21-L23 (very high-tech this bit)
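The two pieces described above (a longer TTL with a "stale if error" window, plus an explicit purge on update) can be sketched as follows. The numbers and function names are assumptions, not Flathub's actual configuration, and whether a CDN accepts an HTTP `PURGE` request at all depends on the provider and how it is configured. The purge helper only builds the request; sending it is left to the caller.

```python
import urllib.request


def mutable_cache_control(max_age, stale_if_error):
    """Cache-Control for a mutable file: a short freshness lifetime plus a
    longer window in which a stale copy may be served if the origin errors
    (the stale-if-error extension from RFC 5861)."""
    return f"max-age={max_age}, stale-if-error={stale_if_error}"


def build_purge_request(url):
    """Build (but do not send) an HTTP PURGE request for `url`."""
    return urllib.request.Request(url, method="PURGE")


# Example values only: 1 hour fresh, 1 day stale-if-error.
header = mutable_cache_control(3600, 86400)
request = build_purge_request("https://example.invalid/repo/summary")
```

After publishing a new summary, a post-publish hook would pass `request` to `urllib.request.urlopen` (or the equivalent in whatever HTTP client is in use) so edge nodes refetch immediately instead of waiting out the hour.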
@jlebon - is there outstanding work to do here?
Metadata files like streams/testing.json and releases.json are at stable URLs and updated in place during releases. https://builds.coreos.fedoraproject.org/ is handled by CloudFront, so we have to think about caching.
We've been working around this for stream metadata by just using --cache-control max-age=0 when uploading, but that's clearly not ideal. We do want caching, just smarter...
I think what we want is the ability to invalidate the CloudFront cache as part of our release process? Something like: