Open developer-guy opened 2 years ago
cc @jonjohnsonjr
kindly ping @jonjohnsonjr
I think we can store this information in the annotations of the image manifest (which might be verbose) or in the labels of the image config, as BuildKit does, and read that information here instead of from disk.
AFAIK, ko keeps a mapping from buildIDs to diffIDs and from diffIDs to descriptors, so we can use the diffID as the key and the descriptor JSON as the value in the annotations or labels.
WDYT?
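To make the diffID-to-descriptor idea above concrete, here is a minimal Go sketch using only the standard library. The `Descriptor` type is a pared-down stand-in for an OCI descriptor, and the `example.kocache.descriptor/` annotation prefix and both function names are made up for illustration; none of this is ko's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Descriptor is a simplified stand-in for an OCI content descriptor.
type Descriptor struct {
	MediaType   string            `json:"mediaType"`
	Digest      string            `json:"digest"`
	Size        int64             `json:"size"`
	Annotations map[string]string `json:"annotations,omitempty"`
}

// annotationPrefix is a hypothetical namespace for cache entries.
const annotationPrefix = "example.kocache.descriptor/"

// encodeDescriptors turns a diffID -> descriptor mapping into annotations
// (or labels): the key is the prefixed diffID, the value is descriptor JSON.
func encodeDescriptors(byDiffID map[string]Descriptor) (map[string]string, error) {
	out := make(map[string]string, len(byDiffID))
	for diffID, desc := range byDiffID {
		raw, err := json.Marshal(desc)
		if err != nil {
			return nil, fmt.Errorf("marshal descriptor for %s: %w", diffID, err)
		}
		out[annotationPrefix+diffID] = string(raw)
	}
	return out, nil
}

// lookupDescriptor reads a descriptor back out of the annotations.
// The boolean reports whether an entry for diffID was present.
func lookupDescriptor(annotations map[string]string, diffID string) (Descriptor, bool, error) {
	raw, ok := annotations[annotationPrefix+diffID]
	if !ok {
		return Descriptor{}, false, nil
	}
	var desc Descriptor
	if err := json.Unmarshal([]byte(raw), &desc); err != nil {
		return Descriptor{}, false, fmt.Errorf("unmarshal descriptor for %s: %w", diffID, err)
	}
	return desc, true, nil
}
```

The same string map could be written to either manifest annotations or config labels, since both are plain string-to-string maps.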
Hey sorry, I was on leave for a while 😅
Yes, this is definitely something we could do! I jotted down some notes a while back, let me just dump them here:
KOCACHE_REPO=gcr.io/jonjohnson-test/kocache
Tag = BuildId
Repo acts as map[BuildId]CacheManifest
CacheManifest {
  InlinedConfig
  Layer[0] = OriginalDescriptorBlob
  Layer[1..n] = ForeignDesc # Optional...
  Annotations
}
InlinedConfig gives access to diffids.
Layer[0] preserves the descriptor of the layer we built so it keeps the media type, annotations, etc. (Need some way to invalidate that if preserveMediaType is flipped? Annotation that is a checksum?) Inlined for speed via data field.
Layer[1..n] track where we've pushed this layer before, maybe we keep it in the kocacherepo as well. Annotations for staleness? Use foreign layer to reference other repos/registries in urls.
Annotations contain ko and go version metadata for determining staleness.
Cache repo can be used to store all built images as well so we can easily mount into target registry if possible but just copy if not.
New `ko cache` command for managing this repo (GC, sync, etc) and the dir. Directory is local so much faster to try first and we can cache through to it if both are set. May need a way to shard cacherepo for large number of tags?
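The "try the local directory first, cache through to it" behavior could look roughly like the following. The `Cache` interface, `memCache`, and `layeredCache` are hypothetical names for this sketch; a real version would back `local` with the KOCACHE directory and `remote` with the cache repo, while the in-memory type here only stands in for both.

```go
package main

// Cache is a hypothetical lookup interface keyed by build ID.
type Cache interface {
	Get(buildID string) (entry []byte, ok bool, err error)
	Put(buildID string, entry []byte) error
}

// memCache is an in-memory stand-in for either the directory or the repo.
type memCache struct{ entries map[string][]byte }

func newMemCache() *memCache { return &memCache{entries: map[string][]byte{}} }

func (c *memCache) Get(buildID string) ([]byte, bool, error) {
	entry, ok := c.entries[buildID]
	return entry, ok, nil
}

func (c *memCache) Put(buildID string, entry []byte) error {
	c.entries[buildID] = entry
	return nil
}

// layeredCache consults the fast local cache first and falls back to the
// remote one, copying remote hits into the local cache ("cache through").
type layeredCache struct {
	local  Cache
	remote Cache
}

func (c layeredCache) Get(buildID string) ([]byte, bool, error) {
	entry, ok, err := c.local.Get(buildID)
	if err != nil {
		return nil, false, err
	}
	if ok {
		return entry, true, nil
	}
	entry, ok, err = c.remote.Get(buildID)
	if err != nil || !ok {
		return nil, false, err
	}
	// Populate the local cache so the next lookup skips the network.
	if err := c.local.Put(buildID, entry); err != nil {
		return nil, false, err
	}
	return entry, true, nil
}

// Put writes to both caches so a fresh VM can later hit the remote one.
func (c layeredCache) Put(buildID string, entry []byte) error {
	if err := c.local.Put(buildID, entry); err != nil {
		return err
	}
	return c.remote.Put(buildID, entry)
}
```

This version assumes both caches are set; handling "only one of them is configured" would need an extra nil check on each side.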
I think what you're describing is very similar to what I had in mind here?
AFAIK, KOCACHE only accepts a directory as its value, so the cache can't be reused across stateless builds. For example, each GitHub Actions workflow run executes in a fresh VM.
It'd be nice if we could use OCI registries to store the cache for ko builds.