ko-build / ko

Build and deploy Go applications
https://ko.build
Apache License 2.0

Question: can we extract KOCACHE instead of saving it to disk, to be able to use it between stateless builds? #809

Open developer-guy opened 2 years ago

developer-guy commented 2 years ago

AFAIK, KOCACHE only accepts a directory as its value, so we can't reuse this cache between stateless builds. For example, each workflow run in GitHub Actions executes in a fresh VM.

It'd be nice if we could use OCI registries to save the cache for ko builds.

imjasonh commented 2 years ago

cc @jonjohnsonjr

developer-guy commented 2 years ago

kindly ping @jonjohnsonjr

developer-guy commented 2 years ago

I think we can store this information in the annotations of the image manifest (which might be verbose) or in the labels of the image config, as BuildKit does, and read that information over here instead of from disk.

AFAIK, ko keeps a mapping from buildIDs to diffIDs and from diffIDs to descriptors, so we could use the diffID as the key and the descriptor JSON as the value in the annotations or labels section.
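To make the diffID-to-descriptor idea concrete, here is a minimal sketch of the round trip through an annotations map. It is not ko's code: the `Descriptor` struct is a trimmed-down stand-in for an OCI descriptor, and the `dev.ko.cache/` prefix, `EncodeCache`, and `DecodeCache` are hypothetical names invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Descriptor is a trimmed-down stand-in for an OCI descriptor (assumption:
// a real implementation would use the type from an OCI library).
type Descriptor struct {
	MediaType string `json:"mediaType"`
	Digest    string `json:"digest"`
	Size      int64  `json:"size"`
}

// annotationPrefix is a hypothetical namespace for cache entries, so they
// can be told apart from unrelated annotations on the same manifest.
const annotationPrefix = "dev.ko.cache/"

// EncodeCache stores each diffID -> descriptor pair as one annotation whose
// key is the prefixed diffID and whose value is the descriptor JSON.
func EncodeCache(cache map[string]Descriptor) (map[string]string, error) {
	out := make(map[string]string, len(cache))
	for diffID, desc := range cache {
		b, err := json.Marshal(desc)
		if err != nil {
			return nil, err
		}
		out[annotationPrefix+diffID] = string(b)
	}
	return out, nil
}

// DecodeCache reverses EncodeCache and skips annotations without the prefix.
func DecodeCache(annotations map[string]string) (map[string]Descriptor, error) {
	out := map[string]Descriptor{}
	for k, v := range annotations {
		if !strings.HasPrefix(k, annotationPrefix) {
			continue
		}
		var d Descriptor
		if err := json.Unmarshal([]byte(v), &d); err != nil {
			return nil, fmt.Errorf("annotation %q: %w", k, err)
		}
		out[strings.TrimPrefix(k, annotationPrefix)] = d
	}
	return out, nil
}

func main() {
	ann, err := EncodeCache(map[string]Descriptor{
		"sha256:1111": {
			MediaType: "application/vnd.oci.image.layer.v1.tar+gzip",
			Digest:    "sha256:2222",
			Size:      1024,
		},
	})
	if err != nil {
		panic(err)
	}
	back, err := DecodeCache(ann)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(ann), back["sha256:1111"].Digest)
}
```

The same map could be written to the image config labels instead; only the place it is attached would change.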

WDYT?

jonjohnsonjr commented 2 years ago

Hey sorry, I was on leave for a while 😅

Yes, this is definitely something we could do! I jotted down some notes a while back, let me just dump them here:

KOCACHE_REPO=gcr.io/jonjohnson-test/kocache

Tag = BuildId

Repo acts as map[BuildId]CacheManifest

CacheManifest {
  InlinedConfig
  Layer[0] = OriginalDescriptorBlob
  Layer[1..n] = ForeignDesc # Optional...
  Annotations
}

InlinedConfig gives access to diffids.

Layer[0] preserves the descriptor of the layer we built so it keeps the media type, annotations, etc. (Need some way to invalidate that if preserveMediaType is flipped? Annotation that is a checksum?) Inlined for speed via data field.

Layer[1..n] tracks where we've pushed this layer before; maybe we keep it in the kocacherepo as well. Annotations for staleness? Use foreign layers to reference other repos/registries in urls.

Annotations contain ko and go version metadata for determining staleness.
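For illustration only, the notes above could be modelled roughly like this in Go. Everything here is an assumption rather than ko's API: the struct layout, the annotation keys, and the helper names are hypothetical. In particular, hashing the build ID before using it as a tag is my own addition, on the assumption that a raw build ID may contain characters that are not valid in a tag.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Descriptor is a simplified stand-in for an OCI descriptor.
type Descriptor struct {
	MediaType   string
	Digest      string
	Size        int64
	Data        []byte   // inlined content (the "data" field)
	URLs        []string // other locations, as with foreign layers
	Annotations map[string]string
}

// CacheManifest mirrors the shape sketched in the notes (hypothetical type).
type CacheManifest struct {
	Config      Descriptor        // inlined config, gives access to diffIDs
	Layers      []Descriptor      // Layers[0]: original descriptor; Layers[1:]: optional foreign descriptors
	Annotations map[string]string // ko and go version metadata
}

// Hypothetical annotation keys for the staleness metadata.
const (
	koVersionKey = "dev.ko.cache/ko-version"
	goVersionKey = "dev.ko.cache/go-version"
)

// CacheRef maps a build ID to a tag in the cache repo. The build ID is
// hashed so the tag contains only hex characters (an assumption of this
// sketch, not something the notes specify).
func CacheRef(repo, buildID string) string {
	sum := sha256.Sum256([]byte(buildID))
	return repo + ":" + hex.EncodeToString(sum[:])
}

// Stale reports whether the entry was produced by a different ko or Go version.
func (m CacheManifest) Stale(koVersion, goVersion string) bool {
	return m.Annotations[koVersionKey] != koVersion ||
		m.Annotations[goVersionKey] != goVersion
}

func main() {
	m := CacheManifest{
		Annotations: map[string]string{
			koVersionKey: "v0.15.0",
			goVersionKey: "go1.21.0",
		},
	}
	fmt.Println(CacheRef("gcr.io/jonjohnson-test/kocache", "example/build/id"))
	fmt.Println("stale:", m.Stale("v0.15.0", "go1.22.0"))
}
```

A checksum annotation for things like preserveMediaType could be compared in `Stale` the same way.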

Cache repo can be used to store all built images as well so we can easily mount into target registry if possible but just copy if not.

New `ko cache` command for managing this repo (GC, sync, etc.) and the dir. The directory is local, so it's much faster to try first, and we can cache through to it if both are set. May need a way to shard the cache repo for a large number of tags?
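The "try the directory first, cache through if both are set" behaviour might look something like the sketch below. The `Cache` interface, `Tiered` type, and in-memory `memCache` are hypothetical stand-ins; a real version would back `Local` with the KOCACHE directory and `Remote` with the registry repo.

```go
package main

import "fmt"

// Cache is a hypothetical lookup interface keyed by build ID. The value is
// the cached descriptor, kept as a string here for brevity.
type Cache interface {
	Get(buildID string) (string, bool)
	Put(buildID, desc string)
}

// memCache is an in-memory stand-in for either tier.
type memCache map[string]string

func (c memCache) Get(id string) (string, bool) { v, ok := c[id]; return v, ok }
func (c memCache) Put(id, desc string)          { c[id] = desc }

// Tiered checks Local first and falls back to Remote. Either may be nil.
type Tiered struct {
	Local, Remote Cache
}

// Get returns a local hit immediately; on a remote hit it also fills the
// local tier so the next lookup avoids the network.
func (t Tiered) Get(id string) (string, bool) {
	if t.Local != nil {
		if v, ok := t.Local.Get(id); ok {
			return v, true
		}
	}
	if t.Remote != nil {
		if v, ok := t.Remote.Get(id); ok {
			if t.Local != nil {
				t.Local.Put(id, v)
			}
			return v, true
		}
	}
	return "", false
}

// Put writes to every configured tier.
func (t Tiered) Put(id, desc string) {
	if t.Local != nil {
		t.Local.Put(id, desc)
	}
	if t.Remote != nil {
		t.Remote.Put(id, desc)
	}
}

func main() {
	local, remote := memCache{}, memCache{"build-1": "descriptor-json"}
	c := Tiered{Local: local, Remote: remote}

	v, ok := c.Get("build-1") // misses locally, hits the remote tier
	fmt.Println(v, ok)
	_, cached := local.Get("build-1") // the hit was written through
	fmt.Println("now local:", cached)
}
```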

I think what you're describing is very similar to what I had in mind here?

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Keep fresh with the 'lifecycle/frozen' label.