Closed bkneis closed 2 weeks ago
LGTM!
@bkneis approved with comments, feel free to ignore if not applicable
@pascalbreuninger fab, thanks! Just finishing testing with a remote k8s cluster on GKE, then it should be ready to merge. Do I need to update any file to reference the new version of the kubernetes driver? i.e. https://github.com/loft-sh/devpod-provider-kubernetes/pull/55
@bkneis nope, that's done by releasing a new version of the kubernetes provider over in the other repo 👍 We'll need to wait until we've released it though, right?
@pascalbreuninger just finished testing with GKE :partying_face: It was my Autopilot cluster causing issues and killing workspaces with return code 137 (OOM). Using the new GKE cluster, workspaces spin up without problems and use the cache. In the end I didn't need to make any changes to the kubernetes driver, only dockerless, so this PR is good to go IMO
This PR contains the changes to facilitate remote caching via a registry, to speed up build times. It supports docker and kubernetes (via kaniko) and uses the context option `REGISTRY_CACHE` as the registry URL. Further caching options are not exposed for now, in order to keep the behaviour predictable.
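For illustration, this is roughly how a `REGISTRY_CACHE` value maps onto BuildKit's registry cache flags. The function name and the `exportCache` parameter here are hypothetical, not DevPod's actual code; `--cache-from` / `--cache-to` with `type=registry` are the real BuildKit flags, and for the kubernetes driver kaniko's `--cache-repo` flag plays the analogous role:

```go
package main

import "fmt"

// buildCacheArgs sketches how a registry cache option could be translated
// into BuildKit cache flags for the docker driver. Illustrative only.
func buildCacheArgs(registryCache string, exportCache bool) []string {
	if registryCache == "" {
		// no cache registry configured, build as before
		return nil
	}
	args := []string{
		// always try to pull cached layers from the registry
		fmt.Sprintf("--cache-from=type=registry,ref=%s", registryCache),
	}
	if exportCache {
		// only push the cache manifest when explicitly requested
		args = append(args, fmt.Sprintf("--cache-to=type=registry,ref=%s,mode=max", registryCache))
	}
	return args
}

func main() {
	fmt.Println(buildCacheArgs("registry.example.com/acme/devpod-cache", false))
	fmt.Println(buildCacheArgs("registry.example.com/acme/devpod-cache", true))
}
```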
The PR was tested using the following workflow:
Without cache

1. Build the workspace (initial `up`)
2. Add `ghcr.io/devcontainers/features/github-cli:1` as a feature to devcontainer.json (2m45s)
3. Update the build context (`echo "test" > examples/build/app/test`) (2m52s)

With cache

1. Build the workspace (initial `up`)
2. Add `ghcr.io/devcontainers/features/github-cli:1` as a feature to devcontainer.json (1m25s)
3. Update the build context (`echo "test" > examples/build/app/test`) (13s)

As you can see, step 1 is almost the same in both runs, since no cache has been made yet. Step 2, however, is interesting: we saved around 50% of the build time with the cache, but 1m16s was spent uploading the cache back to the registry. This shows that it is important for us to either omit the `--cache-to` parameter for `up` (but not for `build`), or defer pushing the cache manifest to the background at the end of the command, once the workspace / IDE has already launched.

Also note that I am using my local docker / kind cluster and a remote registry; when the registry is closer to the devcontainer I would expect even greater savings in build time due to shorter download times.
Lastly, step 3 simulates a common problematic workflow: a user updates the build context and then `up`s a workspace. Here we see significant savings, since only the last layer needs to be rebuilt, instead of the entire image being rebuilt due to a cache miss.

EDIT: I have now implemented a boolean `ExportCache` to toggle the `--cache-to` parameter, so we now only upload the cache when running `build`, not `up`. This gives `up` even faster workspace start times, as we don't wait for the cache to be uploaded.

Also note that for machine providers I needed to enable the containerd snapshotter flag in the docker daemon; this is done during `initWorkspace`. For non-machine providers like local docker, we expect the user to enable this themselves. When trying to use the remote cache without it, docker prints a WARNING.

NOTE: We need to merge https://github.com/loft-sh/dockerless/pull/27 first and update the dockerless release tag in single.go
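For reference, the "containerd snapshotter" is Docker's containerd image store, and its documented manual setup (what `initWorkspace` would automate on machine providers) is a flag in `/etc/docker/daemon.json`, followed by a daemon restart:

```json
{
  "features": {
    "containerd-snapshotter": true
  }
}
```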
Lastly, I have updated the `CalculatePrebuildHash` function to traverse the parsed Dockerfile and extract any file paths that could affect the build context. These paths are then used as an "include" filter before adding files to the hashed contents. This should cause fewer cache misses and allow developers to reuse similar images.
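The idea can be sketched as follows. This simplified line-based scan stands in for the real parsed-Dockerfile traversal (which handles multi-stage builds, heredocs, JSON-form instructions, etc.); all names here are illustrative, not DevPod's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// contextPaths collects the source paths of COPY/ADD instructions so that
// only files which can affect the image feed into the prebuild hash.
// Simplified sketch: one instruction per line, flags skipped, last field
// treated as the destination.
func contextPaths(dockerfile string) []string {
	var paths []string
	for _, line := range strings.Split(dockerfile, "\n") {
		fields := strings.Fields(strings.TrimSpace(line))
		if len(fields) < 3 {
			continue
		}
		inst := strings.ToUpper(fields[0])
		if inst != "COPY" && inst != "ADD" {
			continue
		}
		var srcs []string
		for _, a := range fields[1:] {
			// drop flags like --chown=... or --from=...
			if strings.HasPrefix(a, "--") {
				continue
			}
			srcs = append(srcs, a)
		}
		if len(srcs) > 1 {
			// everything except the final destination is a source path
			paths = append(paths, srcs[:len(srcs)-1]...)
		}
	}
	return paths
}

func main() {
	df := `FROM golang:1.22
COPY go.mod go.sum /app/
COPY --chown=vscode:vscode app/ /app/
RUN go build ./...`
	fmt.Println(contextPaths(df)) // only these paths would be hashed
}
```

Hashing only these paths (as an include filter over the build context) means edits to unrelated files no longer invalidate the prebuild hash.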