weber-software opened this issue 4 years ago
Tested it again with 74f76cf4e9c5624113ad12523dcc23c8ef5a10af but the problem still occurs.
I'm also encountering this issue. Baking one target at a time seems to prevent any erroneous cache misses (see the invocation sketch after the config below).
At first I thought my issue was related to baking targets that spanned multiple Dockerfiles, and that the multiple contexts were throwing BuildKit for a loop. However, even after consolidating all my bake targets into a single Dockerfile, I can occasionally catch an instance where a stage that was cached for one target is somehow missed for another target, even when both targets share the same stages and context/inputs.
For example, when baking both the validator and tooler targets for the following:
target "runner" {
target = "runner"
}
target "prepper" {
inherits = ["runner"]
target = "prepper"
}
target "validator" {
inherits = ["prepper"]
target = "validator"
tags = ["validator"]
}
target "tooler" {
inherits = ["validator"]
target = "tooler"
tags = ["tooler"]
}
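For reference, a rough sketch of the two invocation patterns being compared here; the bake file name docker-bake.hcl is an assumption on my part:

# Baking both targets in one invocation (in parallel) occasionally misses the shared cache:
docker buildx bake -f docker-bake.hcl validator tooler

# Baking one target at a time consistently reuses it:
docker buildx bake -f docker-bake.hcl validator
docker buildx bake -f docker-bake.hcl tooler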
I'll occasionally see the layer for [tooler runner 4/4] miss the same layer that was cached for [validator runner 4/4]. Note the difference between the CACHED step #28 vs. the missed step #40. In particular, on step #31 the validator target starts to fetch the cache for the layer runner 3/4, which is then followed by (another?) step #31 that starts to fetch the cache for layer runner 3/4 as well. On step #37 we can see the layer is successfully concluded as CACHED, but then later we can see step #31 finalize its extracting. The following step #40 then subsequently misses its cache, unlike step #28, which concluded runner 4/4 as CACHED.
I realize some of the stdout can be out of sync due to line buffering, but I think this hints at some kind of race condition when downloading and extracting layers that can be used in caching for multiple targets.
@crazy-max or @tonistiigi, I can try including more telemetry if there is a recommended method for capturing traces, but from the log below you'll see this is using docker.io/docker/dockerfile:1.7:
I've stumbled on this issue as well. However, when switching to a different cache backend I observe the following behaviour:
{
  "group": {
    "default": {
      "targets": [
        "lts",
        "lts-alpine"
      ]
    }
  },
  "target": {
    "lts": {
      "context": ".",
      "dockerfile": "Dockerfile",
      "args": {
        "NODE_VERSION": "lts"
      },
      "tags": [
        "test"
      ],
      "cache-from": [
        "type=registry,ref=myregistry/container-build-cache:test"
      ],
      "cache-to": [
        "type=registry,mode=max,ref=myregistry/container-build-cache:test,image-manifest=true,oci-mediatypes=true"
      ]
    },
    "lts-alpine": {
      "context": ".",
      "dockerfile": "Dockerfile",
      "args": {
        "NODE_VERSION": "lts-alpine"
      },
      "tags": [
        "test-alpine"
      ],
      "cache-from": [
        "type=registry,ref=myregistry/container-build-cache:test-alpine"
      ],
      "cache-to": [
        "type=registry,mode=max,ref=myregistry/container-build-cache:test-alpine,image-manifest=true,oci-mediatypes=true"
      ]
    }
  }
}
"target": {
"lts": {
[...]
"cache-from": [
"type=registry,ref=myregistry/container-build-cache:test"
],
"cache-to": [
"type=registry,mode=max,ref=myregistry/container-build-cache:test,image-manifest=true,oci-mediatypes=true"
]
},
"lts-alpine": {
[...]
"cache-from": [
"type=registry,ref=myregistry/container-build-cache:test"
],
"cache-to": [
"type=registry,mode=max,ref=myregistry/container-build-cache:test,image-manifest=true,oci-mediatypes=true"
]
}
}
}
That makes me understand that multiple targets step on each other's toes when they are not explicitly configured to use separate storage areas. So I assume that the default internal cache stores all the data in the same storage area and is not suited for use with multiple targets.
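For illustration, a minimal docker-bake.hcl sketch that would produce per-target cache refs like the first --print output above; everything not present in that output is an assumption:

# Give each target its own registry cache ref so targets do not share a storage area.
group "default" {
  targets = ["lts", "lts-alpine"]
}

target "lts" {
  context    = "."
  dockerfile = "Dockerfile"
  args       = { NODE_VERSION = "lts" }
  tags       = ["test"]
  cache-from = ["type=registry,ref=myregistry/container-build-cache:test"]
  cache-to   = ["type=registry,mode=max,ref=myregistry/container-build-cache:test,image-manifest=true,oci-mediatypes=true"]
}

target "lts-alpine" {
  context    = "."
  dockerfile = "Dockerfile"
  args       = { NODE_VERSION = "lts-alpine" }
  tags       = ["test-alpine"]
  cache-from = ["type=registry,ref=myregistry/container-build-cache:test-alpine"]
  cache-to   = ["type=registry,mode=max,ref=myregistry/container-build-cache:test-alpine,image-manifest=true,oci-mediatypes=true"]
}

This mirrors the first JSON above; the second fragment, where both targets point at the same :test ref, is the configuration that seems to cause the stepping-on-toes behaviour.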
@dud225 or @weber-software, have you tried updating the Dockerfile frontend version used? For me, updating from v1.7 to v1.9 may have resolved the superfluous cache-layer busting I've encountered!
From the changelog, I suspect this may have helped in dealing with my more advanced Dockerfile staging DAG:
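For anyone wanting to try the same bump: the frontend version is normally selected by the parser directive on the very first line of the Dockerfile, so assuming the 1.7 pin seen in the log above, the change is just:

# syntax=docker.io/docker/dockerfile:1.9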
I was hoping to speed up builds by using the parallelism provided by bake.
Therefore I'm running
buildx bake -f docker-bake.hcl
but nearly all of the time one (or more) of the images gets rebuilt, even though they should hit the cache.
If I specify single targets, the cache is used as I would expect:
for i in {1..21}; do buildx bake -f docker-bake.hcl s$i; done
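The docker-bake.hcl itself isn't shown here; purely as an illustrative sketch of the layout implied by the s1..s21 target names (every name and attribute below is an assumption):

group "default" {
  targets = ["s1", "s2", "s3"] # ... through "s21" in the real file
}

target "s1" {
  context    = "."
  dockerfile = "Dockerfile"
  target     = "s1"
}

target "s2" {
  context    = "."
  dockerfile = "Dockerfile"
  target     = "s2"
}

# ... one block per image, up to s21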
What I found out so far: [...] RUN [...] the problem doesn't occur.
Are there any ideas why this is happening, or how I could investigate this further?