microsoft / Windows-Containers

Welcome to our Windows Containers GitHub community! Ask questions, report bugs, and suggest features -- let's work together.
MIT License

FROM/layer-extraction on ltsc2019 fails: link operation for `Windows/INF/basicrender.inf` on cross-platform building from Linux #493

Open markmandel opened 1 month ago

markmandel commented 1 month ago

Describe the bug

COPY commands have been working normally for years with our ltsc2019 Windows container build on Agones, but started failing today with this error message:

Dockerfile.windows:18
--------------------
  16 |     FROM mcr.microsoft.com/windows/servercore:${WINDOWS_VERSION}
  17 |
  18 | >>> COPY ./bin/sdk-server.windows.amd64.exe /agones/sdk-server.exe
  19 |     COPY ./bin/LICENSES ./bin/dependencies-src.tgz /agones/
  20 |
--------------------
ERROR: failed to solve: failed to compute cache key: mount callback failed on /tmp/containerd-mount3356703247: link /tmp/containerd-mount3356703247/Windows/INF/basicrender.inf /tmp/containerd-mount3356703247/Windows/System32/DriverStore/FileRepository/basicrender.inf_amd64_efdc64af60c69a6d/basicrender.inf: no such file or directory
make: *** [Makefile:619: build-agones-sdk-image-windows-ltsc2019] Error 1

Where ${WINDOWS_VERSION} is ltsc2019

To Reproduce

To reproduce this, just try copying something in:

ARG WINDOWS_VERSION=ltsc2019
FROM mcr.microsoft.com/windows/servercore:${WINDOWS_VERSION}

COPY ./emptyfile /emptyfile

(Assuming you have a buildx builder already)

❯ touch emptyfile
❯ docker buildx build --platform windows/amd64 --builder windows-builder-ltsc2019 --tag=windows-test .
[+] Building 18.6s (6/6) FINISHED    docker-container:windows-builder-ltsc2019
 => [internal] load build definition from Dockerfile    0.1s
 => => transferring dockerfile: 156B    0.0s
 => [internal] load metadata for mcr.microsoft.com/windows/servercore:ltsc2019    0.1s
 => [internal] load .dockerignore    0.0s
 => => transferring context: 2B    0.0s
 => [internal] load build context    0.0s
 => => transferring context: 28B    0.0s
 => [1/2] FROM mcr.microsoft.com/windows/servercore:ltsc2019@sha256:3c97a5c1c32ddb346c190f00a588da6e682a9a8160869f4969edfd7c6e4d1c03    18.3s
 => => resolve mcr.microsoft.com/windows/servercore:ltsc2019@sha256:3c97a5c1c32ddb346c190f00a588da6e682a9a8160869f4969edfd7c6e4d1c03    0.0s
 => => extracting sha256:0dd0445527a5079720e935502b31de927b8e22e5ca358026cf0bc8845c5ba5ce    18.3s
 => ERROR [2/2] COPY ./emptyfile /emptyfile    0.0s
------
 > [2/2] COPY ./emptyfile /emptyfile:
------
WARNING: No output specified with docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
Dockerfile:4
--------------------
   2 |     FROM mcr.microsoft.com/windows/servercore:${WINDOWS_VERSION}
   3 |
   4 | >>> COPY ./emptyfile /emptyfile
   5 |
--------------------
ERROR: failed to solve: failed to compute cache key: mount callback failed on /tmp/containerd-mount1556330329: link /tmp/containerd-mount1556330329/Windows/INF/basicrender.inf /tmp/containerd-mount1556330329/Windows/System32/DriverStore/FileRepository/basicrender.inf_amd64_efdc64af60c69a6d/basicrender.inf: no such file or directory

Expected behavior

The image should build 😃

Configuration:

Additional context

You can see a full build output from the Agones build pipeline here: https://console.cloud.google.com/cloud-build/builds/70b984e2-132b-4d1a-915a-862cb03f4830;step=16?e=13803378&project=agones-images

Using the previous SHA of @sha256:6fdf140282a2f809dae9b13fe441635867f0a27c33a438771673b8da8f3348a4 worked.

markmandel commented 1 month ago

Also possibly worth noting: ltsc2022 still works. We tested it!

jsturtevant commented 1 month ago

Which image is "Using the previous SHA of @sha256:6fdf140282a2f809dae9b13fe441635867f0a27c33a438771673b8da8f3348a4 worked." in reference to?

markmandel commented 1 month ago

> Which image is "Using the previous SHA of @sha256:6fdf140282a2f809dae9b13fe441635867f0a27c33a438771673b8da8f3348a4 worked." in reference to?

mcr.microsoft.com/windows/servercore:ltsc2019

markmandel commented 1 month ago

For reference, we have this PR now in place to unblock CI: https://github.com/googleforgames/agones/pull/3829

claudiubelu commented 1 month ago

Tried it myself on a build node I have been using for years, though I have updated docker buildx since:

docker buildx version
github.com/docker/buildx v0.12.1 30feaa1a915b869ebc2eea6328624b49facd4bfb

I did use this version before KubeCon, and I did use mcr.microsoft.com/windows/servercore:ltsc2019 as a base image for the presentation where I talked about building Windows images. So, I think the image is the issue (Microsoft publishes a new image monthly).

jsturtevant commented 1 month ago

This looks to be an issue with the image patches released yesterday (May 14th). The workaround is to use April's patch images, as done in https://github.com/microsoft/Windows-Containers/issues/493#issuecomment-2112937642

jsturtevant commented 1 month ago

/cc @akarshm

jsturtevant commented 1 month ago

Adding a link to the Slack discussion where we narrowed it down to the patch release: https://kubernetes.slack.com/archives/C0SJ4AFB7/p1715733067064539

jsturtevant commented 1 month ago

/cc @profnandaa

profnandaa commented 1 month ago

UPDATE: we've been investigating this issue. A few things worth noting:

  1. The issue is not actually with COPY. It happens at the tail end of FROM, and the error reporting just points at the next line of the Dockerfile.
  2. The issue is with the extraction of the latest delta layer on ltsc2019:

    => extracting sha256:0dd0445527a5079720e935502b31de927b8e22e5ca358026cf0bc8845c5ba5ce 

    There's only one "offending" file, Windows/INF/basicrender.inf (and its hardlinks). In the TAR header (metadata) it is referenced as lowercase basicrender.inf, while the actual file in the layer is BasicRender.inf. Since Windows file names are case-insensitive, the issue only shows up when the link operation is done on Linux (which is case-sensitive), where it is reported as file-not-found. RCA is still in progress so that we can fix the issue.

    PS. It would be nice to re-title the issue to FROM/layer-extraction on ltsc2019 fails: "mount callback failed"
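
To illustrate the mismatch described above, here is a minimal Python sketch that scans a tar archive for hard-link entries whose target only matches another member when case is ignored. It is not the tooling used in the investigation. The helper names (`find_case_mismatched_links`, `build_sample_layer`) are hypothetical, and the sample archive is a hand-built stand-in: the exact member paths inside the real ltsc2019 layer are an assumption based on the error message.

```python
import io
import tarfile


def find_case_mismatched_links(tar_bytes: bytes) -> list[tuple[str, str, str]]:
    """Return (link_name, link_target, actual_member) for every hard link
    whose target is missing as written but present under different casing."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        members = tar.getmembers()

    exact = {m.name for m in members}
    by_lower = {name.lower(): name for name in exact}

    mismatches = []
    for m in members:
        # islnk() is true for hard-link entries (tarfile.LNKTYPE).
        if m.islnk() and m.linkname not in exact:
            actual = by_lower.get(m.linkname.lower())
            if actual is not None:
                mismatches.append((m.name, m.linkname, actual))
    return mismatches


def build_sample_layer() -> bytes:
    """Build a tiny in-memory tar that mimics the reported mismatch.

    Assumption: the paths below are illustrative, not copied from the
    real layer.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        data = b"; placeholder contents\n"
        real = tarfile.TarInfo("Windows/INF/BasicRender.inf")
        real.size = len(data)
        tar.addfile(real, io.BytesIO(data))

        link = tarfile.TarInfo(
            "Windows/System32/DriverStore/FileRepository/"
            "basicrender.inf_amd64_efdc64af60c69a6d/basicrender.inf"
        )
        link.type = tarfile.LNKTYPE
        # The link target is spelled in lowercase, unlike the file above.
        link.linkname = "Windows/INF/basicrender.inf"
        tar.addfile(link)
    return buf.getvalue()


if __name__ == "__main__":
    for name, target, actual in find_case_mismatched_links(build_sample_layer()):
        print(f"{name}: links to {target!r}, but the archive contains {actual!r}")
```

On a case-insensitive filesystem the lowercase target would still resolve to `BasicRender.inf`, which is consistent with the failure only appearing when the layer is unpacked on Linux.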

jsafrane commented 1 month ago

Hello, is there any plan / timeline to fix the Windows images so they are usable on Linux? We, as the Kubernetes community, cannot release our images based on the mcr.microsoft.com/windows/servercore:ltsc2019@latest tag. It's not super critical, at least for now, but you never know when a CVE will come and we will need to release images immediately.

profnandaa commented 4 weeks ago

@jsafrane -- a fix is currently going through validation; will update here once it's released.

andriisoldatenko commented 1 week ago

hi, @profnandaa, can I gently ask you about a rough ETA for a fix?