Open TheCodeWrangler opened 7 months ago
Can confirm this exists in the latest image as well:
❯ docker inspect nvcr.io/nvidia/tritonserver:24.03-py3 | jq '.[0].RootFS.Layers' | sort | uniq -c
1 "sha256:0c5f76392da432595f52b598d72f9f0c7bec12cc99eeea430203fc3b6e0a551c",
1 "sha256:2757a78913f29b2b87649e92ee6dd73460f496af008c5c8d772c2cbaf128450f",
1 "sha256:2c8fb4462dd1c3b0257685c36d8e0235bd0fa6cdc86e2934cce01ad67b560599",
1 "sha256:2e983e03121d01381b8931ac29c8514b83d409019851fbe319a19dc973e37acc",
1 "sha256:3aea6fab2dbf23d161e561e6afe0bf060d6c99dbdff31ff6d6dedf4e8e60949d",
1 "sha256:4d799d4505540b9d5743a694d19b18e89dcfeb6266a7133863410d38e0a9f680"
1 "sha256:5498e8c22f6996f25ef193ee58617d5b37e2a96decf22e72de13c3b34e147591",
1 "sha256:54e647f81b1a7e902055e4791115a9fb602e72639b978a72a06aafe4eb4c8246",
4 "sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef",
1 "sha256:6ac20142f853cd947ce4b982bee38eb26e03f0925bf58903199ab26d5f101937",
1 "sha256:7a05d6e72510152995b5887fe51c8ac36779140abc9af34f09d706b2cb67e69a",
1 "sha256:84b1719c52bd5b83ce59ffb55774b691a3fb565073398c7ac6ecc228e620bdb5",
1 "sha256:89539473e1d5b49b8b537d6725feec8ddc903b4cbcfa235766e8f6825e2d6f4a",
1 "sha256:9e1f1090a7f07923f33452bc24eea16c2d8cc08138f3db4e1d8e93a9e430dac7",
1 "sha256:a0c97f620cd64a3cbb897fb87f9b1df454291253a29032fd8200e422c643e7fe",
1 "sha256:a67506fbd03042aad7e8107256fd06d568343a9eebefe608f967f1ee95da27c5",
1 "sha256:a6fd7d221e23ca963955a8d2a7b87b40cfe8f62b37bc3c79d2230aba780556ba",
1 "sha256:b4bd02b17ce352d7dde2362003d87ddba47b587eebdbea7b7ce803800d37ed95",
1 "sha256:babb0ac901f2a703fd049c1f409256a3acf9dcb47709676f2d56b13847ea6806",
1 "sha256:d0a7470596635f4e06a524f182b51f6743b17555f44049c45132b9a5ce65c51f",
1 "sha256:ea31cf21ba0208a998ee3bef804a79155864989972e3e489a6a657ec65dff316",
1 "sha256:f80386fcd8ceddd5b8dc0823325847d348c62253303d47507e4c32ebe3e29cb2",
1 "sha256:f8a1d3a7e2ee27b131917b3d5dc101f38f4d29e8aa79c5ab34287772e353ea5a",
1 "sha256:f97056fec7f9e222a346f108ca493cef3d60e2f3624722a9d746708180a8e8cf",
1 [
1 ]
Since it's just the one layer, duplicated 4 times, I was hoping it'd be easy to identify by poking around docker history, but nothing has jumped out at me yet.
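For anyone who wants to run the same check without jq, here is a minimal Python sketch that counts repeated entries in RootFS.Layers. It assumes the input is the raw JSON printed by docker inspect; the find_duplicate_layers name and the shortened sample digests are made up for illustration.

```python
import json
from collections import Counter


def find_duplicate_layers(inspect_json: str) -> dict:
    """Return {layer_digest: count} for digests listed more than once.

    `inspect_json` is assumed to be the raw output of `docker inspect <image>`,
    i.e. a JSON array whose first element has RootFS.Layers.
    """
    layers = json.loads(inspect_json)[0]["RootFS"]["Layers"]
    counts = Counter(layers)
    return {digest: n for digest, n in counts.items() if n > 1}


# Illustrative input only: shortened, made-up digests in the same shape as
# real `docker inspect` output.
sample = json.dumps([{"RootFS": {"Layers": [
    "sha256:aaa", "sha256:eee", "sha256:bbb", "sha256:eee", "sha256:eee",
]}}])

duplicates = find_duplicate_layers(sample)
print(duplicates)
```

For a real image, the string could come from subprocess.run(["docker", "inspect", image], capture_output=True, text=True).stdout.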
Digging a little further here, this layer maps to the file 54ef38418033d800a45a536f7c4f8d037549aa2c005f589e390961c0c5947149/layer.tar when saving and extracting this image. Extracting that file shows it's empty, so we must have 4 Dockerfile commands that don't alter the filesystem and thus result in empty layers.
Not sure what's creating these empty layers, but I hope this helps!
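The emptiness check can also be scripted instead of extracting the file by hand. A minimal sketch, assuming the layer is an uncompressed tar like the layer.tar files in a saved image; is_empty_layer and make_tar are hypothetical helpers, and the in-memory archives only stand in for real layers.

```python
import io
import tarfile


def is_empty_layer(layer_tar) -> bool:
    """Return True if the (uncompressed) tar archive has no members at all."""
    with tarfile.open(fileobj=layer_tar, mode="r:") as tar:
        return len(tar.getmembers()) == 0


def make_tar(files: dict) -> io.BytesIO:
    """Build an in-memory tar from {name: bytes}; stands in for a layer.tar."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)  # rewind so the archive can be read back
    return buf


empty = is_empty_layer(make_tar({}))
non_empty = is_empty_layer(make_tar({"etc/example.conf": b"hello\n"}))
print(empty, non_empty)
```

With a file on disk, open it in binary mode and pass the file object, e.g. with open(path, "rb") as f: is_empty_layer(f).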
Alright, I pulled out dive and found the four offenders:
RUN |3 CUDA_VERSION=12.4.0.041 CUDA_DRIVER_VERSION=550.54.14 JETPACK_HOST_MOUNTS= /bin/sh -c if [ -n "${JETPACK_HOST_MOUNTS}" ]; then echo "/usr/lib/aarch64-linux-gnu/tegra" > /etc/ld.so.conf.d/nvidia-tegra.conf && echo "/usr/lib/aarch64-linux-gnu/tegra-egl" >> /etc/ld.so.conf.d/nvidia-tegra.conf; fi # buildkit
RUN |2 TRITON_VERSION=2.44.0 TRITON_CONTAINER_VERSION=24.03 /bin/sh -c rm -fr /opt/tritonserver/* # buildkit
WORKDIR /opt
WORKDIR /opt/tritonserver
I'm going to assume the WORKDIR changes are false positives, but it looks to me like the first RUN is a no-op because of the if statement, and the second is a no-op because /opt/tritonserver/ is already empty.
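The same mapping from empty layers back to build steps can be sketched from the image config alone. This assumes the OCI image config layout, where history entries flagged empty_layer have no diff ID and the remaining entries line up one-to-one with rootfs.diff_ids. The sample config below is made up and abbreviated, not the real tritonserver config, and commands_with_empty_layers is a hypothetical helper.

```python
# Digest of the empty layer as reported for this image in the thread.
EMPTY_TAR_DIFF_ID = (
    "sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef"
)


def commands_with_empty_layers(config: dict) -> list:
    """Return the build commands whose resulting layer is the empty tar."""
    # Entries flagged empty_layer (metadata-only steps) have no diff ID,
    # so drop them before pairing history with rootfs.diff_ids.
    layer_steps = [h for h in config["history"] if not h.get("empty_layer", False)]
    diff_ids = config["rootfs"]["diff_ids"]
    return [
        step.get("created_by", "")
        for step, diff_id in zip(layer_steps, diff_ids)
        if diff_id == EMPTY_TAR_DIFF_ID
    ]


# Illustrative config only: abbreviated commands and made-up digests.
config = {
    "history": [
        {"created_by": "ADD rootfs.tar /"},
        {"created_by": "ENV EXAMPLE=1", "empty_layer": True},
        {"created_by": "RUN rm -fr /opt/tritonserver/*"},
        {"created_by": "WORKDIR /opt/tritonserver"},
        {"created_by": "COPY . /opt/tritonserver"},
    ],
    "rootfs": {
        "diff_ids": ["sha256:1111", EMPTY_TAR_DIFF_ID, EMPTY_TAR_DIFF_ID, "sha256:2222"]
    },
}

offenders = commands_with_empty_layers(config)
print(offenders)
```

In this made-up config the WORKDIR step is not flagged empty_layer yet still maps to the empty tar, which is the pattern to look for in a real config.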
Not sure the best way to solve this. Multi-stage dockerfile?
Per some GKE docs, duplicate layers are not an issue with more recent GKE versions, but it sure would be great to fix this issue for those of us still on earlier versions! Especially since GKE uses tritonserver as an example of when to use image streaming.
Yet another point of confusion is that the empty layers we have here (5f70bf18...) don't match the empty layer hash in the GKE docs (a3ed95ca...). But if these are truly empty layers, then being on a newer GKE/k8s version won't help, because empty layers will still prevent image streaming AFAICT.
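One possible explanation for the mismatch, offered as an assumption rather than something the GKE docs confirm: the image config lists the digest of the uncompressed layer tar, while registries address layers by the digest of the compressed blob, so one empty layer can show up under two different hashes. A minimal sketch, taking an empty tar to be 1024 zero bytes, which hashes to the 5f70bf18... value above and to something different once compressed:

```python
import gzip
import hashlib

# An empty tar archive is just the end-of-archive marker:
# two 512-byte blocks of zeros.
EMPTY_TAR = bytes(1024)

uncompressed_digest = hashlib.sha256(EMPTY_TAR).hexdigest()

# The compressed digest depends on the gzip header and settings, so the value
# computed here is not expected to equal the one stored in any registry.
compressed_digest = hashlib.sha256(gzip.compress(EMPTY_TAR, mtime=0)).hexdigest()

print("uncompressed:", uncompressed_digest)
print("compressed:  ", compressed_digest)
```

If that assumption holds, both hashes refer to the same empty layer, and the question of whether empty layers block image streaming still stands.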
As a possible workaround, have you tried a tool like docker-squash?
https://github.com/goldmann/docker-squash
I did a quick sanity test locally and it produces an image with a single layer (though, looking at the docs, the number of layers is somewhat configurable).
I don't have experience with the streaming feature on GKE, so an image with a single layer may not be desirable for other reasons, but I thought it worth a try while we take a look at the request.
@nnshah1 I have not heard of docker-squash, looks interesting, thanks for the link and the reply!
@ClaytonJY, if you can verify that the workaround works, we can recommend it for versions of GKE that don't support empty layers. Since newer versions do support it, we'd probably consider this lower priority. Fair?
@nnshah1 Even newer versions of GKE can't support duplicated layers. They mentioned this in their documentation, but the feature is disabled, at least until GKE version 1.30. Another issue is that for docker-squash, the image layer IDs are required, but they are missing in Docker versions after 1.10.
+1
Problem: GKE image streaming will not work with these images due to repeated layers
I would like to use GKE image streaming with triton-inference-server images.
This feature will only work if the image does not have duplicated layers: https://cloud.google.com/kubernetes-engine/docs/how-to/image-streaming#downloads_the_image_without_streaming_the_data
I am wondering if work could be done to restructure the Docker build and ensure that duplicate layers do not exist within the Triton Inference Server images.
Inspecting the image's layers currently shows 4 layers with the same sha256 hash (5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef).