doctorpangloss opened this issue 7 months ago
Hi @doctorpangloss,
Could you get the content size of the affected blob layer?
wc -c < /storage/docker/registry/v2/blobs/sha256/ee/ee16e8d2117a30f83fe374f2c07067494c109eb6e2efdefd62d63ab26e7ac145/data
Could you also describe the manifest info of this image?
cat /storage/docker/registry/v2/blobs/<xx>/<xx-manifest-digest>
It would also be helpful to provide the harbor-registry logs from both pushing and pulling this specific image.
Thank you for further investigating the issue.
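If it helps locate the manifest digest, the tag's link file in the registry's filesystem storage records it. A minimal sketch, assuming the default distribution v2 layout; <project>/<repo>/<tag> are placeholders:
cat /storage/docker/registry/v2/repositories/<project>/<repo>/_manifests/tags/<tag>/current/link
# prints something like sha256:<manifest-digest>; the manifest body then lives at
# /storage/docker/registry/v2/blobs/sha256/<first two hex chars>/<manifest-digest>/data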
$ wc -c < /storage/docker/registry/v2/blobs/sha256/ee/ee16e8d2117a30f83fe374f2c07067494c109eb6e2efdefd62d63ab26e7ac145/data
3641339259
$ cat /storage/docker/registry/v2/blobs/sha256/62/627b0ceb463ffe633dafb89029b270147b656a76d55cd5b0a72092df2cae28a2/data
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 11348,
    "digest": "sha256:b3fdb2fee5c1acc78cb0f297870cad6adc833c847ea7ce65f72b7cd5b54d8840"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 1388598786,
      "digest": "sha256:7c76e5cf7755ce357ffb737715b0da6799a50ea468cc252c094f4d915d426b3f"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 568860197,
      "digest": "sha256:a61557bf66429be9509f579104808d2853f8f7aefbd49ef26f5f2a90266c46f5"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 21424280,
      "digest": "sha256:5bc010802431ab0ee2b8ef0d775b412b3c56a8eac2428088b0c949817219c295"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 73406695,
      "digest": "sha256:87017d1dc4c5662506aee9340e592e517444e7d9e7485b741fd2c825ebf7bbcf"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 3641333543,
      "digest": "sha256:ceea7cc146971131181173af7bb5432197c486f43e47e14dc24089f95f23f7fa"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 6392387805,
      "digest": "sha256:6a016da1c63b584aea891c77bf01dc8d248da63c5c7350211374969798167da0"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 5597422203,
      "digest": "sha256:be3f48ccfaa562e13a3d19bb4bcea93a575f9e05846b3c0f12dd4ae40ea3fa03"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 4193631836,
      "digest": "sha256:c861bab5f65eaa8e18d08597efcd7901ad18266dfdb053ea39a823cb234d2744"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 7521664388,
      "digest": "sha256:450d94ee1f10724478d0117f89a06b0bb1ede75568f78fb33b790347948bb2e6"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 62127545,
      "digest": "sha256:e4e2d538c233ec1992783294e0bcadd2a510c62f9dcec790e9bac815c34eaad5"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 189818861,
      "digest": "sha256:6bec8920d36011535acb6e334cc03c8165002f8a3e214c4298082abc7c8c9663"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 683934131,
      "digest": "sha256:270fa103a76a3e8407332db541eab8ac9946a62cdfefd13cc765ca9e29919c0f"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 2947552011,
      "digest": "sha256:516646a609227c2e6e4e05cb8fd844e0c6b3145b962efc6b439603ad06913541"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 558733195,
      "digest": "sha256:d73132d27ff250a67d40ed517666e6995c47b6e41bcd753201ae398cb8dd91a8"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 558997790,
      "digest": "sha256:88835ae46eb9809639e615dc1feb6cd92763ddb578229e44cd8b9e607250e2b4"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 552724987,
      "digest": "sha256:83ec24083d455ce8ba98d1228499000bf5d489c98dcd2f8bf3021c20aaf6b601"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 2952371783,
      "digest": "sha256:660ffa7fd6ea9d28302a7f0a98ecf937bcc3849276f87ecf21f75507ffc0a22e"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 780788150,
      "digest": "sha256:3b5f5cc626fa39fd8eaa7f342ab0016e4a7567f846b4a1ef3382f88dd936a616"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 782108407,
      "digest": "sha256:7b6710162dd691f5958da2b1ee94d1d72c9e5608b0fd12dd28941d28c2654228"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 417311,
      "digest": "sha256:2634f7688c381b03a5fe14c998279cde213edd60130c7f2bee2d5fe6da1aed85"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 1368,
      "digest": "sha256:152a138552e49987085eeb5e6414af0781bf310ea27e5bd44d082e18b1cc1ba1"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 47774571,
      "digest": "sha256:986501079208002feb43a04c127e225435ffe40d4086f5630d9a12286c4446d3"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 43769789,
      "digest": "sha256:e9993a0af675071811d2a98e7007f89ca8ee0426b958af400e38be14a10994eb"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 1856,
      "digest": "sha256:6ee7c2921a2c4134d3b90574334d037dd9eb508aef563adecbffee685a470b7a"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 2679,
      "digest": "sha256:b0119e8423326103aef1d24daa7c3ce07eb567dda0cda5f99e171fad4acc819c"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 2202,
      "digest": "sha256:b891de3b6a71c2470e422ac2570026db36f1e265b8897418b0275a060d7bcc18"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 337073,
      "digest": "sha256:cfe6539fbcedc3a905e2e0c891c1206c4947b4b14f08df65055866bb782be00f"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 1556,
      "digest": "sha256:eea6a71e201d02262532740776b4ba8a99d03a564659224278a69aa8abdf6481"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 177482365,
      "digest": "sha256:daeca83d80187859b81c21d2a0e9ee17ae45f79db096aec020b16f3039da136f"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 352796701,
      "digest": "sha256:8a9791962e95f33e683a11fd1438b8e6a13a24c1734abc30040da5dcaf3547b3"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 712071897,
      "digest": "sha256:0f3dc3a10bb747600bfe8269eee7c85fd44536c7826edfdf06fc1d9e3f739c65"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 178359805,
      "digest": "sha256:aa83849f3863a9ce16357323deb12c305c9ebb40e39340b5fa8a81f05bdade00"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 5619685,
      "digest": "sha256:7b6d7f4990df9fb7fdea58e6a95a874d5e72250be20f45201d9ea8b5d5c8f239"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 356075907,
      "digest": "sha256:781a95c1bfdc2835049b5c8054a4b1341b38c6a197c9963d8a3a7176695aaa85"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 1470,
      "digest": "sha256:9af0cd1263bfb25877fd238e875cc4b12b56d7bf37099da933a457d8c221b67e"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 1314,
      "digest": "sha256:b1b66ff1bed073e4c8c8e1c3034104454c40d86a08efb57626bae7d945252a73"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 1340,
      "digest": "sha256:3ef235c23399ac69e0c24121d8b8d93a74b3c13ac724d29d314cecef94328d2f"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 339111,
      "digest": "sha256:c807663cd98abfa55e5b29eb0758bc16057ed80a00ee86caa2acacffbc65394d"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 605862,
      "digest": "sha256:b61ebed909295dca5d1597ab9c0e465e82ecc68dedceddc9468b5649d2323660"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 1293,
      "digest": "sha256:96008518d4134a7ea8358f9df2a92c6c496dcc999faa46cff40e3500f7bbb4cc"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 1313,
      "digest": "sha256:ca82bb53477dd0f76ae02489eac98493cdac736295585fb0988aab94bdfc8ac5"
    }
  ]
}
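For reference, a sketch that cross-checks every layer in this manifest against the blob sizes on disk. It assumes jq is installed and the default filesystem storage layout; a size mismatch would point at the corrupted layer directly:
m=/storage/docker/registry/v2/blobs/sha256/62/627b0ceb463ffe633dafb89029b270147b656a76d55cd5b0a72092df2cae28a2/data
jq -r '.layers[] | "\(.digest) \(.size)"' "$m" | while read -r digest size; do
  h=${digest#sha256:}
  f=/storage/docker/registry/v2/blobs/sha256/${h:0:2}/$h/data
  if [ ! -f "$f" ]; then
    echo "missing blob: $digest"
  elif [ "$(wc -c < "$f")" -ne "$size" ]; then
    echo "size mismatch: $digest expected=$size actual=$(wc -c < "$f")"
  fi
done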
registry_logs.txt: these logs are from an isolated registry instance that interacted only with pushing and pulling this specific image.
I am also going to try building the image with Docker 20.x, because those images, which also contain large content, routinely build successfully.
After pushing many times, it appears random which layer in the final image has an incorrect checksum.
This points to serious bugs in Harbor. By using ECR, I've eliminated a lot of other possibilities.
The manifest doesn't have a matching layer, probably because it has already been replaced by another attempt. I don't think it will be super helpful to confirm that there is a matching layer and that there are no flaws in the manifest; assume that such a layer exists for now.
I agree that this is likely a bug in upstream distribution, or that the image was corrupted on the Docker side when it was uploaded.
Would you try to build a similar image without using a Windows OS based image? Or build a Windows OS based image without large content?
https://github.com/redis/redis/issues/13156
We are also facing this issue with the docker.io proxy in our Harbor, with the redis image.
Also, could you build a distribution registry and try pushing and pulling for verification? https://github.com/distribution/distribution/blob/v2.8.3/BUILDING.md
How to reproduce the issue:
Pull proxy/docker.io/library/redis:7.2.4-alpine over a docker.io proxy in Harbor, then run the image under the containerd runtime.
We have also deleted this redis:7.2.4-alpine image, run a GC, and pulled it fresh from docker.io. The issue is still there.
It seems this multi-arch SHA, redis:7.2.4-alpine@sha256:641c365890fc79f182fb198c80ee807d040b9cfdb19cceb7f10c55a268d212b8, has the issue over a Harbor docker.io proxy.
The amd64 SHA works fine:
redis:7.2.4-alpine@sha256:3487aa5cf06dceb38202b06bba45b6e6d8a92288848698a6518eee5f63a293a3
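Roughly, the reproduction looks like this (harbor.example.com and the proxy-cache project name "proxy" are placeholders; exact commands may differ in your setup):
# pull through the Harbor proxy-cache project, pinned to the failing multi-arch digest
docker pull harbor.example.com/proxy/library/redis:7.2.4-alpine@sha256:641c365890fc79f182fb198c80ee807d040b9cfdb19cceb7f10c55a268d212b8
# or pull and run under containerd directly
ctr image pull harbor.example.com/proxy/library/redis:7.2.4-alpine
ctr run --rm harbor.example.com/proxy/library/redis:7.2.4-alpine redis-test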
@doctorpangloss is your project also set up as proxy cache?
@doctorpangloss is your project also set up as proxy cache?
No.
When pushed and pulled from ECR, everything works.
Is there something in harbor that causes it to not write the layer exactly as it was uploaded?
If so, how do I disable this?
Thanks again for investigating this with me.
Would you try to build a similar image without using a Windows OS based image? Or build a Windows OS based image without large content?
@MinerYang this issue reproduces with a Linux version of the image with no build stages and no large files.
Is there anything that causes harbor to touch the contents of what it is writing to the registry?
An Ingress configuration issue is possible, but I routinely push other images. Something essential about this image is that it installs Python packages, which have many files and create many links.
FROM nvcr.io/nvidia/pytorch:24.01-py3 as builder
ARG PIP_DISABLE_PIP_VERSION_CHECK=1
ARG PIP_NO_CACHE_DIR=1
RUN pip install wheel && \
pip install --no-build-isolation git+https://github.com/hiddenswitch/ComfyUI.git
WORKDIR /workspace
RUN comfyui --quick-test-for-ci --cpu --cwd /workspace
EXPOSE 8188
CMD ["comfyui", "--listen", "--cwd", "/workspace"]
Some Python packages have been removed from this Dockerfile. I pushed this just fine a few weeks ago. Between the working and non-working versions, I upgraded to Harbor 2.10, so it's most likely a new bug. The urgency is gone because I am using ECR for now, but I think this is a serious new bug.
Also, could you build a distribution registry and try pushing and pulling for verification? https://github.com/distribution/distribution/blob/v2.8.3/BUILDING.md
Are you saying I should push to a vanilla Docker registry? I will be honest: it's going to work fine. Would it be more helpful to patch it into Harbor instead and see if that resolves the issue? Is its image a drop-in replacement for the image used by the harbor-registry deployment? It would seem so.
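For anyone following along, a minimal A/B test against vanilla distribution could look like this (registry:2.8.3 from Docker Hub; myimage is a placeholder for the failing image; insecure local registry for test purposes only):
docker run -d -p 5000:5000 --name test-registry registry:2.8.3   # vanilla distribution
docker tag myimage:latest localhost:5000/myimage:latest
docker push localhost:5000/myimage:latest
docker image rm localhost:5000/myimage:latest                    # drop the local copy
docker pull localhost:5000/myimage:latest                        # re-pull; digests should match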
Hi @doctorpangloss,
Thanks for providing all this feedback.
I did try with the exact same Dockerfile, using docker buildx to build and push multi-arch images to a Harbor v2.10.0 instance. Pulling succeeded with both the digest and the tag. BTW, I am just using a docker compose installation.
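For context, the test was along these lines (hypothetical registry host and repository; the actual platform list and tag may differ):
docker buildx build --platform linux/amd64,linux/arm64 \
  -t harbor.example.com/library/comfyui:test \
  --push .
docker pull harbor.example.com/library/comfyui:test   # by tag; repeat with the returned digest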
We are also facing this issue with the docker.io proxy in our Harbor, with the redis image.
Hi @teimyBr,
Thanks for reaching out. We would appreciate it if you could file a specific issue for your proxy-cache problem, including logs, Harbor version, etc.
I think the issue was in the redis image; the new SHA from redis fixed the issue.
Then something is corrupting the layers on write. Very mysterious. I will try on a fresh Harbor installation.
I ran out of bandwidth for this issue. I need to know whether Harbor writes exactly what it receives. In other words, when a layer is pushed, is that content written directly, or is it read, modified, unpacked, etc. before it is written, for example by some kind of layer scanning process? If so, I would like to disable that process.
Hi @doctorpangloss, Harbor does not have any process that rewrites uploaded blob content; it proxies directly to the upstream distribution, i.e., for content written to the filesystem we have the same behavior as distribution/distribution.
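One way to verify that end to end, independent of the Docker client, is to fetch a layer blob over the registry API and hash it locally. A sketch (harbor.example.com, library/myimage, and $TOKEN are placeholders; the digest is the first layer from the manifest above):
DIGEST=sha256:7c76e5cf7755ce357ffb737715b0da6799a50ea468cc252c094f4d915d426b3f
curl -sSL -H "Authorization: Bearer $TOKEN" \
  "https://harbor.example.com/v2/library/myimage/blobs/$DIGEST" | sha256sum
# the printed hash should equal the part of $DIGEST after "sha256:"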
This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.
Hi @doctorpangloss, Harbor does not have any process that rewrites uploaded blob content; it proxies directly to the upstream distribution, i.e., for content written to the filesystem we have the same behavior as distribution/distribution.
But does Harbor patch distribution/distribution? It seems like my issue is really a distribution bug, and distribution does look really buggy. Is there an alternative? Aren't there pre-existing systems that support transactions for database-like and file-like operations together? What is AWS using internally for ECR?
This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.
This may be related: https://github.com/microsoft/Windows-Containers/issues/519
I saw this occur once in ECR!
Related to #20133
This layer is from
It is largish (2.6GB). Not sure under what circumstances this should be occurring. There are no issues with the persistent volume / the underlying storage.
This happens repeatedly when the image is built.
This is a Windows image.
I feel like I am missing something, because I can't see how registry could be so widely used and get so far with this kind of issue.
Expected behavior and actual behavior: When registry interacts with a blob, such as after writing it, it should sha256sum the file to ensure it was written correctly.
Steps to reproduce the problem:
Versions: Please specify the versions of the following systems.
Additional context:
There is nothing notable in the logs. It's all just successful pushes.
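A sketch of the verification the report asks for, runnable offline against the storage root (default filesystem layout assumed; the root path is taken from the commands above):
#!/usr/bin/env bash
# Recompute the sha256 of every stored blob and flag any that no longer
# match their content-addressed path.
root=/storage/docker/registry/v2/blobs/sha256
for data in "$root"/??/*/data; do
  expected=$(basename "$(dirname "$data")")
  actual=$(sha256sum "$data" | awk '{print $1}')
  if [ "$expected" != "$actual" ]; then
    echo "corrupt blob: expected=$expected actual=$actual path=$data"
  fi
done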