Open yhgu2000 opened 1 week ago
@vsoch @cyphar @reidpr
For anyone curious, here's the OCI image (linux/amd64) that I got from docker save:
Personally, I think the following 2 rules should be stated in the specification:
1. The tar entries in a layer are extracted in order, as in a normal tar archive, except that .wh. whiteout files are applied first. Extraction starts from an empty root directory /, and if any error occurs during extraction, the image is considered invalid.
2. If a subsequent layer overrides paths that were hardlinks created in previous layers, only the overridden paths are affected. They are recreated from the tar entry data in the subsequent layer, instead of updating the existing inode.
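To make the two rules concrete, here is a minimal Python sketch of how an extractor could follow them. It is only an illustration under assumptions: the names apply_layer and make_layer are made up for this example, only directories, regular files, and hardlinks are handled, and opaque whiteouts, whiteouts of directories, and path sanitization are left out.

```python
import io
import os
import tarfile
import tempfile
from pathlib import Path

WHITEOUT_PREFIX = ".wh."


def apply_layer(root: str, layer: tarfile.TarFile) -> None:
    """Apply one layer to root; any exception means the image is invalid."""
    members = layer.getmembers()

    # Rule 1, part 1: apply whiteouts before anything else in the layer.
    for m in members:
        parent, base = os.path.split(m.name)
        if base.startswith(WHITEOUT_PREFIX):
            victim = os.path.join(root, parent, base[len(WHITEOUT_PREFIX):])
            if os.path.lexists(victim):
                os.unlink(victim)  # sketch: non-directories only

    # Rule 1, part 2: extract the remaining entries in archive order.
    for m in members:
        if os.path.basename(m.name).startswith(WHITEOUT_PREFIX):
            continue
        dest = os.path.join(root, m.name)
        if m.isdir():
            os.makedirs(dest, exist_ok=True)
            continue
        # Rule 2: remove the old directory entry first, so other hardlinks
        # to the old inode keep their content.
        if os.path.lexists(dest):
            os.unlink(dest)
        if m.islnk():
            # Raises if the target is missing, which rejects the image.
            os.link(os.path.join(root, m.linkname), dest)
        elif m.isreg():
            with open(dest, "wb") as out:
                out.write(layer.extractfile(m).read())
        else:
            raise ValueError(f"entry type not handled in this sketch: {m.name}")


def make_layer(entries) -> tarfile.TarFile:
    """Build an in-memory tar from (name, data, linkname) tuples (test helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data, linkname in entries:
            info = tarfile.TarInfo(name)
            if linkname is not None:
                info.type = tarfile.LNKTYPE
                info.linkname = linkname
                tar.addfile(info)
            else:
                info.size = len(data)
                tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return tarfile.open(fileobj=buf, mode="r")


with tempfile.TemporaryDirectory() as root:
    # Layer 1: file "a" plus hardlink "b" -> "a". Layer 2 overrides "b".
    apply_layer(root, make_layer([("a", b"old", None), ("b", None, "a")]))
    apply_layer(root, make_layer([("b", b"new", None)]))
    a_content = Path(root, "a").read_bytes()
    b_content = Path(root, "b").read_bytes()

print(a_content, b_content)  # b'old' b'new'
```

With rule 2, overriding b in the second layer leaves a untouched. A layer that hardlinks to a path that does not exist makes os.link raise, which under rule 1 would mean the image is rejected.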
Hi, OCI. I was writing an OCI image parser, and quickly realized there is some serious undefined behavior around hardlinks.
First, let's recall that a hardlink is a filesystem entry that points to the same "file" (inode) as another filesystem entry. So, modifying a file through one hardlink implicitly changes what every other entry for that inode sees, which effectively provides a means of implicit communication. Treating hardlinks as independent regular files can cause runtime errors if the application relies on this implicit communication.
Second, as a reminder, OCI image layers are in the tar format, e.g. the POSIX ustar/pax formats, which allows hardlinks and duplicate paths.
The current specification does say something about hardlinks, but not enough to answer the following questions:
1. What if a layer contains an invalid hardlink, for example one pointing to a non-existent path? Should we consider the image invalid, or just ignore the hardlink?
2. When creating the filesystem bundle, what should we do if a subsequent layer has an entry for a path that is a hardlink in a previous layer (sharing an inode with many other filesystem entries)? Should we unlink the filesystem entry from the previous inode and create a new inode with the data in the new layer, or update the existing inode with the data (so that all hardlinked filesystem entries are affected)?
3. When building the OCI image, how should it be recorded in the image if the user creates hardlinks to files of a previous layer? In such a case, the layer on its own may be an invalid tar file, yet it can be extracted successfully as long as the previous layers are extracted in order.
(I believe there are more problems with regard to the tar format. Comments are welcome.)
There is an existing issue about hardlinks and symlinks: https://github.com/opencontainers/image-spec/issues/857 . But I believe it does not cover all the problems I list above.
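To illustrate the two options in question 2 at the filesystem level, here is a small Python sketch. The helper names (override_in_place, override_by_recreate, demo) are made up for this example, and it assumes a filesystem that supports hardlinks.

```python
import os
import tempfile


def override_in_place(path: str, data: bytes) -> None:
    """Write into the existing inode: every hardlink sees the new data."""
    with open(path, "wb") as f:  # truncates, but keeps the same inode
        f.write(data)


def override_by_recreate(path: str, data: bytes) -> None:
    """Unlink the path, then create a new inode for it."""
    os.unlink(path)
    with open(path, "wb") as f:
        f.write(data)


def demo(override) -> tuple:
    """Return (content of a, whether a and b still share an inode)."""
    with tempfile.TemporaryDirectory() as d:
        a, b = os.path.join(d, "a"), os.path.join(d, "b")
        with open(a, "wb") as f:
            f.write(b"old")
        os.link(a, b)  # b is now a second name for a's inode
        override(b, b"new")  # the "subsequent layer" overrides b
        with open(a, "rb") as f:
            a_data = f.read()
        same_inode = os.stat(a).st_ino == os.stat(b).st_ino
        return a_data, same_inode


print(demo(override_in_place))     # (b'new', True)
print(demo(override_by_recreate))  # (b'old', False)
```

The first strategy silently changes a as well, while the second one breaks the link between a and b. Which of the two an extractor should use is exactly what the specification leaves open.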
Personally, for question 3, I did an experiment with Docker. I wrote a simple statically linked C program that creates a copy, a hardlink, and a symlink, and prints their inode IDs:
Then I built an image from scratch with the compiled C program:
When I run the image on the same machine that built it, here's the output:
We can see that /b is a hardlink to /a.out, as expected.
However, if I use docker save to dump the image into a .tar.gz file, I find that the /b entry in layer 3 actually has type 0, which means it is stored as a regular file instead of a hardlink. To further validate my suspicion, I copied the .tar.gz file to another machine with Docker, and the result is:
This means /b is now a regular file, which is not expected, or is it? Anyway, this example indicates that even Docker is confused by this situation.
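For anyone who wants to repeat the type check on a layer, here is a rough Python sketch using the standard tarfile module. In the tar header, type flag 0 is a regular file, 1 is a hardlink, and 2 is a symlink. The function name describe_entries and the sample archive are made up for illustration; the sample is not the layer from my experiment.

```python
import io
import tarfile


def describe_entries(tar: tarfile.TarFile) -> dict:
    """Map each entry name to a short description of its tar type flag."""
    result = {}
    for m in tar.getmembers():
        if m.islnk():
            result[m.name] = f"hardlink -> {m.linkname}"
        elif m.issym():
            result[m.name] = f"symlink -> {m.linkname}"
        elif m.isreg():
            result[m.name] = "regular file"
        else:
            result[m.name] = f"other (type {m.type!r})"
    return result


# Build a small sample archive in memory (hypothetical content).
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    info = tarfile.TarInfo("a.out")
    info.size = 4
    tar.addfile(info, io.BytesIO(b"data"))

    link = tarfile.TarInfo("b")
    link.type = tarfile.LNKTYPE  # type flag "1": hardlink
    link.linkname = "a.out"
    tar.addfile(link)

    sym = tarfile.TarInfo("c")
    sym.type = tarfile.SYMTYPE  # type flag "2": symlink
    sym.linkname = "a.out"
    tar.addfile(sym)

buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    entries = describe_entries(tar)

print(entries)
# {'a.out': 'regular file', 'b': 'hardlink -> a.out', 'c': 'symlink -> a.out'}
```

To check a real layer, open the layer archive from the saved image with tarfile.open instead of the in-memory buffer. An entry reported as "regular file" where a hardlink was expected corresponds to the type 0 case described above.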