Open ppmathis opened 3 months ago
So, I've been curious about the root cause and spent some time analysing this myself, and came to the conclusion that the root cause is more on the apko side, but there are multiple reasons why this is failing as of today. I first started to analyze the situation when melange builds a tarball:
- When running `melange build`, a temporary directory is mounted to `/home/build`, and this is where the first issue happens. To avoid permission issues on the host systems, Docker Desktop is not preserving the actual UID/GID, and instead lets that default to the user account on the host, but uses an xattr named `com.docker.grpcfuse.ownership` which will contain a JSON value like `{"UID":123,"GID":123,"mode":770}`, which stores the real ownership and permissions.
- As a result, the host UID/GID `501:20` ends up leaking into the tarball itself.
- The `memFS` implementation used by apko ignores the UID/GID values from tar headers completely, and calling `memFS.Chown()` would be necessary to preserve permissions.

I was able to get everything working correctly on apko's side by:

- Adding a `tarball.Option` which enables the processing of xattr-based permissions for gRPC FUSE.
- Changing `Context.writeTar` to check for this option and, if enabled, read the xattr for every entry being processed and set the UID/GID/Mode solely based on this xattr. If this xattr is missing or parsing fails, the UID/GID get reset to 0, so the host user does not leak through.
- Changing `memFS.WriteHeader` to call `memFS.Chown()` for both directories and files if either UID or GID is != 0 in the header of the entry being processed.

You can find both of these changes in my forked apko branch `fs-ownership` with these two commits:

Once apko has been extended with the above two commits, melange itself can be adjusted to use the gRPC FUSE ownership option based on its environment, as seen on the `fs-ownership` branch in my melange fork:
I did not implement any heuristics / detection for this yet, as it's simply meant as a PoC, but I suppose it could be based on the runner environment - e.g. if Docker is used on macOS, then use gRPC FUSE based permissions. With these changes to both apko and melange I'm finally able to build both a clean APK tarball (preserving custom permissions, while not leaking host account UID/GID through) and a clean OCI image with proper permissions as well.
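The xattr handling described above could look roughly like the following Go sketch. It assumes the raw value of `com.docker.grpcfuse.ownership` has already been read from the file. `grpcfuseOwnership` and `ownerFromXattr` are hypothetical names chosen for illustration and are not taken from the linked commits. The encoding of the `mode` field is not stated in this thread, so the sketch parses it but does not interpret it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// grpcfuseOwnership mirrors the JSON shape quoted in this thread:
// {"UID":123,"GID":123,"mode":770}. Hypothetical type, for illustration only.
type grpcfuseOwnership struct {
	UID  int   `json:"UID"`
	GID  int   `json:"GID"`
	Mode int64 `json:"mode"` // parsed but not interpreted in this sketch
}

// ownerFromXattr returns the UID/GID stored in the raw xattr value.
// A missing (empty) or unparsable value yields 0:0, so the host user
// does not leak through.
func ownerFromXattr(raw []byte) (uid, gid int) {
	if len(raw) == 0 {
		return 0, 0
	}
	var o grpcfuseOwnership
	if err := json.Unmarshal(raw, &o); err != nil {
		return 0, 0
	}
	return o.UID, o.GID
}

func main() {
	samples := []string{
		`{"UID":123,"GID":123,"mode":770}`, // well-formed value
		`not json`,                         // parsing fails, falls back to 0:0
		``,                                 // xattr missing, falls back to 0:0
	}
	for _, raw := range samples {
		uid, gid := ownerFromXattr([]byte(raw))
		fmt.Printf("%q -> %d:%d\n", raw, uid, gid)
	}
}
```

Falling back to 0:0 rather than to the ownership reported by the file system is the important part here, since the reported ownership is the host account when gRPC FUSE is in use.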
Now my question: Would changes like these be acceptable to the melange/apko team? Or has this not been implemented on purpose so far for some reason unknown to me?
I've also tested this out on a Linux system by now, to have the full picture in terms of ownership support. Here is a summarised breakdown of everything:
`com.docker.grpcfuse.ownership`. Would require heuristics to determine an environment like this, and then forcibly use only the xattr for UID/GID.

The question that remains on my side is why apko does not preserve ownership (implementation is trivial, as shown in my previous comment with https://github.com/ppmathis/apko/commit/dd8c452853fa2686f5c7ad2beac29b06daf614fa) and instead forces 0:0. The melange bug on macOS makes sense, as it's a special case, but why would apko ignore ownership?
> The question that remains on my side is why apko does not preserve ownership

If I had to guess, it's not deliberate, just an oversight. Since `apko` has the ability to set ownership explicitly, I suspect we've been doing this at the image level and not at the apk level.

Feel free to PR that change to apko. It would be good to have a test that checks this and ensures it's consistent with `/lib/apk/db/installed` as well.
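A test along those lines might start from something like the following Go sketch of the ownership rule discussed in this thread: call `Chown` for files and directories whenever the tar header carries a non-zero UID or GID. `fakeFS`, `applyOwnership` and `demoHeaders` are hypothetical names used only for illustration. This is not apko's actual `memFS` API, and the sketch does not cover the `/lib/apk/db/installed` consistency check.

```go
package main

import (
	"archive/tar"
	"fmt"
)

// owner is a UID/GID pair.
type owner struct{ UID, GID int }

// fakeFS is a hypothetical stand-in for an in-memory filesystem.
// It only records Chown calls; untouched paths are assumed to stay 0:0.
type fakeFS struct {
	owners map[string]owner
}

func newFakeFS() *fakeFS { return &fakeFS{owners: map[string]owner{}} }

func (f *fakeFS) Chown(path string, uid, gid int) error {
	f.owners[path] = owner{uid, gid}
	return nil
}

// applyOwnership calls Chown for regular files and directories if either
// the UID or the GID in the tar header is non-zero. Other entry types are
// skipped in this sketch.
func applyOwnership(fs *fakeFS, hdr *tar.Header) error {
	switch hdr.Typeflag {
	case tar.TypeReg, tar.TypeDir:
	default:
		return nil
	}
	if hdr.Uid == 0 && hdr.Gid == 0 {
		return nil
	}
	return fs.Chown(hdr.Name, hdr.Uid, hdr.Gid)
}

// demoHeaders returns a few made-up headers to exercise the rule.
func demoHeaders() []*tar.Header {
	return []*tar.Header{
		{Name: "ownership/dir/", Typeflag: tar.TypeDir, Uid: 123, Gid: 456},
		{Name: "ownership/file", Typeflag: tar.TypeReg, Uid: 123, Gid: 456},
		{Name: "etc/rootfile", Typeflag: tar.TypeReg, Uid: 0, Gid: 0},
		{Name: "ownership/link", Typeflag: tar.TypeSymlink, Uid: 123, Gid: 456},
	}
}

func main() {
	fs := newFakeFS()
	for _, hdr := range demoHeaders() {
		if err := applyOwnership(fs, hdr); err != nil {
			panic(err)
		}
	}
	// Custom ownership is recorded for the file and the directory.
	fmt.Println(fs.owners["ownership/file"], fs.owners["ownership/dir/"])
	// The root-owned file never triggers a Chown call.
	_, touched := fs.owners["etc/rootfile"]
	fmt.Println("etc/rootfile chowned:", touched)
}
```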
This is probably somewhat related to the fixed and closed #501, but in my own attempts to use melange + apko together it was impossible for me to build a package with melange where certain paths have custom ownership. Based on my current understanding, the issue should be within melange and apko is not to blame, but just to be sure, I included a full example as a reproducer.
My first attempt was running the latest Podman Desktop release on macOS 14.5 (MacBook Pro M1), but there I was not even able to use `chown` within the melange pipeline. While the commands ran successfully, the changes were not even persisted within the same `runs` step, meaning that `chown` followed by `ls` (as done in `[1]` and `[2]` in the reproducer) never showed any changes. I think the file system used during builds was somehow not able to work together with Podman, so I put the blame on the differences to Docker, especially since I do not currently have a deep understanding of the melange build process.

My second attempt, after removing Podman completely from the system, was to use Docker Desktop instead. Here the initial attempt looked far more promising, as running `ls` after `chown` or `chmod` now properly shows the changes, both for `[1]` and `[2]` in the melange reproducer config.

Unfortunately, the changed ownership is not reflected in the final output of the melange tar archive. I read through the previously linked issue and saw the adjusted tarball emitter for the data archive, so I would have assumed that `tar --list --numeric-owner -tf packages/aarch64/ownership-1.0.0-r0.apk` would show the proper UIDs/GIDs, except for the build user with `1000:1000`, which should be remapped to UID/GID `0:0`, but I get this output instead:

The UID/GID combination of `501:20` matches the UID/GID on my macOS host system, where this happens to be the primary and active user. If I install this package using apko for building a container image from it, all ownership information seems lost:

I also double-checked the tar archive from apko with `dive`, but had the same findings: there is simply no ownership information present on these files anymore, but the permissions were kept.

Based on my understanding, I would consider this a bug, as it does not match my expected behaviour. While I can use `paths` in apko, it seems like bad practice to do so, as these paths belong to the package itself, and when e.g. combining multiple packages as a service bundle in apko, I do not want to repeat package-specific path configs across multiple files.

Version Info
melange
apko
Docker
Reproducer
Steps
1) `melange keygen`
2) `melange build --signing-key melange.rsa`
3) `apko build -k melange.rsa.pub apko.yaml ownership ownership.tar`
4) `docker load < ownership.tar`
5) `docker run -it --rm ownership:latest-arm64 ls -lan /ownership`
6) Optionally inspect with further tools, e.g. `dive --source docker-archive ownership.tar`
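Besides the tools mentioned in step 6, ownership can also be checked programmatically. The following Go sketch lists the numeric UID/GID, permission bits and name of every entry in a gzip-compressed tar stream, similar in spirit to `tar --numeric-owner`. It assumes the archive can be read as a gzip-compressed tar stream, which the `tar` command used in this report suggests for the package file. `listOwners`, `sampleArchive` and `demo` are hypothetical helper names, and the sample archive is built in memory purely so the sketch runs on its own.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// listOwners reads a gzip-compressed tar stream and returns one line per
// entry in the form "uid:gid mode name".
func listOwners(r io.Reader) ([]string, error) {
	gz, err := gzip.NewReader(r)
	if err != nil {
		return nil, err
	}
	defer gz.Close()

	tr := tar.NewReader(gz)
	var out []string
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		line := fmt.Sprintf("%d:%d %04o %s", hdr.Uid, hdr.Gid, hdr.Mode&0o7777, hdr.Name)
		out = append(out, line)
	}
	return out, nil
}

// sampleArchive builds a small in-memory archive with made-up ownership,
// standing in for a real package file.
func sampleArchive() ([]byte, error) {
	var buf bytes.Buffer
	gz := gzip.NewWriter(&buf)
	tw := tar.NewWriter(gz)
	entries := []tar.Header{
		{Name: "ownership/dir/", Typeflag: tar.TypeDir, Mode: 0o750, Uid: 123, Gid: 456},
		{Name: "ownership/file", Typeflag: tar.TypeReg, Mode: 0o640, Uid: 123, Gid: 456},
	}
	for i := range entries {
		if err := tw.WriteHeader(&entries[i]); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	if err := gz.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// demo builds the sample archive and lists its entries.
func demo() ([]string, error) {
	data, err := sampleArchive()
	if err != nil {
		return nil, err
	}
	return listOwners(bytes.NewReader(data))
}

func main() {
	lines, err := demo()
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

To inspect a real file, `listOwners` could be given an `*os.File` opened on the archive instead of the in-memory sample.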
While the permission changes with `chmod` are respected, the ownership changes are lost, and both `/ownership/file` as well as `/ownership/dir` show up as being owned by root.

melange.yaml
apko.yaml