dotnet / sdk-container-builds

Libraries and build tooling to create container images from .NET projects using MSBuild
https://learn.microsoft.com/en-us/dotnet/core/docker/publish-as-container
MIT License
179 stars 39 forks source link

Better layer packing approach using `TarFile.CreateFromDirectory`? #160

Open rainersigwald opened 2 years ago

rainersigwald commented 2 years ago

@rainersigwald I know this was before the PR, but I'm not understanding this comment clearly: Is /app being prefixed or suffixed to the file path?

I ask because if it's a prefix, and it needs to be removed, we could try using the stream-based TarFile.CreateFromDirectory with includeBaseDirectory=false, see if that works.

If that's not what the comment meant, then nothing to do here.

_Originally posted by @carlossanlop in https://github.com/dotnet/sdk-container-builds/pull/153#discussion_r960857319_

rainersigwald commented 2 years ago

When taking tarballs and making them into a filesystem, the root of the tarball becomes / inside the filesystem. Dockerfiles use absolute paths like /app, but when packing that it becomes a relative path inside the tarball app/. I want to match that behavior for familiarity.

It sounds like I accidentally reimplemented TarFile.CreateFromDirectory with includeBaseDirectory=false, so switching to that makes a bunch of sense for the current implementation, but I'm not sure if it'll work once we start tweaking file attributes to solve #34. Do you think it would be helpful?

baronfel commented 2 years ago

In addition to #34, I think the requirement to set the user/group information to support #148 is going to require hand-rolling the TarEntry instances for files and directories, so the convenience methods for files/directories become even less of a fit for us.

carlossanlop commented 2 years ago

I think the requirement to set the user/group information to support

TarFile.CreateFromDirectory should add gname/uname for you if you're creating the archives in Unix. This is because the entries are stored in PAX format.

If you want to do more fine tuned modifications, then yeah, I would suggest manual iteration using TarReader and TarWriter.

baronfel commented 2 years ago

Unfortunately we won't always be creating container archives on Unix. VS publishing is a key scenario that we need to be able to support, for example.

carlossanlop commented 2 years ago

Oh I see. then in that case, you'll have to manually create the PaxTarEntry instances, manually write the file contents into DataStream, then set the GroupName/UserName property values manually, and then pass them to the TarWriter instance.

baronfel commented 2 years ago

Awesome, that matches up with what I was doing in my experimental branch.

One interesting thing I discovered is the behavior of either docker, or our Tar APIs, or the Tar spec(s) when you do not explicitly write a directory entry into the archive: docker will create a directory on the file system, but it will create it with root:root permissions instead of whatever permissions are on the files themselves. I'm not sure at which level this behavior is being introduced, but the good news is that explicitly setting the user in group metadata and explicitly creating directories solves the problem.

carlossanlop commented 2 years ago

We recently merged a fix to main (8.0) to address some directory permission problems: https://github.com/dotnet/runtime/pull/72078 (issue here: https://github.com/dotnet/runtime/issues/71140)

Would that fix solve the problem you describe? You mentioned you have the uname/gname workaround to help with your issue, but would you prefer having the permissions PR backported to 7.0 so you can use it? We still have runway, we just need to ask for permission (and if an internal partner asks for it, it might meet the bar).

carlossanlop commented 2 years ago

Ah, this other fix is closely related too: https://github.com/dotnet/runtime/pull/74002

Edit: This one got backported to 7.0-RC1: https://github.com/dotnet/runtime/pull/74277

baronfel commented 2 years ago

Perhaps? I'll take a detailed look later today and see if I can form an actual opinion 🤣

I think that we may need to still have the hands on access for the general case here though: users will eventually be able to set their own container users and groups, and as soon as they do that we're back in manual-metadata-writing land again, regardless of the directory fixes.

carlossanlop commented 2 years ago

I think that we may need to still have the hands on access

Fortunately, we introduced in 7.0 the UnixFileMode enum to make it easy to set the Mode for a tar entry and files in general.