apache / buildstream

BuildStream, the software integration tool
https://buildstream.build/
Apache License 2.0
79 stars 25 forks source link

BuildStream 2 artifacts lose the unix write permission bit for the user (`u+w`) #1790

Open doraskayo opened 1 year ago

doraskayo commented 1 year ago

Context

As far as I understand, BuildStream is supposed to preserve unix file permissions to a certain extent. However, the user write permission bit seems to be lost from files when checking out built artifacts. This issue is not seen with bst16.

Description

When checking out artifacts using --hardlinks in bst2, it seems that the u+w permission is removed from every file (or almost every file?) compared to the permission for that file when it was originally collected as part of bst build.

From limited testing with the Flatpak repo built in the freedesktop-sdk project, it appears that only the user write bit disapears, while all other unix permission bits are retained.

Slightly unrelated to this, and potentially a different issue is when checking out artifacts without --hardlinks, bst2 appears to set file permissions according to umask, resulting in file permissions which are yet again different from the file permissions that were present when the files were originally collected.

Impact

Among other issues, this behavior leads OSTree repos which are collected into artifacts to be considered corrupted by ostree fsck. This is because OSTree repo include file permissions in their hash calculaition for objects, so this file permission change is detected as a file a corruption.

This file permission change and its effect on OSTree repos was found to be reason why the freedesktop-sdk project couldn't publish its Flatpak repo to its remote release server since moving from bst16 to bst2 in the master branch.

The only workaround found for this issue was to check out artifacts without --hardlinks and with a specific umask that provides the expected file permissions. However, this results in massive disk I/O overhead for the freedesktop-sdk project.

References:

cc: @abderrahim, @gtristan, @nanonyme

nanonyme commented 1 year ago

@juergbi we were discussing this with Slack. Seems some manner of reflinks would be the ideal solution, of course reflinks are not generally available.

nanonyme commented 1 year ago

The obvious first approach is fixing expectations through documenting current behaviour for permissions. https://github.com/apache/buildstream/issues/1792 I think we really should be supporting reflinks though. There are production-ready filesystems with reflinks support like XFS for long time already.

nanonyme commented 1 year ago

The reflinks support for checkout should be considered lower priority than seeing if we can improve FUSE performance though. Even without doing anything special Python standard library will use sendfile to copy the data which is reasonably fast. It still results in a lot unnecessary hardware wear per CI run though.

gtristan commented 1 year ago

Good catch, we should definitely make this clear in the bst artifact checkout text regarding --hardlinks behavior.

If we add any additional support with reflinks, then we probably need a new CLI opt for that in order to ensure that behavior of existing API does not unexpectedly change.

nanonyme commented 1 year ago

@gtristan coreutils cp and in future also Python think reflinks is really sensible thing to try by default. I suggest we don't add further options to command-line, it's cluttered as is. Rather, we add user configuration option for this with values

auto is sensible default.

nanonyme commented 1 year ago

Note this is a big regression to us that we can no longer use checkout with --hardlinks. It adds up to 30-45 minutes extra build time for us and contributes to hardware wear. It's apparently ballpark similar extra cost as what we get due to FUSE for flatpak_repo (compared to bst1).

nanonyme commented 1 year ago

There is also an unrelated bug. bst checkout must be capable of replicating original permissions from when files were put to install-root. This functionality may be selectively enableable. (it is fine if we get a primitive so we can enable permission replication for flatpak_repo and other custom elements only)