paritytech / cachepot

cachepot is `sccache` with extra sec, which in turn is `ccache` with cloud storage
https://cachepot.cc
Apache License 2.0
171 stars 13 forks source link

Support running dist compilations in unprivileged scenarios #128

Closed Xanewok closed 2 years ago

Xanewok commented 2 years ago

Closes #114 Closes #10 Closes #9

Xanewok commented 2 years ago

I'll mark this as ready for review to trigger gitlab runs, sorry for the noise.

Xanewok commented 2 years ago

So far so good, the dist test suite on gitlab is passing 2/4 and we need bwrap available to progress further, so that's blocked on paritytech/scripts#368 and updated image deployment for now IIUC.

montekki commented 2 years ago

Merged with latest master and fixed CI (almost). I believe that remaining failing targets will be fixed when the logging targets are changed back to info.

The rootless feature seems to work, at least locally for me.

The TODOs and FIXMEs have to be fixed before merging.

So what else has to be done here?

Xanewok commented 2 years ago

I've fixed some of the outstanding issues and leftover debugging bits that I've sprinkled around.

So what else has to be done here?

One key point that I wanted to address is that the current approach was a proof of concept. Because we need to unshare the user namespace in the main thread, I opted for a fork here, which is technically not safe (see the note about it not being async-signal-safe), so that I can conveniently call it from a forked process (the forked thread is turned into the child's main thread). I tried to move out some unsafe functions (e.g. anything that allocates) out of the fork-child block but this approach not bullet-proof and more of a best effort.

To solve that, I wanted to separate another entrypoint for our compilation (think cachepot-dist sandbox) and call that instead. Additionally, we should consider introducing some kind of a run-time switch for the sandboxing implementation (like CACHEPOT_SANDBOX={bubblewrap, userns}), defaulting to the old bubblewrap logic, to err on the safe side.

However, if the test suite agrees, I'd prefer to merge this now and work incrementally from there and possibly do a point release (or an RC, cc @drahnr) to relieve the shipping pressure, which would help unblock me mentally.

Xanewok commented 2 years ago

I'm getting this error on our GitLab run:

2022-02-03T01:17:22.121 TRACE [PID 5079] dist worker Waiting on child bubblewrap build with PID 5456
2022-02-03T01:17:22.123 WARN  [PID 5456] dist worker Couldn't set up a build environment for bubblewrap: Failed to mount overlay FS: overlayfs "/tmp/cachepot_dist_testbXfSR7/distsystem/server-builddir/toolchains/12553bee8311c582858a47c1e186c21a69f1c77fe26bbe6628549c4eca67803f",upperdir="/tmp/cachepot_dist_testbXfSR7/distsystem/server-builddir/builds/12553bee8311c582858a47c1e186c21a69f1c77fe26bbe6628549c4eca67803f-1/upper",workdir="/tmp/cachepot_dist_testbXfSR7/distsystem/server-builddir/builds/12553bee8311c582858a47c1e186c21a69f1c77fe26bbe6628549c4eca67803f-1/work" -> "/tmp/cachepot_dist_testbXfSR7/distsystem/server-builddir/builds/12553bee8311c582858a47c1e186c21a69f1c77fe26bbe6628549c4eca67803f-1/target": Operation not permitted (os error 1) ("/tmp/cachepot_dist_testbXfSR7/distsystem/server-builddir/toolchains/12553bee8311c582858a47c1e186c21a69f1c77fe26bbe6628549c4eca67803f": exists, upperdir: exists, workdir: exists, same-fs, target: exists, regular-user)

which can be parsed to

Failed to mount overlay FS: overlayfs (...) Operation not permitted (os error 1)

What comes to mind is that we use 5.10 kernel but native overlayfs support in the user namespace was allowed in 5.11 (and 5.13 if we count the SELinux bug fix).

@rcny do you think it would be feasible to run these using the slightly newer kernel or should we look towards fuse-overlayfs for the time being?

Xanewok commented 2 years ago

And it seems that for a change the GHA job does not like us messing with gid_map:

Couldn't set up a build environment for bubblewrap: Failed writing to gid_map

but it has no problems if we write to setgroups and uid_map?

rcny commented 2 years ago

@rcny do you think it would be feasible to run these using the slightly newer kernel or should we look towards fuse-overlayfs for the time being?

Opened the issue about updating kernel to 5.15.x LTS.

Can you go with fuse-overlayfs for now? I'm not quite sure when someone from the team would be able to pick the task.

Xanewok commented 2 years ago

Can you go with fuse-overlayfs for now?

We talked on Element and decided to move with fuse-overlayfs for now until someone updates the kernel to 5.15 in the meantime.

Xanewok commented 2 years ago

Squashed and rebased on top of now merged #140 to clean up the history; I'll work on the fuse-overlayfs and the run-time switch for the sandbox mentioned above.

Xanewok commented 2 years ago

I implemented a feature gate behind the CACHEPOT_SANDBOX=userns environment variable - if it's unset or set to anything else, we fall back to using the regular, non-namespaced approach used before. If it's set, however, it uses the fork-then-unshare approach. I tried to simplify the implementation by using only a single UDS and by serializing the entire process output to help me abstract over a simple fn(CompileCommand) -> Result<BuildResult> build function.

I tried to minimize the diff - it might be helpful to run git diff -w to ignore the whitespace changes.

Xanewok commented 2 years ago

Okay, this is now really ready to review - I noticed a bug has slipped in and since fixing it in 67bfbee96318b66c88dc57e7516cd4b68acbf7b8 I have managed to get GHA working using user namespaces, so at least we'll have some CI coverage for that feature :tada:

~GitLab test is marked as allow_failure but marks the GitHub suite incorrectly as failing - should I just remove GitLab test for now until we update the kernel/implement fuse-overlayfs?~

EDIT: Removed GitLab for now, will bring it back once I close #143

Xanewok commented 2 years ago

Added links and error log in fd3e3bd22fdba7b83684b2367ca9156109efe108