evansd closed this issue 2 years ago
Assuming that we'll mount the working directory into `/workspace` here, one issue this raises is permissions. All files created by docker will end up owned by the wrong user (root, at the moment). This means that when you then try to run actions via `opensafely run` you have the possibility of hitting permissions errors. One way of working around this would be to set the `ENABLE_PERMISSIONS_WORKAROUND` config option in local run mode. This was originally designed to get us out of a hole in the Windows production environment and I sort of assumed we'd be able to abandon it one day. But maybe it will be useful for us longer term.
The permissions issue is a challenge.
On *nix, we could fix this by doing a local `docker build` to modify UIDs to match the host's user (like we do for our dev docker stuff), and using that local image to run the command. But that is horrible. We could also look at `userns-remap` in docker, but that is also horrible. We could maybe do sshfs; also horrible.
For Windows, I think the situation is actually better. The permissions would only be wrong for files that were created by the docker process, not files that already existed and were then shared with docker. Also, for local use, we may be an admin user, and can fix up the permissions of any new files after `docker exec`.
We can "just" use the `--user` argument to `docker run` to specify the uid and gid, although the corresponding user and group might not exist in the container. The extent to which that's a problem depends on exactly what we're running, but given that we control these images that might be OK.
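As a minimal sketch of that `--user` approach (the image name and `/workspace` mount are illustrative placeholders from this thread, not the real wrapper's behaviour), with the function printing the command rather than executing it:

```shell
# Hypothetical sketch only: the image name and mount path are
# placeholders from this discussion, not the actual CLI's behaviour.
build_docker_cmd() {
  local uid gid
  uid=$(id -u)
  gid=$(id -g)
  # Run the container as the host user's uid:gid, even though no
  # matching user/group entry may exist inside the image.
  echo docker run --rm \
    --user "${uid}:${gid}" \
    -v "$PWD:/workspace" \
    ghcr.io/opensafely-core/some-action "$@"
}

build_docker_cmd python analysis/script.py
```

Files the container writes under `/workspace` then come out owned by the host user, at the cost of the process running as a uid the image has no passwd entry for.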
Alternatively, it's possible we could do something with the entrypoint script, like have it accept a `TARGET_UID` env var and then get it to create and switch to that user before running anything else. Obviously that's also horrible, but maybe doing it on the fly like this is less horrible than having to pre-build another image with a new user. I dunno...
Hmm, yeah the situation is different for local run, which means we might be able to do it on the fly.
So, our action images would have a user `opensafely` with a hardcoded UID (like we do with the release-hatch and job-runner images). In prod, we run as `--user $HARDCODED_UID`, and everything is the same uid on host and container, which is not root, and we're happy.
However, in local run, we can instead do something like `docker run --user 0 -e TARGET_UID=$(id -u) ...`, and the entrypoint can `usermod -u $TARGET_UID opensafely` and `su opensafely -c ...` because it is root.
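A sketch of what such an entrypoint might look like (`TARGET_UID` and the `opensafely` user are just the names assumed in this thread, not any existing image's entrypoint): when started as root with `TARGET_UID` set, it remaps the baked-in user and drops privileges; otherwise it simply execs the command.

```shell
# Hypothetical entrypoint sketch; TARGET_UID and the `opensafely` user
# are assumptions from this discussion, not an existing image's API.
entrypoint() {
  if [ "$(id -u)" = "0" ] && [ -n "${TARGET_UID:-}" ]; then
    # Local-run path: container started with --user 0, so remap the
    # baked-in user's UID to the host user's, then drop privileges.
    usermod -u "$TARGET_UID" opensafely
    exec su opensafely -c "$*"
  else
    # Prod path: already running as the hardcoded non-root UID.
    exec "$@"
  fi
}

# Run in a subshell so `exec` replaces the subshell, not this script.
( entrypoint echo "running as uid $(id -u)" )
```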
That would work nicely for regular `opensafely run` as well as `opensafely exec`.
Put into the ideas bucket
Initial implementation in:
I don't think this can go in the ideas bucket; or at least, it needs to be finished off fairly promptly, and it's not a lot of work.
@evansd added this to scratch a development itch, but it's now available to users. So we need to finish it off sooner rather than later, or else users may start using an undocumented, half-finished feature.
There has to be a way for us to schedule small improvements as we encounter them, without requiring a full work pipeline discovery, or getting queued up behind bigger initiatives, at which point we might as well just say we'll never get to it.
Something this size is very unlikely to be prioritized over strategic things.
How much work, roughly, is remaining on this? Is it hours, days or weeks?
Is it important for databuilder or just generally when working using opensafely?
This sounds like it's no longer a candidate for an initiative, as it's likely to now be too small. We can't only work on "strategic things", we need to make room somehow for small improvements, maintenance tasks, etc. I'm asking the above questions so I can understand how it compares with the other non-initiative work we have in the backlog.
The implementation is maybe 3 hours, probably less. The testing, and possibly fixing of each image to make sure they work with this OK, is maybe a couple more hours. Documenting it is maybe another couple, with review.
So it's maybe about 8 hours of work total.
The primary use case is developers and advanced users, but it opens the way to plug our docker envs into more tooling down the line.
Also, I got confused, as I thought this was the issue Dave had just filed, which had been immediately closed and moved to future work!
Probably should have this discussion over there! #149
Strawman:
This is little more than a thin wrapper around docker to supply the right args and prefix the right container registry. (Oh, and download the appropriate image too if necessary.)
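As a strawman of that thin wrapper's core (the registry prefix and image name are assumptions based on this discussion; the function echoes the final command instead of executing it, and a real version would also `docker pull` a missing image):

```shell
# Sketch only: the registry prefix and image name are assumptions
# from this discussion, not the wrapper's actual behaviour.
REGISTRY="ghcr.io/opensafely-core"

wrap() {
  local image="$1"; shift
  # A real wrapper would `docker pull "$REGISTRY/$image"` here if the
  # image is not already present locally.
  echo docker run --rm -it \
    -v "$PWD:/workspace" \
    "${REGISTRY}/${image}" "$@"
}

wrap cohortextractor --help
```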
Having this is important so that people can do things like access help pages (as above) and also to provide a fast feedback loop for getting the right syntax for `run` commands to put in `project.yaml`.