ga4gh / task-execution-schemas

TES and seccomp confinement #202

Open stxue1 opened 1 month ago

stxue1 commented 1 month ago

Is a TES task currently allowed to use all available system calls? If the host kernel supports seccomp and the task container is run with Docker, a number of system calls are blocked by default by Docker's seccomp profile.
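For anyone who wants to check whether a task is actually confined, here is a minimal sketch that reads the `Seccomp:` field from a `/proc/<pid>/status` dump (Linux only). The function name `seccomp_mode` is ours, purely for illustration; the field values 0, 1 and 2 are the kernel's disabled, strict and filter modes.

```python
from typing import Optional

# Kernel-defined values of the "Seccomp:" field in /proc/<pid>/status.
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text: str) -> Optional[str]:
    """Return the seccomp mode named in a /proc/<pid>/status dump, or None."""
    for line in status_text.splitlines():
        # Match "Seccomp:" exactly so that "Seccomp_filters:" is skipped.
        if line.startswith("Seccomp:"):
            value = int(line.split(":", 1)[1].strip())
            return SECCOMP_MODES.get(value, f"unknown ({value})")
    return None


if __name__ == "__main__":
    try:
        with open("/proc/self/status") as handle:
            print(seccomp_mode(handle.read()))
    except OSError:
        print("no /proc/self/status on this platform")
```

Run inside the task container, this should report `filter` when a seccomp profile such as Docker's default is applied. It does not tell you which system calls are blocked.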

We want access to some of these syscalls for our WDL/CWL runners, because that would let us run Singularity within the TES task.

We saw that TES 1.1 makes certain backend_parameters available. Is there a standard way of specifying these parameters, or something similar, to tell TES which permissions I want for a task?
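To make the question concrete, this is roughly the kind of request we have in mind. It is only a sketch: `resources.backend_parameters` and `backend_parameters_strict` are the TES 1.1 fields, but the `seccomp_profile` key, its value, the task name, and the image name are all made up for illustration, and we don't know of any backend that recognizes such a key.

```python
import json


def build_task(image, command, backend_parameters, strict=False):
    """Build a TES task body that carries backend-specific parameters."""
    return {
        "name": "nested-container-task",  # hypothetical task name
        "executors": [{"image": image, "command": list(command)}],
        "resources": {
            # TES 1.1: free-form string-to-string map; which keys are
            # understood is up to the backend.
            "backend_parameters": dict(backend_parameters),
            # When true, the backend is expected to fail the task if it
            # does not support one of the keys above.
            "backend_parameters_strict": strict,
        },
    }


task = build_task(
    image="example/worker:latest",  # hypothetical image
    command=["echo", "hello"],
    # Hypothetical key and value, NOT part of the TES spec.
    backend_parameters={"seccomp_profile": "unconfined"},
    strict=True,
)
print(json.dumps(task, indent=2))
```

Setting `backend_parameters_strict` to true would at least make an unsupported key fail loudly rather than being ignored silently.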

uniqueg commented 2 weeks ago

As far as I know, there is no TES way of requesting access to system calls that are blocked by seccomp (if enabled). I am also sceptical about adding this to the TES spec in a way that would force all compliant implementers to support it (I guess seccomp is there for a reason). Possibly this could instead be specced out as an extension?

In any case, for this particular use case, would it make sense to add support for Singularity to Funnel? Also, could you possibly avoid the container-in-container design you are currently using by having your outer TES job submit further TES jobs rather than spawn containers? Or perhaps by making use of multiple executors (which may share the same volume)?
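To sketch what I mean by multiple executors: a single TES task can list several executors, which run one after another, and `volumes` names directories that can be used to share data between them. The helper name, task name, image names, and commands below are placeholders, not anything Toil or Funnel provides.

```python
def build_multi_executor_task(steps, shared_volume="/shared"):
    """Build one TES task whose executors run in order and share a volume.

    `steps` is a list of (image, command) pairs, one per executor.
    """
    return {
        "name": "multi-step-task",  # hypothetical task name
        # Directory made available to every executor of this task.
        "volumes": [shared_volume],
        "executors": [
            {
                "image": image,
                "command": list(command),
                "workdir": shared_volume,
            }
            for image, command in steps
        ],
    }


task = build_multi_executor_task(
    [
        # Hypothetical images and commands, for illustration only.
        ("example/prepare:latest", ["sh", "-c", "echo data > input.txt"]),
        ("example/tool:latest", ["sh", "-c", "cat input.txt > output.txt"]),
    ]
)
print([executor["image"] for executor in task["executors"]])
```

The catch is that the list of executors has to be known when the task is submitted, so this only helps if the inner containers can be determined up front.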

stxue1 commented 2 weeks ago

We don't currently have or test a Singularity implementation of Toil, so I'm unsure whether Singularity under Funnel would help, though I think it might, since it seems possible to run unprivileged Singularity inside Singularity. I'm unsure whether the unprivileged nested container would cause any issues, though.

We would like to eventually get rid of the Docker-in-Docker implementation, but that means working through quite a bit of technical debt. The easy way to get things working immediately is to have more backend privileges, as we are still unsure whether Docker in Docker can work out of the box. Avoiding the issue entirely is an option, though we would not be utilizing TES fully: it would work, but for things like WDL, only the bash scripts would run on nodes, with everything else on the leader.

I'm not sure using multiple executors would work. A Toil container will try to run a nested container whenever it wants. It might be possible if we don't allow nested containers to run and instead consult the batch system, though sending and receiving other containers that way would also take quite a bit of engineering.

uniqueg commented 2 weeks ago

Tricky problem indeed. I'm still not sure about a general addition to TES. If you know you are using Funnel, you could maybe work out something implementation-specific (e.g., via env vars) - with a view to a possible (optional) TES extension in the mid-term.
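A rough sketch of what the env var route could look like on the client side, using the standard `env` map on a TES executor. The helper name `with_env_hint` and the variable name `EXAMPLE_SECCOMP_PROFILE` are invented here; Funnel does not, as far as I know, interpret any such variable today, so the backend would need to be changed to read it from the task before launching the container.

```python
import copy


def with_env_hint(task, name, value):
    """Return a copy of a TES task with an env var set on every executor."""
    hinted = copy.deepcopy(task)  # leave the caller's task untouched
    for executor in hinted.get("executors", []):
        executor.setdefault("env", {})[name] = value
    return hinted


base_task = {
    "executors": [
        # Hypothetical image and command.
        {"image": "example/worker:latest", "command": ["echo", "hello"]}
    ]
}

# Hypothetical variable name, NOT something any backend is known to honour.
hinted_task = with_env_hint(base_task, "EXAMPLE_SECCOMP_PROFILE", "unconfined")
print(hinted_task["executors"][0]["env"])
```

Backends that do not know the variable would simply pass it through to the container, so the task would still run, only without the extra permissions.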