Backend.AI is a streamlined, container-based computing cluster platform that hosts popular computing/ML frameworks and diverse programming languages, with pluggable heterogeneous accelerator support including CUDA GPU, ROCm GPU, TPU, IPU and other NPUs.
Now we can implement get_additional_syscalls() in the accelerator implementation to allow additional system calls in the containers.
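A minimal sketch of the idea, not the actual Backend.AI code. It assumes each accelerator plugin exposes get_additional_syscalls() returning a list of syscall names, and that the agent appends them as one extra allow rule to a Docker-format seccomp profile. The `SyscallProvider` protocol and the `merge_additional_syscalls` helper are hypothetical names used only for illustration.

```python
import copy
from typing import Any, Iterable, Protocol


class SyscallProvider(Protocol):
    """Hypothetical view of an accelerator plugin (only the method from this PR is assumed)."""

    def get_additional_syscalls(self) -> list[str]: ...


def merge_additional_syscalls(
    profile: dict[str, Any],
    plugins: Iterable[SyscallProvider],
) -> dict[str, Any]:
    """Return a copy of `profile` with one extra allow rule for plugin-requested syscalls."""
    extra: list[str] = []
    for plugin in plugins:
        for name in plugin.get_additional_syscalls():
            if name not in extra:  # de-duplicate while keeping first-seen order
                extra.append(name)

    merged = copy.deepcopy(profile)  # leave the caller's profile untouched
    if extra:
        merged.setdefault("syscalls", []).append(
            {"names": extra, "action": "SCMP_ACT_ALLOW"}
        )
    return merged
```

Returning a copy rather than mutating the default profile means a plugin's extra system calls would apply only to the container being created, which is one plausible design choice and not necessarily what this PR does.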
[!TIP]
It would be worth automating, through CI, a check that default-seccomp.json is up to date, along with the update itself.
Although it is not directly related to this PR, the PR below may be a useful reference for automating that update.
https://github.com/lablup/backend.ai-jail/pull/18
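As a rough illustration of what such a CI check might look like (a sketch under assumptions, not what the linked PR does), the script below compares the local profile with the upstream moby profile as parsed JSON. The raw-content URL is an assumption derived from the moby path that this PR cites as the source of the profile, and `fetch_upstream`, `profiles_differ`, and `ci_exit_code` are hypothetical helper names.

```python
import json
import urllib.request

# Assumed raw-content URL for the upstream profile; verify before relying on it.
UPSTREAM_URL = (
    "https://raw.githubusercontent.com/moby/moby/master/profiles/seccomp/default.json"
)


def fetch_upstream(url: str = UPSTREAM_URL) -> str:
    """Download the upstream profile text (performs network I/O)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def profiles_differ(local_text: str, upstream_text: str) -> bool:
    """Compare profiles as parsed JSON.

    Whitespace and object key order are ignored; array order is not.
    """
    return json.loads(local_text) != json.loads(upstream_text)


def ci_exit_code(local_text: str, upstream_text: str) -> int:
    """Return 1 (fail the CI job) when the local copy is out of date, else 0."""
    return 1 if profiles_differ(local_text, upstream_text) else 0
```

A CI job could read the local file, call `fetch_upstream()`, and exit with `ci_exit_code(...)`; opening an update PR automatically would be a separate step.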
First, block some of the system calls that are essential for session creation in the default-seccomp.json file.
Session creation then fails, as shown below.
❯ ./backend.ai session create python
✗ Session ID 1c8844a4-0e41-4b3b-9098-1dab7cdc97e9 has an error during scheduling/startup or cancelled.
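One way to reproduce the blocking step, sketched under the assumption that default-seccomp.json keeps the Docker profile layout (a `defaultAction` plus a `syscalls` list of rules with `names` and `action`). The `block_syscalls` helper is hypothetical; the PR does not say which system calls were removed, so the names to block are left to the caller.

```python
import copy
from typing import Any, Iterable


def block_syscalls(profile: dict[str, Any], blocked: Iterable[str]) -> dict[str, Any]:
    """Return a copy of a Docker-format seccomp profile with `blocked` removed from allow rules.

    Removed calls fall through to the profile's defaultAction.
    """
    blocked_set = set(blocked)
    result = copy.deepcopy(profile)  # do not modify the caller's profile
    rules = []
    for rule in result.get("syscalls", []):
        if rule.get("action") == "SCMP_ACT_ALLOW":
            rule["names"] = [n for n in rule.get("names", []) if n not in blocked_set]
            if not rule["names"]:
                continue  # drop allow rules that became empty
        rules.append(rule)
    result["syscalls"] = rules
    return result
```

With a `defaultAction` of `SCMP_ACT_ERRNO`, as in the upstream Docker profile, any call removed this way is denied inside the container.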
Next, let’s implement the get_additional_syscalls() method in the CUDA MockPlugin class to return the blocked system calls.
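The override might look roughly like the sketch below. Both classes are hypothetical stand-ins (the real base class and mock plugin live in the Backend.AI codebase and are not reproduced here), the return type is assumed to be a list of syscall names, and the entries in `BLOCKED_FOR_TEST` are placeholders for whichever calls were removed from default-seccomp.json.

```python
class ComputePluginSketch:
    """Hypothetical stand-in for the plugin base class: no extra syscalls by default."""

    def get_additional_syscalls(self) -> list[str]:
        return []


class MockPluginSketch(ComputePluginSketch):
    """Hypothetical stand-in for the CUDA mock plugin used in this test."""

    # Placeholder names; replace with the calls actually blocked in the profile.
    BLOCKED_FOR_TEST = ["clone", "execve"]

    def get_additional_syscalls(self) -> list[str]:
        # Return a copy so callers cannot mutate the class-level list.
        return list(self.BLOCKED_FOR_TEST)
```

Keeping an empty default in the base class would let plugins that need nothing extra skip the override entirely.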
Lastly, let’s create the session again with the mock plugin resource options. This time, session creation succeeds.
❯ ./backend.ai session create -r cuda.shares=1 python
∙ Session ID 81dc4784-2271-48a6-a512-a3342032f53b is created and ready.
∙ This session provides the following app services: sshd, ttyd, jupyter, jupyterlab
Checklist: (if applicable)
[x] Milestone metadata specifying the target backport version
Resolves #2931.
Reference
default-seccomp.json is copy-pasted from https://github.com/moby/moby/blob/master/profiles/seccomp/default.json
How to Test
This PR can be tested with the block-and-retry procedure shown above.