Open l0kod opened 8 months ago
Because TIDs are tied to PID namespaces
Will the TIDs used here be host visible ones and not the ones within the PID namespace? If these TIDs are host visible ones then we can rest assured that they will be unique per process.
Secondly, does this design allow checking if a thread is sandboxed from within and outside the PID namespace?
Because TIDs are tied to PID namespaces
Will the TIDs used here be host visible ones and not the ones within the PID namespace? If these TIDs are host visible ones then we can rest assured that they will be unique per process.
A challenge with IDs, and especially ones available to unprivileged processes, is that they must not leak information (e.g. the number of existing or created Landlock domains). Tying domain IDs to PID namespaces makes them relative to the process reading such a value. In the same way, a /proc filesystem shows different content depending on the PID namespace in which it was mounted. The same Landlock domain could then have different IDs depending on the /proc in which it is read.
The idea is to see the same things as with a PID: a process will see a domain ID with value 0.0 if the thread that created this domain was in a parent or sibling namespace (I guess what you call "host"), or it will see a useful value such as 123.0 if this thread was in the same or a nested PID namespace (e.g. the watcher process being in the "host"). Most of the time, legitimate processes looking at these values will be either in the same or a parent PID namespace of the thread that created the Landlock domain under review.
The audit log will contain a domain ID value according to the initial PID namespace, so never 0.0.
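To make the visibility rule above concrete, here is a toy userspace model of it. It is only a sketch: the `toy_*` types and the function are hypothetical and not kernel code, and the per-level TID array merely mimics the way a task has one number per PID namespace level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_LEVEL 4

/* Hypothetical model of a PID namespace hierarchy (not a kernel type). */
struct toy_pid_ns {
	const struct toy_pid_ns *parent; /* NULL for the initial namespace */
	unsigned int level;              /* 0 for the initial namespace */
};

/* Hypothetical model of the thread that created a Landlock domain. */
struct toy_thread {
	const struct toy_pid_ns *ns;     /* namespace the thread lives in */
	uint32_t tid[TOY_MAX_LEVEL];     /* TID as seen from levels 0..ns->level */
};

/*
 * Returns the creator's TID as seen from the viewer's namespace, or 0 when
 * the creator lives in a parent or sibling namespace of the viewer.
 */
static inline uint32_t toy_tid_seen_from(const struct toy_thread *creator,
					 const struct toy_pid_ns *viewer)
{
	const struct toy_pid_ns *ns;

	assert(creator != NULL && viewer != NULL);
	ns = creator->ns;

	/* The viewer is nested deeper than the creator: nothing is visible. */
	if (viewer->level > ns->level)
		return 0;

	/* Walk up from the creator's namespace to the viewer's level. */
	while (ns->level > viewer->level)
		ns = ns->parent;

	/* Same namespace at that level: visible. Otherwise it is a sibling. */
	return ns == viewer ? creator->tid[viewer->level] : 0;
}
```

With an initial namespace and two child namespaces A and B, a creator thread living in A is visible from the initial namespace and from A (with different TID values), but reads as 0 from B.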
Secondly, does this design allow checking if a thread is sandboxed from within and outside the PID namespace?
It allows a process A with a /proc mount point X to check if another process B is sandboxed if the thread C that initially sandboxed itself (e.g. it may be a parent of B, or C itself) is visible in the PID namespace of X (i.e. the namespace of the process that mounted X).
The /proc/<pid>/attr/ interface is now deprecated. Instead, following the same approach as pidfd's IOCTL to get namespace file descriptors (with appropriate permission checks), we could implement a new command to get a Landlock domain's (read-only) file descriptor. This FD could then be used to get the domain ID with an IOCTL command (race-condition free, relative to the calling task, and compatible with kcmp(2)), and it would also be useful for future use cases (e.g. copy into a new ruleset, control cross-domain accesses). We could then have two IOCTL commands: one to get the absolute/audit ID (requiring CAP_AUDIT_READ), and another to get the relative ID (unprivileged, tied to the PID of the initial restricted task and relative to the current PID namespace). Both values would be 64 bits (e.g. PID << 32 | version).
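As a minimal sketch of the 64-bit layout mentioned above (PID << 32 | version), the helpers below pack and unpack such a value. The function names are hypothetical and the exact layout is only the example given in this comment, not an existing kernel interface.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not kernel API): the names and the layout are
 * assumptions based on the "PID << 32 | version" example.
 */
static inline uint64_t domain_id_pack(uint32_t pid, uint32_t version)
{
	/* Widen before shifting: shifting a 32-bit value by 32 is undefined. */
	return ((uint64_t)pid << 32) | version;
}

static inline uint32_t domain_id_pid(uint64_t id)
{
	return (uint32_t)(id >> 32);
}

static inline uint32_t domain_id_version(uint64_t id)
{
	return (uint32_t)(id & UINT32_MAX);
}

/* Round-trip check: unpacking returns what was packed. */
static inline void domain_id_selftest(void)
{
	const uint64_t id = domain_id_pack(123, 2);

	assert(domain_id_pid(id) == 123);
	assert(domain_id_version(id) == 2);
}
```

Keeping the PID in the upper half means a relative ID whose PID is not visible from the reader's namespace would have a zero upper half, which matches the 0.0 case discussed earlier.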
Here is a new proposal for the ID generator: https://lore.kernel.org/all/20241022161009.982584-5-mic@digikod.net/
Instead of managing absolute and relative IDs, only use absolute IDs that are OK to be exposed to unprivileged processes.
These IDs have important properties:
This approach is more secure than other kernel IDs such as socket's inodes.
We can now extend the new PIDFD_GET_INFO IOCTL.
It would be useful to identify the Landlock domain restricting threads for testing and auditing purposes. To make it easy to get this information, we could create a /proc/<pid>/attr/landlock/current entry containing a Landlock domain ID.
Dealing with IDs may be challenging in the context of Landlock because of the unprivileged constraints (e.g. confidentiality: don't leak information about other domains). We should also keep CRIU in mind, which should be able to recreate the same IDs without too much trouble.
I think the most promising approach is to rely on the ID of the thread that restricts itself. We can rely on struct pid and get_task_pid() to safely manage Landlock domain IDs. This way, we can have the guarantee that the domain's ID is tied to the domain's lifetime, which is a superset of the sandboxed tasks' lifetime. It should also be a superset of the /proc/<pid>/attr/landlock/current file descriptor's lifetime. In the future, we could also use the related file descriptor to reference a domain (instead of with the raw ID value).
Because a thread can sandbox itself several times (up to 16), we also need a version to make such a domain ID unique at a given time. The content of /proc/<pid>/attr/landlock/current could then look like 123.0
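For illustration, a reader of such an entry could split the value into its thread ID and version parts as sketched below. This assumes the proposed "<tid>.<version>" text format with an optional trailing newline; the struct and function names are hypothetical, and the /proc entry itself is only a proposal.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical representation of a parsed "<tid>.<version>" domain ID. */
struct domain_ref {
	uint32_t tid;     /* thread that created the domain, 0 if not visible */
	uint32_t version; /* distinguishes successive domains of that thread */
};

/*
 * Parses text such as "123.0" or "123.0\n". Returns false when the input
 * does not match the assumed format or has trailing garbage.
 */
static inline bool parse_domain_ref(const char *text, struct domain_ref *out)
{
	uint32_t tid, version;
	int consumed = 0;

	assert(out != NULL);
	if (text == NULL)
		return false;

	if (sscanf(text, "%" SCNu32 ".%" SCNu32 "%n", &tid, &version,
		   &consumed) != 2)
		return false;

	/* Accept one optional trailing newline, as procfs files often have. */
	if (text[consumed] == '\n')
		consumed++;
	if (text[consumed] != '\0')
		return false;

	out->tid = tid;
	out->version = version;
	return true;
}
```

This is a lenient sketch: sscanf() tolerates leading whitespace, so a stricter parser would be needed if the format had to be validated exactly.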
Because TIDs are tied to PID namespaces and thus to the proc filesystem, it would be a good fit to use the /proc/<pid>/attr/landlock/current interface. This should also make it compatible with the new lsm_get_self_attr() system call.
These domain IDs must also be part of audit logs: #3