Identify tasks' domain

l0kod commented 8 months ago

It would be useful to identify the Landlock domain restricting a thread, for testing and auditing purposes. To make this information easy to get, we could create a /proc/<pid>/attr/landlock/current entry containing a Landlock domain ID.

Dealing with IDs may be challenging in the context of Landlock because of the unprivileged constraints (e.g. confidentiality: don't leak information about other domains). We should also keep in mind that CRIU should be able to recreate the same IDs without too much trouble.

I think the most promising approach is to rely on the ID of the thread that restricts itself. We can rely on struct pid and get_task_pid() to safely manage Landlock domain IDs. This way, we have the guarantee that the domain's ID is tied to the domain's lifetime, which is a superset of the sandboxed tasks' lifetime. It should also be a superset of the /proc/<pid>/attr/landlock/current file descriptor's lifetime. In the future, we could also use the related file descriptor to reference a domain (instead of the raw ID value).

Because a thread can sandbox itself several times (up to 16), we also need a version to make such a domain ID unique at a given time. The content of /proc/<pid>/attr/landlock/current could then look like 123.0
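To make the proposed text form concrete, here is a minimal sketch of formatting and parsing a "<tid>.<version>" value. The struct and function names are hypothetical (nothing like this exists in the kernel or in a library), and the exact file content is still only a proposal at this point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical representation: creator TID plus a version counter. */
struct domain_id {
	unsigned int tid;     /* TID of the thread that restricted itself */
	unsigned int version; /* distinguishes successive self-restrictions */
};

/* Formats an ID as "<tid>.<version>"; returns false if buf is too small. */
static bool domain_id_format(const struct domain_id *id, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%u.%u", id->tid, id->version);

	return n > 0 && (size_t)n < len;
}

/* Parses "<tid>.<version>", tolerating trailing whitespace only. */
static bool domain_id_parse(const char *text, struct domain_id *id)
{
	char trailing;

	/* Exactly two conversions must succeed; a third means trailing junk. */
	return sscanf(text, "%u.%u %c", &id->tid, &id->version, &trailing) == 2;
}

/* Round trip of the example value from the comment above. */
void domain_id_example(void)
{
	struct domain_id in = { .tid = 123, .version = 0 }, out;
	char buf[32];

	assert(domain_id_format(&in, buf, sizeof(buf)));
	assert(strcmp(buf, "123.0") == 0);
	assert(domain_id_parse(buf, &out));
	assert(out.tid == in.tid && out.version == in.version);
}
```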

Because TIDs are tied to PID namespaces, and therefore to the proc filesystem, the /proc/<pid>/attr/landlock/current interface would be a good fit. This should also make it compatible with the new lsm_get_self_attr() system call.

These domain IDs must also be part of audit logs: #3

praveen-pk commented 8 months ago

> Because TIDs are tied to PID namespaces

Will the TIDs used here be host visible ones and not the ones within the PID namespace? If these TIDs are host visible ones then we can rest assured that they will be unique per process.

Secondly, does this design allow checking if a thread is sandboxed from within and outside the PID namespace?

l0kod commented 8 months ago

> Because TIDs are tied to PID namespaces
>
> Will the TIDs used here be host visible ones and not the ones within the PID namespace? If these TIDs are host visible ones then we can rest assured that they will be unique per process.

A challenge with IDs, and especially ones available to unprivileged processes, is that they must not leak information (e.g. the number of existing or created Landlock domains). Tying domain IDs to PID namespaces makes them relative to the process reading such a value. In the same way, a /proc filesystem doesn't show the same thing depending on the PID namespace in which it was mounted. The same Landlock domain could then have different IDs depending on the /proc in which it is read.

The idea is to see the same things as with a PID: a process will see a domain ID with value 0.0 if the thread that created this domain was in a parent or sibling namespace (I guess what you call "host"), or it will see a useful value 123.0 if this thread was in the same or a nested PID namespace (e.g. the watcher process being in the "host"). Most of the time, legitimate processes looking at these values will be either in the same or a parent PID namespace of the thread that created the Landlock domain under review.

The audit log will contain a domain ID value according to the initial PID namespace, so never 0.0

> Secondly, does this design allow checking if a thread is sandboxed from within and outside the PID namespace?

It allows a process A with a /proc mount point X to check if another process B is sandboxed if the thread C that initially sandboxed itself (e.g. it may be a parent of B, or C itself) is visible in the PID namespace of X (i.e. the namespace of the process that mounted X).

l0kod commented 1 month ago

The /proc/<pid>/attr/ interface is now deprecated. Instead, following the same approach as pidfd's IOCTL to get namespace file descriptors (with appropriate permission checks), we could implement a new command to get a Landlock domain's (read-only) file descriptor. This FD could then be used to get the domain ID with an IOCTL command (race-condition free, relative to the calling task, and compatible with kcmp(2)), and it would also be useful for future use cases (e.g. copy into a new ruleset, control cross-domain accesses). We could then have two IOCTL commands: one to get the absolute/audit ID (requiring CAP_AUDIT_READ), and another to get the relative ID (unprivileged, tied to the PID of the initial restricted task and relative to the current PID namespace). Both values would be 64 bits (e.g. PID << 32 | version).
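As an illustration of the 64-bit layout mentioned above (PID << 32 | version), here is a small sketch. The helper names are hypothetical and not part of any kernel UAPI; the only thing taken from the proposal is the 32/32 split.

```c
#include <assert.h>
#include <stdint.h>

/* Packs a PID and a version into one 64-bit value: PID << 32 | version. */
static uint64_t domain_id_pack(uint32_t pid, uint32_t version)
{
	/* Widen before shifting: shifting a 32-bit value by 32 is undefined. */
	return ((uint64_t)pid << 32) | version;
}

/* Extracts the PID stored in the upper 32 bits. */
static uint32_t domain_id_pid(uint64_t id)
{
	return (uint32_t)(id >> 32);
}

/* Extracts the version stored in the lower 32 bits. */
static uint32_t domain_id_version(uint64_t id)
{
	return (uint32_t)(id & UINT32_MAX);
}

/* Round trip for PID 123, version 2. */
void domain_id_pack_example(void)
{
	uint64_t id = domain_id_pack(123, 2);

	assert(domain_id_pid(id) == 123);
	assert(domain_id_version(id) == 2);
}
```

With this layout, two domains created by the same thread differ only in the lower half, so comparing the upper halves tells whether they share a creator (within the same PID namespace view).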

l0kod commented 3 weeks ago

Here is a new proposal for the ID generator: https://lore.kernel.org/all/20241022161009.982584-5-mic@digikod.net/

Instead of managing absolute and relative IDs, only use absolute IDs that are OK to be exposed to unprivileged processes.

These IDs have important properties:

They are unique during the lifetime of the running system thanks to the 64-bit values: at worst, 2^60 - 2*2^32 useful IDs.
They are always greater than 2^32 and must therefore be stored in 64-bit integer types.
The initial ID (at boot time) is randomly picked between 2^32 and 2^33, which limits collisions in logs between different boots.
IDs are sequential, which enables users to order them.
IDs may not be consecutive but increase with a random 2^4 step, which limits side channels.

This approach is more secure than other kernel IDs such as socket inode numbers.
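The properties listed above can be modeled in user space with a few lines. This is only a sketch under assumptions: the struct and function names are hypothetical, the random bits are passed in by the caller to keep the model deterministic, and a real kernel implementation would need an atomic counter and a proper random source. The step is taken here as 1 plus 4 random bits, which is one way to read "a random 2^4 step".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model of the ID generator (not thread-safe). */
struct id_generator {
	uint64_t next; /* next ID to hand out */
};

/* Picks the first ID in [2^32, 2^33) from 32 random bits. */
static void id_generator_init(struct id_generator *gen, uint32_t random_32bits)
{
	gen->next = (UINT64_C(1) << 32) + random_32bits;
}

/* Returns the next ID, then advances by a step in [1, 16]. */
static uint64_t id_generator_get(struct id_generator *gen, uint8_t random_4bits)
{
	uint64_t id = gen->next;

	/* Only the low 4 bits are used, so IDs stay ordered but hard to guess. */
	gen->next += 1 + (random_4bits & 0xf);
	return id;
}

/* IDs are strictly increasing and never fit in 32 bits. */
void id_generator_example(void)
{
	struct id_generator gen;
	uint64_t first, second;

	id_generator_init(&gen, 42);
	first = id_generator_get(&gen, 5);
	second = id_generator_get(&gen, 0);
	assert(first > UINT32_MAX);
	assert(second == first + 6);
}
```

Wasting up to 15 values per allocation is what reduces the usable range from 2^64 to roughly 2^60 IDs in the worst case.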

l0kod commented 2 days ago

We can now extend the new PIDFD_GET_INFO IOCTL.

landlock-lsm / linux

Identify tasks' domain #26