Open smira opened 9 months ago
Protecting stuff is one thing; another problem right now is that they all share the same filesystem. That means if we use the local path provisioner and consume all the space, etcd will crash due to out-of-disk-space errors. In my old k3s setup I used to have LVM volumes for each of the consumers like etcd, longhorn, local-path and so on....
Problem Statement
- At the moment `/var` (also known as the `EPHEMERAL` partition) doesn’t have any specific structure: users are allowed to create mount points and put user files at random locations under `/var`.
- For the pod `hostPath` mount to work properly with all the features supported by the `kubelet`, the mount path should be available in the `kubelet` mount namespace the same way as in the host namespace. This requires manual and non-obvious configuration.
- For the external mounts (e.g. NFS) to work properly if mounting is done from the `kubelet`, the mount path should be set up in the `kubelet`.
- Talos doesn’t offer a way to put user files in a truly ephemeral location (i.e. using `tmpfs`), so that a reboot is enough to clean up the state.
- Talos doesn’t support full reconciliation for the `machine.files` key, as the contents of `/var` are not known, and the effect of removing a value from `machine.files` is not clear.
- There’s no way to remove parts of `/var` (e.g. if some directory was created by mistake).
- Some critical or system-important parts of `/var` are not protected from simple mistakes (e.g. creating a wrong `hostPath` mount under the `etcd` data directory).
What’s in `/var`?
- `lib/etcd` - etcd data directory (only controlplane nodes)
- `system/overlays` - Talos internal path for `overlayfs` mounts (probably almost all can be migrated to `tmpfs` with small exceptions)
- `log` - pod logs, API server audit logs, etc.
- `lib/containerd` - CRI containerd state (container workspaces)
- `lib/kubelet` - kubelet state
- `lib/cni` - CNI state (???)
- `run` - various ephemeral things (should be in `tmpfs`)
Proposal
etcd
Make sure the `etcd` data directory is only accessible by `etcd` itself (and by Talos itself for the purposes of backup/restore). No other workload should ever be able to access the `etcd` data.
E.g. we could use `SELinux`, which will protect `etcd` from other workloads while it can also protect workloads from accessing `etcd`.
kubelet
Mostly the same thing as `etcd`: we should look into protecting the data directory from other workloads. As `kubelet` accesses a lot of arbitrary locations, it’s hard to prevent `kubelet` itself from accessing other directories.
Logs
We can look into making sure other workloads have read-only access to the logs, while `kubelet` (?) can write the logs.
run
Should we make this `tmpfs` (if not already)?
containerd
Not much we can offer, as workloads write to the container scratch space.
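One of the mistakes named in the problem statement is a `hostPath` that lands inside the `etcd` data directory. As a rough illustration of how such overlaps could be flagged, the Python sketch below checks a path against the directories listed under "What’s in /var?". The function names are hypothetical, this is not how Talos works today, and the comparison is lexical only (symlinks are not resolved).

```python
import posixpath

# System-owned directories listed under "What's in /var?" in this issue.
SYSTEM_AREAS = (
    "/var/lib/etcd",
    "/var/system/overlays",
    "/var/log",
    "/var/lib/containerd",
    "/var/lib/kubelet",
    "/var/lib/cni",
    "/var/run",
)


def _is_within(path, root):
    """Return True if path equals root or lies below it (lexical check)."""
    return path == root or path.startswith(root.rstrip("/") + "/")


def find_conflict(host_path):
    """Return the first system area a hostPath overlaps with, or None.

    Overlap means the hostPath is inside a system area, or is a parent
    directory that would expose one (e.g. /var/lib exposes /var/lib/etcd).
    Hypothetical helper for illustration only.
    """
    path = posixpath.normpath(host_path)
    for area in SYSTEM_AREAS:
        if _is_within(path, area) or _is_within(area, path):
            return area
    return None


if __name__ == "__main__":
    for candidate in ("/var/lib/etcd/member", "/var/lib", "/var/mnt/data"):
        print(candidate, "->", find_conflict(candidate))
```

A check like this only reports the overlap. Whether a given overlap is acceptable (for example read-only access to logs) would still be a policy decision.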
overlays
This is a Talos-specific location, and we shouldn’t allow random writes there (`overlayfs` upperdir, workdir). We should look into minimizing the overlays on `/var` (we could replace them with overlays on tmpfs when it makes sense).
/var/mnt
Introduce a new directory (naming TBD) which serves as a root mount point for: `hostPath` mounts, the `local-path-provisioner` default path, etc.
This path is mounted as `rshared` into the `kubelet` container, so that mounts both ways (from the kubelet to the host, from the host to the kubelet) are visible.
Users are supposed to use only this `hostPath` for such mounts.
Questions:
machine.files
We need to split it into the use cases for this feature:
- [...] `.machine.pods` better way), deprecated now
- [...] `/var/mnt` is mounted to the kubelet?)
- [...] `tmpfs` location (easier to prune with reboot)
- [...] `/var`)
In general, `machine.files` should work on top of the controller.
API to Remove File(s)
Should be restricted to work under `/var/mnt` only.
Benefits
- [...] `etcd`), and also making sure a service doesn’t have access to the data it shouldn’t have access to.
- [...] `/var/mnt`)
- [...] `machine.files`
- [...] `containerd` state gets corrupted on a single controlplane node, it can be pruned without touching `etcd` data.
- [...] `upgrade` we can prune `containerd` state and overlay mounts, but keep `/var/mnt` for hostPath persistence.
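To make the `/var/mnt` restriction for file removal and the `machine.files` reconciliation idea more concrete, here is a minimal Python sketch. It is not Talos code: `USER_ROOT`, `is_removable`, and `plan_removals` are hypothetical names, the path lists stand in for the paths of `machine.files` entries before and after a config change, and the check is purely lexical (a real implementation would also need to deal with symlinks).

```python
import posixpath

# Assumption: the user-owned root from the proposal (its name is still TBD).
USER_ROOT = "/var/mnt"


def is_removable(path):
    """Allow removal only strictly below USER_ROOT (lexical check only)."""
    if not path.startswith("/"):
        return False  # relative paths are rejected outright
    normalized = posixpath.normpath(path)  # collapses "." and ".." segments
    return normalized.startswith(USER_ROOT + "/")


def plan_removals(previous, desired):
    """Split paths dropped from the file list into removable and refused.

    previous and desired are iterables of absolute paths. A path that is in
    previous but not in desired is a removal candidate; it is only accepted
    when it lies below USER_ROOT.
    """
    dropped = sorted(set(previous) - set(desired))
    removable = [p for p in dropped if is_removable(p)]
    refused = [p for p in dropped if not is_removable(p)]
    return removable, refused


if __name__ == "__main__":
    before = ["/var/mnt/a.conf", "/var/lib/etcd/x", "/var/mnt/keep"]
    after = ["/var/mnt/keep"]
    removable, refused = plan_removals(before, after)
    print("remove:", removable)
    print("refuse:", refused)
```

With this split, dropping an entry that points outside `/var/mnt` is reported rather than acted on, which is one way to keep removal from ever touching `etcd` or `containerd` state.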