[RFE] Harden Flatcar Linux Defaults Further

krishjainx commented 1 year ago

Current situation

Flatcar Linux has sensible and robust default settings, which, when coupled with its minimalistic design, immutable filesystem, and automated update system, positions it as an appealing choice for various purposes while also offering ample potential for further hardening.

Impact

Flatcar could be further hardened without disrupting people's workloads. Below, I would like to share some of my thoughts that could serve as valuable points for discussion.

Ideal future situation

[ ] Harden default umasks
[ ] We can further harden the systemd services by default
[ ] Increase number of hashing rounds for passwords
[ ] Install a PAM module like passwdqc or cracklib
[ ] Include rngd
[ ] Harden permissions of home directories
[ ] Harden sysctls: dev.tty.ldisc_autoload, fs.protected_fifos, fs.protected_regular, fs.suid_dumpable, kernel.dmesg_restrict, kernel.kptr_restrict, kernel.modules_disabled, kernel.perf_event_paranoid, kernel.unprivileged_bpf_disabled, net.core.bpf_jit_harden, net.ipv4.conf.all.forwarding, net.ipv4.conf.all.log_martians, net.ipv4.conf.all.rp_filter, net.ipv4.conf.all.send_redirects, net.ipv4.conf.default.accept_redirects, net.ipv4.conf.default.log_martians, net.ipv6.conf.all.accept_redirects, net.ipv6.conf.default.accept_redirects
[ ] Harden SSH configuration
[ ] Setting up auditd on Flatcar currently is easy enough but we can consider enabling auditd by default.
[ ] TPM 2 and UEFI boot support (https://github.com/flatcar/Flatcar/issues/630) - working on this.

jepio commented 1 year ago

Are these suggestions based on trying to meet some security benchmark?

Can you add more specifics for the points with "harden" in their description? Harden how?

krishjainx commented 1 year ago

Certainly @jepio, I left most details out of the first comment for brevity.

Umask can be 027
Run systemd-analyze security (we only have a small number of default services running) and set the sandboxing options that are appropriate
Set home directory permissions to 750
For SSH configuration, set AllowTcpForwarding to no, ClientAliveCountMax to 2, LogLevel to VERBOSE, MaxAuthTries to 3, MaxSessions to 2, TCPKeepAlive to no, and AllowAgentForwarding to no
Sysctls
1. dev.tty.ldisc_autoload=0
2. fs.protected_fifos=2
3. fs.protected_regular=2
4. fs.suid_dumpable=0
5. kernel.dmesg_restrict=1
6. kernel.kptr_restrict=2
7. kernel.modules_disabled=1
8. kernel.perf_event_paranoid=3
9. kernel.unprivileged_bpf_disabled=1
10. net.core.bpf_jit_harden=2
11. net.ipv4.conf.all.forwarding=0
12. net.ipv4.conf.all.log_martians=1
13. net.ipv4.conf.all.rp_filter=1
14. net.ipv4.conf.all.send_redirects=0
15. net.ipv4.conf.default.accept_redirects=0
16. net.ipv4.conf.default.log_martians=1
17. net.ipv6.conf.all.accept_redirects=0
18. net.ipv6.conf.default.accept_redirects=0
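
To see how far a running node is from the values above, something like the following could be used. It is only a rough sketch: the function names are made up for illustration, it assumes the usual mapping of a dotted sysctl key to a file under /proc/sys, and the table is copied verbatim from the list above rather than taken from any benchmark.

```python
from pathlib import Path

# Proposed values, copied verbatim from the numbered list above.
PROPOSED = {
    "dev.tty.ldisc_autoload": "0",
    "fs.protected_fifos": "2",
    "fs.protected_regular": "2",
    "fs.suid_dumpable": "0",
    "kernel.dmesg_restrict": "1",
    "kernel.kptr_restrict": "2",
    "kernel.modules_disabled": "1",
    "kernel.perf_event_paranoid": "3",
    "kernel.unprivileged_bpf_disabled": "1",
    "net.core.bpf_jit_harden": "2",
    "net.ipv4.conf.all.forwarding": "0",
    "net.ipv4.conf.all.log_martians": "1",
    "net.ipv4.conf.all.rp_filter": "1",
    "net.ipv4.conf.all.send_redirects": "0",
    "net.ipv4.conf.default.accept_redirects": "0",
    "net.ipv4.conf.default.log_martians": "1",
    "net.ipv6.conf.all.accept_redirects": "0",
    "net.ipv6.conf.default.accept_redirects": "0",
}


def sysctl_path(key, root="/proc/sys"):
    """Map a dotted sysctl key to its file under root (hypothetical helper)."""
    return Path(root) / key.replace(".", "/")


def read_sysctl(key, root="/proc/sys"):
    """Return the current value as a string, or None if it cannot be read."""
    try:
        return sysctl_path(key, root).read_text().strip()
    except OSError:
        return None


def diff_sysctls(proposed, root="/proc/sys"):
    """Return {key: (current, wanted)} for every key whose value differs."""
    result = {}
    for key, wanted in proposed.items():
        current = read_sysctl(key, root)
        if current != wanted:
            result[key] = (current, wanted)
    return result


if __name__ == "__main__":
    for key, (current, wanted) in sorted(diff_sysctls(PROPOSED).items()):
        print(f"{key}: current={current} proposed={wanted}")
```

The script only reads values and changes nothing. A current value of None means the key is missing or unreadable on that kernel.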

jepio commented 1 year ago

Which use case would these options serve?

Some of these do not sound reasonable for a server+container OS. For example, the SSH suggestions and at least the following sysctl suggestions would break container use cases:

kernel.modules_disabled=1
net.ipv4.conf.all.forwarding=0

The other suggestions might not be useful for the general population (rp_filter/redirects/logging), or might restrict rootless containers (which we're interested in supporting and improving).

umask/homedir/systemd-analyze sound like they could be OK.
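
For reference, the effect of a 027 umask can be worked out in a few lines. This is a small illustration, assuming the common case where programs request mode 666 for new files and 777 for new directories; `effective_mode` is a made-up helper name.

```python
import stat


def effective_mode(requested, umask):
    """Permission bits that remain after the umask bits are cleared."""
    return requested & ~umask


UMASK = 0o027

file_mode = effective_mode(0o666, UMASK)  # typical request for a new file
dir_mode = effective_mode(0o777, UMASK)   # typical request for a new directory

print(oct(file_mode), stat.filemode(stat.S_IFREG | file_mode))  # 0o640 -rw-r-----
print(oct(dir_mode), stat.filemode(stat.S_IFDIR | dir_mode))    # 0o750 drwxr-x---
```

With this umask, new directories come out as 750, which is the same mode proposed for home directories.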

krishjainx commented 1 year ago

Oops, didn't mean to write net.ipv4.conf.all.forwarding=0. For kernel.modules_disabled I'm unsure why that would break container use cases. That option just makes it so that modules are not allowed to be loaded. Is loading a kernel module that common of a use case?
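
One way to get data on that question would be to compare the loaded modules before and after starting a typical workload. The sketch below is only a suggestion: the helper names are invented, it assumes the usual /proc/modules layout where the module name is the first field of each line, and it says nothing about which modules a given container runtime actually needs. Note that once kernel.modules_disabled is set to 1 it cannot be set back to 0 without a reboot, so anything not loaded by then stays unavailable.

```python
from pathlib import Path


def loaded_modules(text):
    """Return the sorted module names from /proc/modules-style text."""
    names = set()
    for line in text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # first field is the module name
    return sorted(names)


def newly_loaded(before_text, after_text):
    """Modules present in the second snapshot but not in the first."""
    return sorted(set(loaded_modules(after_text)) - set(loaded_modules(before_text)))


def modules_locked(path="/proc/sys/kernel/modules_disabled"):
    """True if module loading is disabled, None if the file is unreadable."""
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        return None


if __name__ == "__main__":
    print("modules_disabled:", modules_locked())
    try:
        print("\n".join(loaded_modules(Path("/proc/modules").read_text())))
    except OSError:
        print("/proc/modules is not readable on this system")
```

Taking one snapshot of /proc/modules on a freshly booted node and another after the workload is up, then passing both to `newly_loaded`, would show what was loaded on demand.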

t-lo commented 1 year ago

I think issues like this one - and PRs like https://github.com/flatcar/scripts/pull/933 - would benefit from an extended rationale and an elaborate threat model. While each setting's individual use could be discussed in depth, I believe it's much more worthwhile to understand what we are defending against. External attackers? Network DDoS? A system service being exploited (which one?)? Malicious workloads in containers or on the host? Something else? Each of these needs its own specific defence-in-depth configuration, which will come at great cost, as it will put massive restrictions on the "legal" workloads which are supposed to be running. Implementing several of these at once will render a node next to unusable for generic tasks and will only allow highly tuned and specialised workloads, putting a significant operational cost on the user (not to mention killing live clusters of users who use Flatcar's automated updates).

Both this issue and the PR raised above propose intrusive changes that will break existing workloads and impact widely used tools like Kubernetes / container networking managers like cilium and calico, service meshes like linkerd and istio, monitoring / introspection systems like falco and inspector gadget, and other eBPF-based tools - to name a few. We would also break systems that run unusual / specialised hardware if we restrict loading of kernel module drivers and firmware.
