SELinuxProject / selinux

This is the upstream repository for the Security Enhanced Linux (SELinux) userland libraries and tools. The software provided by this project complements the SELinux features integrated into the Linux kernel and is used by Linux distributions. All bugs and patches should be submitted to selinux@vger.kernel.org

libselinux: reclaim memory after calling selinux_init_load_policy #327

Open HuaxinLu opened 2 years ago

HuaxinLu commented 2 years ago

The init process, systemd, calls selinux_init_load_policy() to load the policy at system startup. In the special case where the maximum policy version supported by the kernel is lower than the version of the installed policy, selinux_mkload_policy() downgrades the policy.

The downgrade performs a large number of memory allocations and frees. Because of how the allocator manages memory, part of the physical memory cannot be reclaimed afterwards, even though it has been freed. Since the systemd process never exits, the system's available memory stays reduced.

I suggest calling malloc_trim after selinux_init_load_policy to force the freed memory to be reclaimed.
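The effect described above can be reproduced outside of systemd. The following is a minimal sketch, assuming glibc (malloc_trim is a glibc extension) and a Linux /proc filesystem. The helper names read_vmrss_kb, churn and trim_and_measure are hypothetical and are not part of libselinux. The churn only imitates the allocation pattern of a policy downgrade: one block is left allocated so the freed memory below it cannot be returned by shrinking the heap top.

```c
#include <malloc.h>   /* malloc_trim() is a glibc extension */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: resident set size of this process in kB, or -1. */
static long read_vmrss_kb(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long kb = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (sscanf(line, "VmRSS: %ld kB", &kb) == 1)
            break;
    }
    fclose(f);
    return kb;
}

/*
 * Hypothetical helper: allocate and touch `count` blocks of `size` bytes,
 * then free all but the last one. The surviving block sits near the top of
 * the heap, so the freed blocks below it stay resident until trimmed.
 * Returns the surviving block (caller frees it), or NULL on failure.
 */
static void *churn(size_t count, size_t size)
{
    void **blocks;
    void *pin;
    size_t i, n;

    if (count == 0)
        return NULL;
    blocks = calloc(count, sizeof(*blocks));
    if (!blocks)
        return NULL;

    for (n = 0; n < count; n++) {
        blocks[n] = malloc(size);
        if (!blocks[n])
            break;
        memset(blocks[n], 0xA5, size);  /* touch pages so they count in RSS */
    }
    if (n == 0) {
        free(blocks);
        return NULL;
    }

    pin = blocks[n - 1];
    for (i = 0; i + 1 < n; i++)
        free(blocks[i]);
    free(blocks);
    return pin;
}

/* Hypothetical helper: record VmRSS around malloc_trim(0), return its result. */
static int trim_and_measure(long *before_kb, long *after_kb)
{
    int released;

    *before_kb = read_vmrss_kb();
    released = malloc_trim(0);  /* 1 if memory was released, 0 otherwise */
    *after_kb = read_vmrss_kb();
    return released;
}
```

Calling churn(65536, 1024) followed by trim_and_measure() and printing the two values should show whether the allocator on a given system holds on to freed pages. The exact numbers depend on the glibc version and its tunables.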

For example, I tested with a 4.19 kernel and the 3.1 SELinux packages:

# cat /sys/fs/selinux/policyvers 
31
# ls /etc/selinux/targeted/policy/
policy.32

The memory usage of systemd is large:

# cat /proc/1/status | grep VmRSS
VmRSS:     94568 kB

After patching the code as follows:

    /* Load the policy. */
-   return selinux_mkload_policy(0);
+   rc = selinux_mkload_policy(0);
+   malloc_trim(0);
+   return rc;

The memory usage of systemd decreases:

# cat /proc/1/status | grep VmRSS
VmRSS:     17160 kB

cgzones commented 2 years ago

@poettering could you comment whether this is a sane thing to do in PID 1 at boot time?

poettering commented 2 years ago

Sure, we can do that. But I am not sure I fully understand its effect. Returning the memory to the kernel might slow things down for us if we end up needing it for something else later. It appears to me that we should only call this once PID 1 initialization is complete and we are idle (i.e. from an sd_event_add_defer() handler), and probably independently of the SELinux code, i.e. do this always. And someone needs to profile how much this actually does IRL.
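The "trim only once we are idle" idea can be sketched without libsystemd. The example below is a stand-in and does not show how sd_event_add_defer() works: it uses plain poll(2) on a single file descriptor and treats a timeout as "idle". The helper name trim_when_idle and the idle_ms parameter are hypothetical, and glibc is assumed for malloc_trim.

```c
#include <errno.h>
#include <malloc.h>   /* malloc_trim() is a glibc extension */
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep consuming input from `fd` while it is busy.
 * Once nothing arrives for `idle_ms` milliseconds (or the peer closes),
 * call malloc_trim(0) exactly once and return its result (0 or 1).
 * Returns -1 on a poll() or read() error.
 */
static int trim_when_idle(int fd, int idle_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    char buf[256];

    for (;;) {
        int n = poll(&pfd, 1, idle_ms);

        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal, retry */
            return -1;
        }
        if (n == 0)
            return malloc_trim(0); /* timeout: nothing pending, we are idle */

        /* Work is pending: a real loop would dispatch it; here we discard it. */
        ssize_t r = read(fd, buf, sizeof(buf));
        if (r < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (r == 0)
            return malloc_trim(0); /* EOF: no more work will arrive */
    }
}
```

Trimming at this point rather than directly after the policy load keeps the freed memory available to the allocator during the busy startup phase, which is the slowdown concern raised above. Whether that matters in practice is exactly what the requested profiling would have to show.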