AlternC / alternc-nss

OOM looking up for users/groups with libnss_systemd.so #10

Open tobald opened 8 months ago

tobald commented 8 months ago

Hash comments are not allowed in libnss definition files (passwd/group/shadow). The current implementation of alternc-nss wrongly adds hash comments to those files. We spotted the issue after migrating a server (16 GB of memory) to Debian bullseye: multiple applications then hit memory limits while several gigabytes of memory were still free, for instance:

$ addgroup test_xxx
Out of memory!

Attached are a system call log, which allowed us to pinpoint the issue, and a naive patch.

Note that other bullseye machines with more memory (32 GB and up) do not hit the OOM error even though the bug is present; I suspect the libnss-extrausers definitions are ignored in that case.

adduser-strace.log 0001-prevent-user-group-lookups-errors-with-libnss_system.patch.log
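
For illustration only, here is a minimal sketch of the idea (it is not the attached patch): keep real entries and drop `#` comment lines and blank lines before an extrausers file is written. The function name `strip_hash_comments` and the sample entries are hypothetical.

```php
<?php
// Hypothetical helper, not part of alternc-nss: keep only real
// passwd/group/shadow entries, dropping hash comments and blank lines.
function strip_hash_comments(array $lines): array
{
    $entries = [];
    foreach ($lines as $line) {
        $entry = rtrim($line, "\r\n");
        $probe = ltrim($entry);
        // Skip empty lines and comments, including "space before #" variants.
        if ($probe === '' || $probe[0] === '#') {
            continue;
        }
        $entries[] = $entry;
    }
    return $entries;
}

// Made-up sample content for the example.
$lines = [
    "# generated comment\n",
    "test_xxx:x:2001:\n",
    "\n",
    " # indented comment\n",
    "other_yyy:x:2002:\n",
];
$clean = strip_hash_comments($lines);
```

The result can then be checked with a validation tool such as pwck before it is put in place.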

camlafit commented 8 months ago

Hello

Very glad you have found a solution. We had noticed this behavior but had no time to investigate.

Looking at the NSS documentation, a workaround could be to add a space before the #. Could you try it?

As we can't know whether extrausers is used for another use case, we need to identify the AlternC-specific block.

camlafit commented 8 months ago

I've checked the spacing workaround: the behavior could differ between musl and libc.

In both cases pwck doesn't accept these lines. I've opened a branch to solve this issue and backported your patch as a starting point.

camlafit commented 8 months ago

Hello

I have PR #11 with a less naive approach. Before writing any AlternC-related values, we remove the lines written previously; that way we preserve every other row and append only AlternC-specific content.

tobald commented 8 months ago

Thanks for the quick response. There is one issue left: the last line of passwd/group/shadow is invalid because it is not terminated (it is missing an end-of-line).

tobald commented 8 months ago

Line 98 in write_content is dubious; it won't work for passwd or shadow:

    $content_bck = implode("\n", $this->group);

camlafit commented 8 months ago

there is one issue left : the last line of passwd/group/shadow is invalid because it is not terminated/misses an end-of-line.

I've tried after purging the backup and extrausers files, and pwck doesn't raise any error. Do you still have this problem? We can add a safeguard by terminating the file with a newline.

tobald commented 8 months ago

Do you have again this problem ?

Thanks, without line termination borgbackup raises MemoryError.

camlafit commented 8 months ago

Do you have again this problem ?

Thanks, without line termination borgbackup raises MemoryError.

That looks strange. I don't see why a backup tool would raise an error about file termination. In any case, the trailing newline is now forced :)

tobald commented 8 months ago

borgbackup looks up permissions; see the POSIX standard definition of a line: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_206
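
As a minimal sketch of what that definition implies for the generated files (an assumption about how the content could be serialized, not the code in the PR): every entry, including the last one, gets a terminating newline. `render_entries` is a hypothetical name.

```php
<?php
// Hypothetical helper: serialize entries so that every line, including
// the last one, ends with "\n".
function render_entries(array $entries): string
{
    if ($entries === []) {
        return '';
    }
    // implode() only puts "\n" between entries, so the final one is added here.
    return implode("\n", $entries) . "\n";
}

// Made-up sample entries for the example.
$content = render_entries(['test_xxx:x:2001:', 'other_yyy:x:2002:']);
```

A bare `implode("\n", ...)` leaves the last entry unterminated, which is the case described above.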

camlafit commented 8 months ago

A new day, a new thing learned :) I think it's solved by https://github.com/AlternC/alternc-nss/pull/11/commits/993ecfb70d43042f1db7f15c31c52b7738586a65

tobald commented 8 months ago

I fear your write_content() could lead to corruption: if manual changes occur, duplicate entries would then be inserted:

        if (file_exists($file_bck)) {
            $content_lines_bck = file($file_bck, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
            $content_lines = array_diff($content_lines, $content_lines_bck);
        }
        $content_lines = array_merge($content_lines, $content_new);

I think we should keep it short and simple and not try to manage things: just document that the extrausers files are overwritten when you install alternc-nss.

(Side note: there are validation tools like pwck.)
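
To make the concern concrete, here is a small sketch with made-up data. It assumes an entry written by AlternC was later edited by hand, so it no longer matches the backup line exactly. The variable names follow the excerpt above; the duplicate check at the end is my own addition.

```php
<?php
// Made-up data: 'site1' was written by AlternC, then edited manually.
$content_lines     = ['manual:x:3000:', 'site1:x:2001:admin']; // current file
$content_lines_bck = ['site1:x:2001:'];                        // previous AlternC output
$content_new       = ['site1:x:2001:'];                        // new AlternC output

// Same two steps as the quoted excerpt.
$content_lines = array_diff($content_lines, $content_lines_bck);
$content_lines = array_merge($content_lines, $content_new);

// Collect names (first colon-separated field) that appear more than once.
$names = array_map(fn($line) => explode(':', $line)[0], $content_lines);
$duplicates = array_keys(
    array_filter(array_count_values($names), fn($count) => $count > 1)
);
```

Since array_diff compares whole lines, the edited `site1` line survives and the fresh one is appended next to it, so `$duplicates` contains `site1`.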

camlafit commented 8 months ago

Hello

I don't want to presume any use case, so we need to check the file and manage only the AlternC part. We must be as unintrusive as possible.

array_diff is there to exclude the AlternC part, based on the previous backup, before updating it. We could add a flock check to prevent simultaneous updates, but in this case I don't think it is useful.

My mantra is that we must indeed keep it simple (KISS) and also be as unintrusive as possible :)
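
For reference, the flock check mentioned above could look roughly like this. It is only a sketch under my own assumptions, not code from the PR; `write_locked` is a hypothetical name and the path is a temporary file.

```php
<?php
// Hypothetical helper: replace a file's content while holding an
// exclusive advisory lock.
function write_locked(string $path, string $content): bool
{
    // Mode 'c' opens for writing without truncating, so the file is
    // emptied only after the lock has been acquired.
    $handle = fopen($path, 'c');
    if ($handle === false) {
        return false;
    }
    if (!flock($handle, LOCK_EX)) {
        fclose($handle);
        return false;
    }
    $ok = ftruncate($handle, 0)
        && fwrite($handle, $content) === strlen($content)
        && fflush($handle);
    flock($handle, LOCK_UN);
    fclose($handle);
    return $ok;
}

// Example run against a temporary file rather than a real extrausers file.
$path = tempnam(sys_get_temp_dir(), 'nss');
$written = write_locked($path, "test_xxx:x:2001:\n");
```

The lock is advisory, so it only protects against other writers that also call flock on the same file.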