Open BlobCodes opened 1 month ago
Thanks for digging into this 🫶
I'm wishing we'd implement more of these system functions in Crystal. It's nice and easy to use libc functions, but they have a complexity as well. And there are many pitfalls for using them correctly, demonstrated by the multiple different bugs we have identified in these relatively small pieces of code (#14619, #14615 and this one).
Of course, rolling our own implementation also adds complexity and we need to ensure correctness. We have some prior art to help with that though. And overall the implementations could end up simpler and more efficient. Looking up some string in /etc/passwd
should be easy enough. And we only need to allocate memory for the field we're actually looking for.
Adding nscd might be a bit more complicated, although I'm wondering if that's even much useful anymore... We might just get away without it.
Adding nscd might be a bit more complicated, although I'm wondering if that's even much useful anymore... We might just get away without it.
I think /etc/passwd and /etc/groups will basically always fit into a single 4K memory page, so there's probably no difference to sending a lookup request to nscd over a unix socket (musl also only uses nscd as a fallback, so it's probably alright).
Manually implementing the file parsing also allows us to trivially implement methods like System::User#each
.
Bug Report
Follow-up to #14619. After creating the mentioned previous issue, I noticed that even after fixing the code, I couldn't fetch any users anymore. Every user lookup raised the following error:
getpwuid_r: Numerical result out of range (RuntimeError)
.It turns out that there is absolutely no limit on the maximum size of a user entry. Also, the GLibC implementation of
getpwuid_r()
seems to instantly returnERANGE
if any user entry is bigger than the provided buffer. Now, since crystal has a hard limit on the supplied buffer size (even though linux has none), every user lookup fails reliably.How to reproduce (glibc)
Simple fix
The simple fix is to remove the limit on the buffer size (GETPW_R_SIZE_MAX).
The GlibC implementation of
getpwuid
, which also only delegates togetpwuid_r
with a growing buffer, also has no limit on the buffer size. It just makes sure to return a proper error if the system is out of memory.Relevant code sections: https://github.com/bminor/glibc/blob/6d3b523eb54198d15af6e042054912cf9f5210de/nss/getpwuid.c#L21-L28 https://github.com/bminor/glibc/blob/6d3b523eb54198d15af6e042054912cf9f5210de/nss/getXXbyYY.c#L139-L159
Note that the repeatedly nested macros
INTERNAL (REENTRANT_NAME)
resolve to __getpwuid_r() in this example.Alternative
After looking at what GLibC does to implement this, I looked at what musl libc is doing.
Since the password file is just a normal file (and not a syscall or anything fancy), musl just traverses
/etc/passwd
usingfopen
and compares the searched uid/name sequentially (although it uses nscd as a fallback).Relevant code sections: https://github.com/bminor/musl/blob/007997299248b8682dcbb73595c53dfe86071c83/src/passwd/getpwent.c#L25-L30 https://github.com/bminor/musl/blob/007997299248b8682dcbb73595c53dfe86071c83/src/passwd/getpw_a.c#L30-L41