Closed Mellvik closed 10 months ago
Glad to see this improved version you've been talking about - as it exposes other problems in the C library.
It is hard to see (and feel) these issues when you're running on QEMU all the time, as I am. I think I'm going to add an option to simulate real FD delays when running QEMU so that speed problems are more easily seen.
I now see that most of the `getpwuid` problems fixed here really should be fixed in the C library. It looks like `getpwuid` calls `__getpwent` directly, even though there's an open-cached version, `getpwent`, which keeps the file open using the old UNIX `setpwent` and `endpwent` functions - but they're not used in the library! The `getpwuid` function should use the open-cached `getpwent` and also do its own "last uid" caching like you've implemented.
The situation is worse with `ls`: it calls `getpwuid` and `getgrgid` for each long-listed file, and both are uncached, so each call reopens its file (`/etc/passwd` or `/etc/group`) and starts from scratch! Talk about slow. I'll take a pass at rewriting all of them.
Have you found that `ls -l` seems pretty slow on FD?
The kernel buffer system should by rights be handling the problem of re-opening a file quickly, seeking and re-reading. So in some sense I'm a bit surprised that `ps` became so slow on floppy. However, the size of the directories and files matters.
I recently reduced the size of `/etc/passwd` to under 512 bytes (that matters for FAT, unlike MINIX's 1K block size), and `/etc/group` was already pretty small. The size of `/dev` on a MINIX 2880k floppy is 816 bytes. So all should be read and buffered. What's your take on why `ps` is slow?
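As a quick check of the sizes above, the number of blocks a file occupies is a ceiling division. The sketch below assumes a 512-byte unit for FAT and a 1K block for MINIX, as mentioned above; the function name `blocks_needed` is made up for illustration.

```c
#include <stddef.h>

/* Number of blocks of block_size bytes needed to hold file_size bytes
 * (ceiling division). */
static size_t blocks_needed(size_t file_size, size_t block_size)
{
    return (file_size + block_size - 1) / block_size;
}
```

With these assumptions, `blocks_needed(816, 1024)` is 1, so the 816-byte `/dev` directory fits in a single MINIX block, and a file of at most 512 bytes fits in a single 512-byte unit.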
I have also noticed on some occasions when the kernel is in a strange state that `ps` runs extremely slowly, even on QEMU. I've never found the reason for that.
Thanks, @ghaerr. I appreciate your interest in this. I also admit I went somewhat overboard with `ps`, as in 'how far can you stretch it'. Which means not all optimizations make noticeable differences.
That said, in a constrained setting, which I had, the new `ps` is great. Constrained meaning very limited buffer space, which was natural given what we've been playing with these last weeks.
I believe the changes you're looking at for `getpwent` et al, library version, are very viable. BTW, `ls -l` has been slow but not unbearably so like `ps`.
I've been purposely avoiding 'plenty buffer' situations, which I suspect will void my ttyname optimizations - partly or completely. Actually, when running mostly off of floppy, booting off hd (or running qemu or xms buffers) is like a revelation....
> What's your take on why `ps` is slow?
I went back to my development notes on this to check if I missed something. I did. After the repeated reopens were eliminated, competition for buffers was a problem in two ways. One was performance, the other that the tool was interfering with the testing. So while at it, I went to town on eliminating whatever buffer accesses I could. The result, when running `ps` while file transfers to floppy were going on, was - well, extreme. And to me, as a tool, it has improved a lot.
> I have also noticed on some occasions when the kernel is in a strange state that `ps` runs extremely slowly, even on QEMU. I've never found the reason for that.
Yes, I've seen this one too and it never made any sense. It will be interesting to see if the optimized version does that. I do suspect, though, that it's just a QEMU peculiarity. I haven't seen it for a while, but then again I'm not using QEMU nearly as much as you do.
`ls -l` is different from `ps` as it never feels or looks like stuttering off of a 300 baud line, like `ps` used to do. It takes a while (`ls -l /bin` on floppy, 4-8 seconds depending on the system), but unlike `ps` it makes sense. There's a lot to read and sort.
The `ps` command has been unbearably slow on floppy based systems because of excessive disk reads.
The problem was partly fixed in the previous `ps` PR by replacing the `getpwent` open and read of the `/etc/passwd` file once per line of output with a local `getpwent`-version that kept `/etc/passwd` open. Good but still slow.
The other contributor to the bad `ps` performance was the ttyname lookup, which opened and scanned the entire `/dev` directory for every line of output. This version keeps the `/dev` directory open and also optimizes the search for tty devices in several ways.
`ps` performance on floppy based systems with limited memory (such as L1 buffers only) is now quite good.
Finally, while `ps` has always done some sanity checking of the kernel data read from memory, it has just exited silently when the data didn't make sense. This version adds more sanity checking and prints a message recommending `ps` be recompiled if the expected kernel signature isn't found. Courtesy @ghaerr from the ELKS project.