Mellvik / TLVC

Tiny Linux for Vintage Computers

More optimizations of the ps command #23

Closed Mellvik closed 10 months ago

Mellvik commented 10 months ago

The ps command has been unbearably slow on floppy based systems because of excessive disk reads.

The problem was partly fixed in the previous ps PR, which replaced getpwent's open and read of /etc/passwd once per line of output with a local getpwent version that keeps /etc/passwd open. Good, but still slow.

The other contributor to the bad ps performance was the ttyname lookup, which opened and scanned the entire /dev directory for every line of output. This version keeps the /dev directory open and also optimizes the search for tty devices in several ways.
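As an illustration of the idea only, here is a minimal sketch that avoids rescanning /dev for every output line. It is not the code from this PR: the PR keeps the directory open, whereas this sketch swaps in a simpler technique, scanning /dev once and caching the device numbers in memory. The names `cached_ttyname`, `load_tty_cache`, `MAX_TTYS` and the "name starts with tty" filter are all assumptions made for the example.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

#define MAX_TTYS     64   /* hypothetical cap on cached entries */
#define TTY_NAME_LEN 16   /* hypothetical max stored name length */

struct tty_entry {
    dev_t rdev;                 /* device number of the tty */
    char name[TTY_NAME_LEN];    /* name relative to /dev */
};

static struct tty_entry tty_cache[MAX_TTYS];
static int tty_count = -1;      /* -1 means /dev has not been scanned yet */

/* Scan /dev exactly once and remember every character device whose
 * name starts with "tty" (an assumed, cheap filter that skips the
 * stat() call, and thus an inode read, for unrelated entries). */
static void load_tty_cache(void)
{
    DIR *dp = opendir("/dev");
    struct dirent *d;
    struct stat st;
    char path[5 + 256];

    tty_count = 0;
    if (dp == NULL)
        return;
    while (tty_count < MAX_TTYS && (d = readdir(dp)) != NULL) {
        if (strncmp(d->d_name, "tty", 3) != 0)
            continue;
        snprintf(path, sizeof path, "/dev/%s", d->d_name);
        if (stat(path, &st) != 0 || !S_ISCHR(st.st_mode))
            continue;
        tty_cache[tty_count].rdev = st.st_rdev;
        snprintf(tty_cache[tty_count].name, TTY_NAME_LEN, "%s", d->d_name);
        tty_count++;
    }
    closedir(dp);
}

/* Map a device number to a tty name without touching the disk after
 * the first call. Returns "?" when no cached device matches. */
const char *cached_ttyname(dev_t rdev)
{
    int i;

    if (tty_count < 0)
        load_tty_cache();
    assert(tty_count >= 0 && tty_count <= MAX_TTYS);
    for (i = 0; i < tty_count; i++)
        if (tty_cache[i].rdev == rdev)
            return tty_cache[i].name;
    return "?";
}
```

With something like this, the per-process cost of the tty column drops to a short in-memory search, at the price of a small static table.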

ps performance on floppy based systems with limited memory (such as L1 buffers only) is now quite good.

Finally, while ps has always done some sanity checking of the kernel data read from memory, it has just exited silently when the data didn't make sense. This version adds more sanity checking and prints a message recommending ps be recompiled if the expected kernel signature isn't found. Courtesy @ghaerr from the ELKS project.
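A rough sketch of what such a check might look like follows. The structure layout, the `KSIG_EXPECTED` value, the `MAX_STATE` limit and the function name `check_task` are all hypothetical and chosen for illustration; the real kernel structures and signature are not shown in this thread.

```c
#include <assert.h>
#include <stdio.h>

#define KSIG_EXPECTED 0x4B53u   /* hypothetical signature value */
#define MAX_STATE     8         /* hypothetical highest valid task state */

/* Hypothetical snapshot of a task entry copied out of kernel memory. */
struct task_snapshot {
    unsigned short sig;   /* signature the kernel is assumed to store */
    int pid;
    int state;
};

/* Return 1 if the copied data looks like a valid task entry, else 0.
 * A signature mismatch usually means ps and the kernel disagree on the
 * structure layout, so tell the user instead of exiting silently. */
int check_task(const struct task_snapshot *t)
{
    assert(t != NULL);
    if (t->sig != KSIG_EXPECTED) {
        fprintf(stderr,
                "ps: kernel signature not found, ps may need to be recompiled\n");
        return 0;
    }
    if (t->pid < 0 || t->state < 0 || t->state > MAX_STATE)
        return 0;
    return 1;
}
```

The point of checking the signature first is that a mismatch makes every other field meaningless, so that case gets the explicit message.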

ghaerr commented 10 months ago

Glad to see this improved version you've been talking about - as it exposes other problems in the C library.

It is hard to see (and feel) these issues when you're running on QEMU all the time, as I am. I think I'm going to add an option to simulate real FD delays when running QEMU so that speed problems are more easily seen.

I now see that most of the getpwuid problems fixed here really should be fixed in the C library. It looks like getpwuid calls __getpwent directly, even though there's an open-cached version getpwent which keeps the file open using the old UNIX setpwent and endpwent functions - but they're not used in the library! The getpwuid function should use the open-cached getpwent and also do its own "last uid" caching like you've implemented.

The situation is worse with ls, which has the same problem twice over: it calls getpwuid and getgrgid for each long-listed file, and both are uncached. Each one opens its file (/etc/passwd or /etc/group) every time and starts from scratch! Talk about slow. I'll take a pass at rewriting all of them.

Have you found that ls -l seems pretty slow on FD?

ghaerr commented 10 months ago

The kernel buffer system should by rights be handling the problem of re-opening a file quickly, seeking and re-reading. So in some sense I'm a bit surprised that ps became so slow on floppy. However, the sizes of the directories and files matter.

I recently reduced the size of /etc/passwd to under 512 bytes (that matters for FAT, unlike MINIX's 1K block size), and /etc/group was already pretty small. The size of /dev on a MINIX 2880k floppy is 816 bytes. So all should be read and buffered. What's your take on why ps is slow?

I have also noticed on some occasions when the kernel is in a strange state that ps runs extremely slowly, even on QEMU. I've never found the reason for that.

Mellvik commented 10 months ago

Thanks, @ghaerr. I appreciate your interest in this. I also admit I went somewhat overboard with ps, as in 'how far can you stretch it'. Which means not all optimizations make noticeable differences.

That said, in a constrained setting, which I had, the new ps is great. Constrained meaning very limited buffer space, which was a natural given what we've been playing with these last weeks.

I believe the changes you're looking at for getpwent et al, library version, are very viable. BTW, ls -l has been slow but not unbearably so like ps.

I've been purposely avoiding 'plenty buffer' situations, which I suspect will void my ttyname optimizations - partly or completely. Actually, when running mostly off of floppy, booting off hd (or running qemu or xms buffers) is like a revelation....

Mellvik commented 10 months ago

> What's your take on why ps is slow?

I went back to my development notes on this to check if I missed something. I did. After the repeated reopens were eliminated, competition for buffers was a problem in two ways. One was performance, the other that the tool was interfering with the testing. So while at it, I went to town on eliminating whatever buffer accesses I could. The result, when running ps while file transfers to floppy were going on, was, well, extreme. And to me, as a tool, it has improved a lot.

> I have also noticed on some occasions when the kernel is in a strange state that ps runs extremely slowly, even on QEMU. I've never found the reason for that.

Yes, I've seen this one too and it never made any sense. It will be interesting to see if the optimized version does that. I do suspect, though, that it's just a QEMU peculiarity. I haven't seen it for a while, but then again I'm not using QEMU nearly as much as you do.

ls -l is different from ps in that it never feels or looks like stuttering off of a 300 baud line, like ps used to do. It takes a while (ls -l /bin on floppy, 4-8 seconds depending on the system), but unlike ps it makes sense. There's a lot to read and sort.