@ghaerr, a code/change review would be very valuable. I just noticed a bug which splits the 'big' free block - fix is coming. Regardless, the code illustrates the idea we talked about pretty well. The output is from QEMU.
meminfo now looks like this (reasonably fresh boot, process IDs around 50):
HEAP TYPE SIZE SEG TYPE SIZE CNT
a15e BUFH 10240
c96a BUFH 480
cb56 TTY 1024
cf62 TTY 80
cfbe NETW 1536
d5ca free 4632
e7ee TTY 80
e84a TTY 80
e8a6 NETW 1536
eeb2 NETW 1536
f4be NETW 1536
faca free 718
fda4 SEG 16 5b66 free 27840 0
fdc0 SEG 16 65c7 free 3248 0
fddc SEG 16 6232 DSEG 14672 1
fdf8 SEG 16 52b4 CSEG 4192 1
fe14 SEG 16 4f47 CSEG 14032 1
fe30 SEG 16 53ba free 1328 0
fe4c SEG 16 2daf CSEG 46048 2
fe68 SEG 16 6692 DSEG 57824 1
fe84 SEG 16 74b0 free 176384 0
fea0 SEG 16 47bb free 25232 0
febc SEG 16 4192 DSEG 25232 1
fed8 SEG 16 540d CSEG 18032 1
fef4 SEG 16 5874 DSEG 12064 1
ff10 SEG 16 3f16 free 10176 0
ff2c SEG 16 4de4 DSEG 5680 1
ff48 SEG 16 38ed DSEG 25232 1
ff64 SEG 16 2caf CSEG 4096 1
ff80 SEG 16 14f7 free 304 0
ff9c SEG 16 133e CSEG 7056 1
ffb8 SEG 16 2540 BUF 24576 1
ffd4 SEG 16 2b40 DSEG 5872 1
fff0 SEG 16 250a free 864 0
Heap/free 24238/ 5350 Total mem 509984
Memory usage 498KB total, 258KB used, 240KB free
Correction: The allocation referred to above as a bug is actually correct behaviour. This is the way the heap alloc algorithm works. It will satisfy requests from the smallest free block that will hold the requested block.
In this case the block freed by net stop, 4 packet buffers = 6180 bytes, is larger than the free block at the end of the heap (in this case 5530), and the latter will be used till it's exhausted.
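For illustration, here is a minimal best-fit sketch in C (made-up names such as best_fit and struct block - this is not the actual kernel heap code): best fit walks the whole list and returns the smallest free block that can still hold the request, which is why the 5530-byte block at the end of the heap keeps getting picked before the larger block freed by net stop.

```c
/* Sketch only, hypothetical names - not the real heap allocator. */
struct block {
    struct block *next;     /* address-ordered block list */
    unsigned int size;      /* usable bytes following the header */
    int free;
};

static struct block *heap_head;     /* start of the heap block list */

struct block *best_fit(unsigned int size)
{
    struct block *b, *best = 0;

    for (b = heap_head; b; b = b->next) {
        if (b->free && b->size >= size && (!best || b->size < best->size))
            best = b;       /* smallest free block that still fits */
    }
    return best;            /* 0 if no free block is large enough */
}
```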
Regardless of the outcome of this experiment, this has been a very educational dive into the workings of the kernel heap management system.
I haven't had time to jump into the details of your code, but I think I understand what you're doing - allocating from the high address end of any free block which is split to create a new used block. Potentially pretty cool idea. I see the following as issues:
Thank you, @ghaerr.
I haven't had time to jump into the details of your code, but I think I understand what you're doing - allocating from the high address end of any free block which is split to create a new used block. Potentially pretty cool idea. I see the following as issues:
it's not from any block, it's always from the same block.
- Since the idea is based on splitting the best-fit free block from the high, rather than low address, and we want to keep the SEG allocations at high addresses, perhaps the "best fit" strategy should be ignored for SEG allocations, and always allocate from the highest free block?
Yes, that's exactly what's happening. The seg descriptor allocation is contiguous from the top. Seems to be working well (as in the example).
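Roughly, the split looks like this (again just a sketch with made-up names, not the real heap code): the chosen free block keeps its low-address part, and the new block is carved off its high-address end, so successive SEG descriptors stack contiguously downward from the top of the heap.

```c
/* Sketch only, hypothetical names - not the real heap allocator. */
struct block {
    struct block *next;     /* address-ordered block list */
    unsigned int size;      /* usable bytes following the header */
    int free;
};

/* Carve 'size' bytes off the TOP of free block 'b'; the remainder
 * stays where it is, at the low addresses, as a smaller free block. */
struct block *split_high(struct block *b, unsigned int size)
{
    struct block *hi;

    if (!b->free || b->size < size + sizeof(struct block))
        return 0;           /* too small to split off the top */

    /* shrink the free block in place */
    b->size -= size + sizeof(struct block);

    /* the new block starts right after the shrunken free area */
    hi = (struct block *)((char *)(b + 1) + b->size);
    hi->size = size;
    hi->free = 0;
    hi->next = b->next;     /* keep the list address-ordered */
    b->next = hi;
    return hi;
}
```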
- The true test will be running this algorithm for quite some time, with lots of programs being started and stopped, thus allocating and freeing lots of SEGs. A potential problem could be that the SEG descriptors end up moving down through various lower address free segments, thus negating the idea of keeping the SEG descriptors in their own arena. (Having their own arena was my suggestion, but that would require allocating a fixed-size arena for just small or SEG descriptors. I'm really not sure what might work better, definitely needs testing).
Yes, that's the litmus test. As implemented, that (i.e. moving down) won't happen; instead the system will fail if the highest free block runs out of space. That is not likely to happen - if it does, the system is so short on heap space it may not work at all. I need to bomb the system with inbound telnet and see what happens.
it's not from any block, it's always from the same block.
Yes, that's exactly what's happening. The seg descriptor allocation is contiguous from the top.
I see now. I was thinking, without having looked much at your changes, that the algorithm always searched for the highest-addressed block and allocated from the highest address within that block. Instead, you're saying, the highest-addressed block is set once, and then allocations are made from the highest address downwards only in that block. I'm not sure there's much difference between the two, except for the case where the highest free block could run out of space, as you mention.
the system will fail if the highest free block runs out of space. That is not likely to happen - if it does, the system is so short on heap space it may not work at all.
Is that true? What happens if a best fit match for a long-term resource happens to allocate from the high-heap block, and then sits there, while there is other free heap space available? In that case, system failure could occur with plenty of free space.
I need to bomb the system with inbound telnet and see what happens.
I'm definitely interested in what you learn. How/when does inbound telnet allocate kernel heap - aren't the NETW and SER buffers already allocated? Or are you talking about the rapid-fire fork()s occurring that allocate lots of SEGs (only)? Perhaps the system needs to be handling inbound telnet while logging in and out of the serial port simultaneously, which should cause SER buffers to be allocated concurrently with SEGs.
I'll try to think up more ways to best measure and/or test fragmentation. Agreed that if/when the system is running as tightly (meaning with very little free heap) as your system(s) are, it's not clear any algorithm will work well. I suppose, though, it's not a "small" allocation failure to be worried about, but rather fragmentation such that a larger (1536 NETW or 1024 SER) allocation fails, with the result that the network or serial port driver open fails.
it's not from any block, it's always from the same block.
Yes, that's exactly what's happening. The seg descriptor allocation is contiguous from the top.
I see now. I was thinking, without having looked much at your changes, that the algorithm always searched for the highest-addressed block and allocated from the highest address within that block. Instead, you're saying, the highest-addressed block is set once, and then allocations are made from the highest address downwards only in that block. I'm not sure there's much difference between the two, except for the case where the highest free block could run out of space, as you mention.
The goal is to keep the SEG descriptors together, which requires keeping track of the start of the highest (addr) free block (which at boot time is the entire heap, btw). The key thing is that the logic is very simple.
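Building on the split_high sketch above (still hypothetical names, not the real code): the highest free block is recorded once - at boot it is the whole heap - and SEG descriptor requests are carved from its top, so there is no per-allocation search.

```c
/* Continues the sketch above; reuses struct block and split_high. */
static struct block *high_free;     /* highest-addressed free block */

void heap_init_sketch(struct block *whole_heap)
{
    high_free = whole_heap;         /* at boot: one big free block */
}

struct block *seg_alloc_sketch(unsigned int size)
{
    /* carve from the top of the remembered block; a 0 return here
     * would mean the high block has run out of space */
    return split_high(high_free, size);
}
```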
the system will fail if the highest free block runs out of space. That is not likely to happen - if it does, the system is so short on heap space it may not work at all.
Is that true? What happens if a best fit match for a long-term resource happens to allocate from the high-heap block, and then sits there, while there is other free heap space available? In that case, system failure could occur with plenty of free space.
It's important to keep in mind that this has always been the case: tight on heap, something is going to break. The question is whether the change improves or aggravates the situation. As available heap space decreases, and the size of the high free block decreases, there will likely be open slots in the 'SEG section' for reuse by fork. It's extremely unlikely that there will ever be a situation where there is plenty of free space and none in the high free block - although it's not impossible. Remember that the SEG allocations are now the smallest in the system; the next size up is 80 (pipes), so allocations in tight situations will come from somewhere else - or fail - while there is still SEG space available. Again, practical use is the only reliable judge. What I've seen so far is very promising.
I need to bomb the system with inbound telnet and see what happens.
I'm definitely interested in what you learn. How/when does inbound telnet allocate kernel heap - aren't the NETW and SER buffers already allocated? Or are you talking about the rapid-fire fork()s occurring that allocate lots of SEGs (only)? Perhaps the system needs to be handling inbound telnet while logging in and out of the serial port simultaneously, which should cause SER buffers to be allocated concurrently with SEGs.
Allocation of L1 buffers is very useful to create a tight environment. Telnet - in and out - is very useful thanks to the ptys. Starting more gettys too (then killing them); I have 4 serial ports on one of the machines.
I'll try to think up more ways to best measure and/or test fragmentation. Agreed that if/when the system is running as tightly (meaning with very little free heap) as your system(s) are, it's not clear any algorithm will work well. I suppose, though, it's not a "small" allocation failure to be worried about, but rather fragmentation such that a larger (1536 NETW or 1024 SER) allocation fails, with the result that the network or serial port driver open fails.
Agreed.
This experimental change allocates SEG descriptors from the high end of the kernel heap in order to reduce heap fragmentation. The changes to the heap code are minimal (the printks are kept in for now) and more testing is needed for robustness. Even more importantly, it has to be determined whether the change makes any real difference. An update will make the addition conditional.