svenbieg / Heap

Cluster-based memory-manager written in C
4 stars 0 forks source link

Example code and clarification #1

Closed jnorre closed 7 months ago

jnorre commented 7 months ago

Hi Sven,

I saw your comment on the FreeRTOS forum, topic "What could FreeRTOS do next? Share your Ideas", and was intrigued by your very first post, where you state your memory manager can replace all 5 of FreeRTOS'es managers and the "fragmentation is no problem" statement. So I downloaded it to see if it really, IMO, could replace the memory managers provided by FreeRTOS. So here is my thoughts:

  1. I really, really miss a working example. This would might also answers some of my questions.
  2. If this meant as a replacement for the FreeRTOS memory managers, I would expect it to implement the pvPortMalloc, vPortFree, vPortInitialiseBlocks, and xPortGetFreeHeapSize as needed by FreeRTOS.
  3. Especially, where is the GetFreeHeapSize function - as your code allocates the clusters from the memory supplied to "heapCreate", how does I know how much memory is available afterward? Note: FreeRTOS itself does not seem to use xPortGetFreeHeapSize at the moment, but the user code can use the functions above (pvPortMalloc, vPortFree, vPortInitialiseBlocks, and xPortGetFreeHeapSize), so they need to be present.
  4. How does it replace the FreeRTOS "heap_5.c", which allows for the heap to be spread over multiple non-contigous blocks? E.g. NXP LPC4088 has a non-contigous memory map.
  5. Sometime you really, really need all memory you can get you hand on - in this case "heap_1.c" (only allocation, no freeing) comes in handy - this memory manager ("heap_1.c") has a extremely small overhead of SIZE_T bytes. Again, how does your code replace this particular memory manager?
  6. The heap_3.c memory manager is explicitly made to use the compiler malloc and free - think using FreeRTOS in a Windows or Linux environment, where we have memory swapping and (in some case) the ability to add memory if needed. This will be handled by the standard malloc/free functions on these platforms.
  7. I fail to see how is does away with the problem of fragmentation (from the wiki: "Fragmentation is no problem, the heap won't overflow")! Creating a heap with heap_create (let's say 2000 bytes + what is needed for the cluster), then allocate 500 bytes, 250 bytes, 500 bytes, 250 bytes and 500 bytes (the heap is now full), then free the two 250 bytes blocks and try to allocate 500 bytes - this fails (returns NULL) due to fragmentation, which is what I think of as an overflow. So your code will concatenate blocks, if possible, but so will heap_4.c.

So while the constant time allocation and deletion is indeed elegant, I can not see this can be used in place of all the existing memory managers, except possibly heap_4.c - in which case it should be extended to be thread safe. But "heap_4.c" is more memory efficient as it uses the freed blocks to store the free block information, so maybe your code should be considered as "heap_6.c" where we could trade memory for speed - I could indeed think of a few cases where this would be useful.

Best regards, Johnny Norre

svenbieg commented 7 months ago

Hi Johnny,

seems like You'll be the first having a memory-manager for free, let's give us a try!

1.) I have a working example for ESP32, please take with care! I only tested this one time.

heap.zip

2.) This is my implementation, it needs to be adjusted for FreeRTOS. Something like this:

`` vPort...(...)

ifndef UNICORE

EnterCriticalSection(); LockMutex(heap_mutex);

endif

heap...;

ifndef UNICORE

UnlockMutex(heap_mutex); LeaveCriticalSection();

endif

``

InitializeBlocks is unknown to me, i don't have multiple processes. Paging is too expensive, my memory-manager is already.

3.) The free heap-size is in the header: available=free+(size-used);

4.) You can add a next_heap pointer and some flags to the header, i only have one heap. You can find a good example in the ZIP-file above.

5.) Under extreme low memory conditions, i'd consider not using the heap at all.

6.) The standard-functions can be renamed by the linker to call Your vPortMalloc and vPortFree. If a process is trying to access a page that is not valid, the kernel gets called to provide it. Too expensive for any RTOS in my opinion.

7.) You are looking at memory-management in a too theoretical way. Your heap will have many KB, if it is full it is full. The meaning of not overflowing is, not allocating from the foot as long as possible. You will have to enter Your map while You are in it. This really seems to be a problem in IT, it is impossible with IBMs btree. A lot of students are trying to fix this right now.

I don't think so. You won't be writing an application in C anymore, You can have C++ for free. This is the only thing the Kernel should provide, the ability to get Your C++ application running. This is task-switching and memory-management. With Your memory-manager FreeRTOS can get fully certified, a lot of people are missing this feature.

Thank You for Your investigation and best regards,

Sven Bieg

svenbieg commented 7 months ago

Johnny,

i'm loading my C++ application right into the Kernel. I really want to know how it works.

Good luck!

svenbieg commented 7 months ago

Maybe You can find some answers here, too:

https://forums.freertos.org/t/real-time-memory-manager/19685

I'm online every day, please tell me about Your progress!

jnorre commented 7 months ago

Hi Sven, I think we are talking about two different things: I read it as you proposed to replace the 5 memory managers in FreeRTOS (of which you select one for your project) with the one you have made. These memory managers are required by FreeRTOS to allocate memory for tasks (and optionally, for application use) UNLESS you use the xxxStatic functions.

  1. I was just providing some feed-back regarding how FreeRTOS uses the selected memory manager and give a hint regarding why there many. I do not have the time right now to download and install ESP IDF to test the example, so I just peeked into the files you provided looking for an entry point [main()] and did not find it. I did not find a call to heap_create either - are you sure the file you linked has a working example?
  2. You write you do not have multiple processes? I assume your native language is german and my native is Danish - does "multiple processes" mean "multiple tasks"? If you do not have multiple tasks, why are you using FreeRTOS?
  3. As I mentioned, it is needed if your heap is to replace or suppliment any of the included heap options in FreeRTOS. The included heap options are complete - just select one and include it in the project. I would expect a suggested replacement/compliment heap to provide the same user friendlyness :-)
  4. As mentioned above, I do not think your have linked the correct file...
  5. Using the xxxStatic functions could definitely be the way to go in low memory situations. But if you are ready to discard a memory manager altogether, then you can see that a single memory manager can not fit all use cases.
  6. It is the other way around - heap_3.c is a thin wrapper around the FreeRTOS heap, the heap used by FreeRTOS. Remapping malloc and free to pvPortMalloc and vPortFree will result in an immediate stack-overflow as FreeRTOS will call e.g. pvPortMalloc, which calls malloc, which calls pvPortMalloc - remember the heaps provided by FreeRTOS are for use by FreeRTOS - the ability for the application to use these functions is a relatively new FreeRTOS invention - in the "old days" these function could only be used by FreeRTOS.
  7. I do not think so. I look at it in a very practical way for use by FreeRTOS. There are no single perfect heap allocator that works for every situation - your trades memory usage for speed for example. heap_1.c is super fast (even faster than your heap) and uses only size_t memory bytes (4 or 8 bytes for most uProcessors), but has no free. heap_3.c uses the malloc/free built into the compiler (so it uses no memory in itself, but the actual memory cost is determined by the compilers implemetation of malloc and free) , which for large targets (e.g. amd64) will normally provide a full memory manager, with memory reclaiming and memory consolidation through the use of memory remapping provided at processor level. heap_3.c uses with compilers for lesser targets might actually free anything because it is to expensive. heap_4.c is fast in most situations (but not constant time in all cases), can reclaim freed memory and uses just a little more memory than heap_1.c (and less than your heap). heap_5.c supports multiple non-contiguous memory areas, can allow and reclaim freed memory. I select the memory manager that is right for my project. And while I can see where your heap can be used as an alternative to heap_4.c (and if modified, heap_5.c) it can only be that - an alternative.

So this is my observation: The FreeRTOS heaps are primarily for use by FreeRTOS. The provided heaps covers a lot of use cases - personally I have been using heap_1.c, heap_2.c (before heap_4.c was recommended over heap_2.c), heap_3.c (enforced by Qualcomm) and heap_4.c. These have been selected on the basis of what was needed for the project at hand. If your heap has been selectable as heap_6.c, I would not have selected it for any of the projects I've been a part of, but I could - if heap_6.c was an option - see, cases where we would have to consider the use of heap_6.c. But it is not a general replacement for all the heap options - in my opinion.

Best regards, Johnny Norre

svenbieg commented 7 months ago

Good morning Johnny,

1.) Yes, this is a multi_heap component for ESP-IDF like heap_05

2.) I have multiple tasks and multiple cores. FreeRTOS provides Round Robin task-switching and locking primitives.

3.) I provided my implementation to show how real-time applications can be managed, i'm not working on FreeRTOS. Only heap_1 may be used in critical environments, but there is no deletion. If You implement my manager You are able to free memory.

4.) I have tested my heap in a simple Windows Console application with Visual Studio, You can paste the files in this repository.

5.) I'm not sure if there really is a practical use case. The operating system has to run my C++ application with some kind of hardware abstraction.

6.) You should be able to use my heap in Windows and Linux, too. Accessing an invalid page will call the kernel to provide it. My heap is a replacement for the standard-implementation too, it is fully featured.

7.) This is because You only use the heap and not provide one. You can see my heap as a replacement for heap_4 and heap_2 with the ability to get certified. heap_1 is not interesting anymore then. heap_3 and heap_5 can be built on top of it, so there is only one heap left.

It is not on me to do so, i'm not selling FreeRTOS. I just found this solution by the way while developing a database.

Maybe You have a look at my post on the FreeRTOS forum again, there is somebody familar with memory-management. He considers using it in his next project, the adaptation is really easy if You know how.

I can't promise You get my heap with FreeRTOS, but they probably will implement it for certification. You just have to wait.

See You!

jnorre commented 7 months ago

Hi,

In many cases heap_1.c is the way to go, because many embedded projects does not need to free or reallocate anything. If this is the case heap_1.c is perfect.

If freeing is needed, we can use heap_2.c, heap_3.c, heap_4.c or heap_5.c. Each of these provide freeing and reallocation. Each of these has certain advantages and disavantages as I tried to explain in the two previous comments.

Your heap likewise has some advantages and disavantages - it provides constant time allocation and deallocation, but it comes with the cost of a high memory use and for many embedded projects, memory is a very limited resource.

For example, if I use your library to create a heap of 2500 bytes, I can only allocate 20 blocks of 100 bytes - up to 500 bytes (20%) has been used internally by your code. This may be fine for your project, but it is not fine for my projects. Using heap_1.c or heap_4.c (I did not test the others) I can allocate 25 blocks of 100 bytes from a 2500 bytes heap. So the trade off for your heap is memory usage.

This is one reason why your heap can not be a general replacement for the heaps already provided by FreeRTOS. Another is the requirements by certain standards, like DO-178B which does not change just because we have a new heap manager - the problem with freed, dangling and wild pointers remains no matter which memory manager is used.

After examining the memory used for housekeeping by your heap I doubt it will be accepted into the FreeRTOS project as the memory disadvantage totally outweighs the constant time advantage.

Just my 5 cents, Johnny Norre

svenbieg commented 7 months ago

Thank You, Johnny!

I don't agree with You when You talk about short memory. I had about 300kB available with my memory-manager on ESP32, while having a display and a web-server running.

Your view on FreeRTOS is interesting anyways. I still believe my way is the easiest, not easy as You see in the source-code.

Best regards,

Sven Bieg

svenbieg commented 7 months ago

It's 8 byte per allocation with 32bit, no discussion. The size of the map is irrelevant, because it is free memory. My most common objects are Strings, 8 byte in size. A ref-count and a pointer to the buffer. Now You can calculate how fast even the smallest gaps disappear, also the map. If there is no gap, there is no map. For my applications i would increase the minimum block-size to 16 bytes, for the heap-manager itself it is 12 bytes.

Good evening,

Sven

svenbieg commented 7 months ago

Loose pointers?

https://github.com/svenbieg/Shared/tree/main/Default

I don't have anymore.

jnorre commented 7 months ago
#include <stdio.h>
#include <stdint.h>
#include "include/heap.h"

#define HEAPSIZE 2500
#define ALLOCSIZE 100
uint8_t heap_buffer[HEAPSIZE];

heap_handle_t hMyheap;
void* m[HEAPSIZE/ALLOCSIZE+1]; // Ideally we can allocate this number of blocks - the +1 will overflow the heap

int main() {
    int allocCount = 0;
    hMyheap = heap_create((size_t)heap_buffer,HEAPSIZE);

    while (1) {
        m[allocCount] = heap_alloc(hMyheap,ALLOCSIZE);
        if (!m[allocCount]) {
            printf("%d buffers was allocated\r\n",allocCount);
            return 0;
        }
        allocCount++;
    }
    return 0;
}

Compiled with on Ubuntu 22.04 for linux amd64 architecture (using gcc block_map.c cluster_group.c heap_block.c parent_group.c offset_index.c heap.c test.c -I include -o test -g - this allocated 20 buffers - not 25 buffers as expected (and as the FreeRTOS heaps does, except heap3.c which uses malloc and all system memory). Single stepping the code into heap_alloc line 42 in heap.c the line size=heap_block_calc_size(size); sets size to 120! This is 20 bytes more than the 100 bytes requested. So on a 64 bit achitecture (size_t = 8 bytes) it is indeed 20 bytes overhead per allocation. The heap_1.c and heap_2.c in FreeRTOS has 0 (zero) bytes in overhead per allocation, heap_4.c and heap_5.c a little overhead for small allocations (to allow space for the free map).

Best regards, Johnny Norre

svenbieg commented 7 months ago

heap_block_calc_size(size); sets size to 120!

The calculation is the minimum multiple of 4 bytes + 8 bytes. 100 -> 108

static inline size_t heap_block_calc_size(size_t size) { return align_up(size, sizeof(size_t))+2*sizeof(size_t); }

Maybe there is something wrong in align_up, this function is relativelly new.

static inline size_t align_up(size_t value, size_t align) { return (value+align-1)&~(align-1); }

I get this from Copilot:

return x + (alignment - x % alignment) % alignment;

Thank You very much, i didn't recognize. I've updated my repository.

svenbieg commented 7 months ago

Here is my test:

HeapTest.zip

svenbieg commented 7 months ago

I'm still not able to give You any warranty on my code, my test is crashing with 64 bit now.

Sorry, i've been working all day. It is just too much for me at the moment.

jnorre commented 7 months ago

Hi Sven,

The 20 bytes comes from 2*sizeof(size_t) = 16 + alignment from 100 bytes to 104 on a 64-bit platform. On a 32-bit platform, 2*sizeof(size_t)=8 + 0 bytes alignment, so we are both right - it depends on the platform. Peeking at you example, remember to make tests with uneven allocation sizes and dis-uniform deallocation - the test you have now is fine, but it does not stress the heap in any way. From another test I did of your heap, it seems like it has a "first fit" approach, where e.g. FreeRTOS heap_2.c has a "best fit" approach, so it will reuse the memory that fits best into a free block - this makes better use of freed memory for the cost of more time used. Regarding having 300k remaining on your ESP32 platform - that's fine, but my platform could be a STM32F103x4 with 16k Flash and 6k RAM - here every byte counts if you need to use FreeRTOS. Finally, remember - I am looking at your heap library exclusively to see if it can be used as a heap in FreeRTOS - I am not out to criticize your work at all or comment on how it fit into your projects. But for FreeRTOS, it uses to much memory and most well-behaved FreeRTOS cased programs only allocates tasks, queues, semaphores, mutexes, timers and event groups (the stuff that FreeRTOS use the heap for) once in the initialization and uses these throughout the program, so allocation times and deallocation times are of little or no use for these programs. Your use case for the heap is different - I get that and wish you good luck with the heap and your database project.

Best regards, Johnny Norre

svenbieg commented 7 months ago

Please don't get me wrong, but You are still wrong. My heap is providing best fit at constant low time with minimal overhead. I also did a test with variable sized buffers and couldn't even find Your problem that we have solved together. I have made a stress-test where i deleted every second entry and allocated again. Fragmentation doesn't even slow down my manager. It does, but like with every pyramidal directory, You need 10 times more gaps to double my 60 microseconds at average. Like i said, with only 6kB of RAM i'd consider not using the heap at all. My problem is Elon Musk on TV, having multiple billions of dollars and not even one memory-manager. I guess You are here one genetation before me. Every smart-watch has enough memory to drive FreeRTOS with my memory-manager. I don't want to force You, it is not in our hands.

Please feel free to contact me anytime, maybe You can find a mistake in my code before me.

Good night my new friend from Denmark!

jnorre commented 7 months ago

To me, and from the diagram on the wiki, your clusters looks like an unbalanced 2-3-tree - invented in by John Hopcroft in 1970. Binary trees, 2-3 trees, b-trees, balanced and unbalanced trees was taught when I went to university - I sit here with the book "Data structures and algorithms", by Aho, Hopcroft, and Ullman, ISBN 0-201-00023-7 from 1983 which has these data structures and many more.

If the heap uses "best fit" it will be able to complete this program without any allocation errors (compile as mentioned above):

#include <stdio.h>
#include <stdint.h>
#include <assert.h>
#include "include/heap.h"

#define HEAPSIZE 1750
uint8_t heap_buffer[HEAPSIZE];

heap_handle_t hMyheap;
void* m[5]; 

int main() {
    int allocCount = 0;
    hMyheap = heap_create((size_t)heap_buffer,HEAPSIZE);

    m[0] = heap_alloc(hMyheap,100);
    m[1] = heap_alloc(hMyheap,200);
    m[2] = heap_alloc(hMyheap,300);
    m[3] = heap_alloc(hMyheap,400);
    m[4] = heap_alloc(hMyheap,500); // 1500 bytes allocated

    heap_free(hMyheap,m[1]);
    heap_free(hMyheap,m[2]);
    heap_free(hMyheap,m[3]); // 900 bytes freed

    m[1] = heap_alloc(hMyheap,400);
    assert(m[1]);
    m[2] = heap_alloc(hMyheap,300);
    assert(m[2]);
    m[3] = heap_alloc(hMyheap,200); // 900 bytes reallocated, but in reverse order
    assert(m[3]);

    return 0;
}

It did not in a previous test I conducted, but now it just crashes inside the heap library at the second heap_free in line 23.

I do not want to use anymore time on this subject so I will unsubscribe from this issue.

Best regards, Johnny Norre

svenbieg commented 7 months ago

With btree it is just impossible to remove an entry while adding one. Clusters are different giving You a better overview. Believe it or not!