The current memory allocator we use is very primitive and has both poor performance, while being prone to fragmentation. Rather than designing our own allocator from scratch (which would be a waste of time), we should implement the Two-Level Segregated Fit (TLSF) algorithm.
Some design goals for this implementation are:
We should be able to add and remove memory pools on the fly to the allocator.
We might want to eventually use a single allocator for all geometry, rather than an allocator per region. Since we can't pre-allocate gigabytes of memory at startup, we will need to dynamically resize the allocator at runtime by adding and removing pools of memory (OpenGL buffers).
These memory pools would likely be fairly large (at least 64 MB), and the total address space of the allocator does not need to be larger than 32 GiB. This creates a limit where there would at most be 512 memory pools.
The time complexity of allocating and freeing memory should not grow with the number of allocations. This will become even more important if we only have one allocator instance per world.
Defragmenting the memory pool should be possible on an incremental basis, without introducing significant latency spikes.
Right now, we can afford to defragment the entire allocator at once for a region, since the amount of memory managed is fairly small (<64 MB), but this won't work if there's only one allocator per world.
Alignment does not need to be configurable at runtime. For relevant purposes (using the memory as UBO, SSBO, TBO, or VBO) we only need an alignment of 16 bytes to cover all cases.
The minimum allocation size can be fairly large (at least 64 bytes?), since the use case is for geometry where allocations contain at least one primitive.
Keeping the minimum allocation size large may also reduce bookkeeping overhead, since the minimum allocation size also becomes the minimum alignment size, which makes it possible to use compressed 32-bit pointers (i.e. shift the low bits out since they're always zero.)
The current memory allocator we use is very primitive and has both poor performance, while being prone to fragmentation. Rather than designing our own allocator from scratch (which would be a waste of time), we should implement the Two-Level Segregated Fit (TLSF) algorithm.
Some design goals for this implementation are: