Closed vhirtham closed 6 years ago
The general purpose memory is capable to deliver memory of desired size and alignment. Additionally allocations and deallocations are not restricted to any specific order like in some stack approaches.
There are some performance and memory costs involved to obtain this freedom:
Memory costs
The size of occupied memory blocks is stored in memory in front of the returned address during allocation. This information is needed during deallocation. Additionally the used alignment is stored in the preceeding byte of the returned memory pointer. Therefore the real size of each allocation is increased by the size of a size_t
(occupied space information) and the chosen alignment value.
The size of a free memory block is stored at the beginning of each free memory block (in free memory). Additionally the address of the next free memory block is stored. This enables one to traversing all free memory blocks as fast as possible and find a fitting free one during allocation. This requires from each free memory block to be large enough to store a size_t
and a `void*. If there is some memory left after an allocation and the size is not large enough, the memory is added to the allocation and therefore lost for future allocations until the allocated block is freed.
The free choice of allocation size and the unrestricted order of allocations and deallocations lead to fragmentation problems. Therefore the total memory should be larger than the theoretical maximum memory requirement of the program. Otherwise an allocation might fail, even though the total amount of memory would be enough.
Performance costs
During allocation and deallocation the linked list of free memory blocks needs to be updated if necessary. This adds some extra costs when compared to simpler memory models like memory pools or stacks. The involved write operations are more or less constant time, but during allocation and deallocation the linked list needs to be traversed. In case of the allocation it is traversed to find a free block of proper size. During deallocation the previous and following free memory block need to be found in order to update the linked list and eventually merge the newly freed memory with them. The costs of this list traversal highly depend on the memory fragmentation.
Necessary steps:
There are the following branches that need to be distinguished:
This makes 8 different cases in total. The consequences for each case are shown below. Prior to commit 082be6adf7b5eb60abd5a69bf8f5e249f9c38924 the implementation was exactly as in the list below. Afterwards it was simplified by reducing code duplications.
Results for all branch combinations:
FF = false && LF = false && MF = false
FF = false && LF = true && MF = false
FF = false && LF = false && MF = true
FF = false && LF = true && MF = true
FF = true && LF = false && MF = false
FF= true && LF= true && MF = false
FF= true && LF = false && MF = true
FF = true && LF = true && MF= true
Necessary steps:
These three subtasks of the section title have some common branches and are therefore joined. There are the following branches that need to be distinguished:
This makes 16 different cases in total. Not all of them need to be treated since the firs free element can't have a previous free block etc. Excluding all unnecessary branch combinations leaves 9 different cases that need to be treated. The cases and their consequences are shown below. Prior to commit 151ebf2c7f82c5182f693272e0ce970f08424229 the implementation was exactly as in the list below. Afterwards it was simplified by reducing code duplications.
Results for all possible branch combinations:
FF = true && LF = true && PM = --- && NM = ---
FF = true && LF = false && PM = --- && NM = true
FF = true && LF = false && PM = --- && NM = false
FF = false && LF = true && PM = true && NM = ---
FF = false && LF = true && PM = false && NM = ---
FF = false && LF = false && PM = true && NM = true
FF = false && LF = false && PM = false && NM = true
FF = false && LF = false && PM = true && NM = false
FF = false && LF = false && PM = false && NM = false
Finished with commit a16e50fdaddff9e3959975b601b737df67a77c81
General purpose memory / allocator
previous and thenext allocated object (or allocator memory startor end if there is no other object)doublelinked list approach in free memory to keep track of the gaps. Have a look at the implementation of the memory pool. This might change the structure of the header needed for allocated objects (see bullet point above).Memory pool / pool allocator
firstOnly numbers of first elements in a gap are stored and the number of following empty spotsMerge gaps regularly if possible. For example: every 100 allocs sort the gap list and merge gaps that are connected.Memory stack / stack allocator
Memory Manager Structure
Remarks
Class internals
Testing
Weblinks
ToDo
DeallocationMark CreateDeallocationMark()
Write dynamic allocator, which is allowed to select the source of memory (basically pool or general purpose depending on size)This would result in the same allocator as the pool allocator, since it backups to the general purpose memory if no fitting memory pool is found.Use std::align_val_t as type for alignment values in every implemented memory systemnot ideal for this lib and the implemented memory systemsAssertion that memory manager is deinitialized in dtorassertion hides thrown exceptionsFORCE_STD_ALLOCATOR
-> This should replace all custom allocator types with a simple typedef using std::allocator. (Maybe make it an option in CMake - better start a list of definitions which affect the library).FORCE_STD_ALLOCATOR
enabledFORCE_STD_ALLOCATOR
is defined, it simply does not perform theBOOST_CHECK
callsThink of using bad alloc exception instead of the GDL::Exception. In this case the exception classshould also use the GDL string. Maybe add a macro STD_EXCEPTIONException is derived from std::runtime_error. Can't change the string typeTests
#ifdef
to create multiple tests with different memory setups in the memory manager. This way there is no need for multiple test files. Just create multiple tests with the same file and different definitionsdynamic allocatornot needed, just use the pool allocatorBenchmark Check if using google benchmark for this test does make sense.
Measure std::allocator reference where new is able to use same memory blocks over and over again -> construction, allocation, destruction, repeat - this is the fastest possible new allocationThis can be done by setting the number of test runs in the current benchmark to 1 or higher