mtrebi / memory-allocators

Custom memory allocators in C++ to improve the performance of dynamic memory allocation
MIT License
1.75k stars 160 forks source link

There's a little confusion about the diagram #1

Closed chenbowalker closed 6 years ago

chenbowalker commented 6 years ago

hi, So lucky,I meet your github, your code of memory-allocator is very nice.But i can not understand the chart. (Time complexity of allocations/free blocksize = 4096)From the result of code(execute main), i don't get any correlation. I would be very grateful if you can answer my puzzlement. thank you.

mtrebi commented 6 years ago

Hey, I am glad you like my memory-allocators and I really appreciate that you take time to read what I did. This chart represents the time complexity O when allocating and deallocating (or freeing) memory blocks of constant size equals to 4096 bytes. You can get the exact numbers used for this chart by executing single allocations and deallocations with only one block of size 4096. I'm writing this from memory but It would look something like this:

Allocator * cAllocator = new CAllocator();
Allocator * linearAllocator = new LinearAllocator(1e9);
Allocator * stackAllocator = new StackAllocator(1e9);
Allocator * poolAllocator = new PoolAllocator(16777216, 4096);
Allocator * freeListAllocator = new FreeListAllocator(1e8, FreeListAllocator::PlacementPolicy::FIND_FIRST);

Benchmark benchmark(1e1);

std::cout << "C" << std::endl;
benchmark.SingleAllocation(cAllocator, 4096, 8);
benchmark.SingleFree(cAllocator, 4096, 8);

std::cout << "LINEAR" << std::endl;
benchmark.SingleAllocation(linearAllocator, 4096, 8);

std::cout << "STACK" << std::endl;
benchmark.SingleAllocation(stackAllocator, 4096, 8);
benchmark.SingleFree(stackAllocator, 4096, 8);

std::cout << "POOL" << std::endl;
benchmark.SingleAllocation(poolAllocator, 4096, 8);
benchmark.SingleFree(poolAllocator, 4096, 8);

std::cout << "FREE LIST" << std::endl;
benchmark.SingleAllocation(freeListAllocator, 4096, 8);
benchmark.SingleFree(freeListAllocator, 4096, 8);

delete cAllocator;
delete linearAllocator;
delete stackAllocator;
delete poolAllocator;

Hope this helps!

chenbowalker commented 6 years ago

Hey,I am very happy to receive your reply. Thank you for taking the time to answer my doubts. From the result of executing source code at your github, I find the Time elapsed of C is faster than FREE LIST. I do not know if my method is correct. The test result as follows:

C
                Operations:     10
        Time elapsed:   0.00252 ms
        Memory peak:    0 bytes

        Operations:     10
        Time elapsed:   0.005174 ms
        Memory peak:    0 bytes

        Operations:     10
        Time elapsed:   0.001006 ms
        Memory peak:    0 bytes

        Operations:     10
        Time elapsed:   0.006157 ms
        Memory peak:    0 bytes

        Operations:     10
        Time elapsed:   0.005923 ms
        Memory peak:    0 bytes

        Operations:     10
        Time elapsed:   0.013787 ms
        Memory peak:    0 bytes

        Operations:     10
        Time elapsed:   0.025704 ms
        Memory peak:    0 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       32
    Alignment   8
        Operations:     10
        Time elapsed:   0.00427 ms
        Memory peak:    0 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       64
    Alignment   8
        Operations:     10
        Time elapsed:   0.007967 ms
        Memory peak:    0 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       256
    Alignment   8
        Operations:     10
        Time elapsed:   0.002314 ms
        Memory peak:    0 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       512
    Alignment   8
        Operations:     10
        Time elapsed:   0.005344 ms
        Memory peak:    0 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       1024
    Alignment   8
        Operations:     10
        Time elapsed:   0.00614 ms
        Memory peak:    0 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       2048
    Alignment   8
        Operations:     10
        Time elapsed:   0.00693 ms
        Memory peak:    0 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       4096
    Alignment   8
        Operations:     10
        Time elapsed:   0.01332 ms
        Memory peak:    0 bytes

    BENCHMARK: ALLOCATION
        Operations:     10
        Time elapsed:   0.00337 ms
        Memory peak:    0 bytes

    BENCHMARK: ALLOCATION/FREE
        Operations:     10
        Time elapsed:   0.003876 ms
        Memory peak:    0 bytes

LINEAR
        Operations:     10
        Time elapsed:   0.014144 ms
        Memory peak:    320 bytes

        Operations:     10
        Time elapsed:   0.029473 ms
        Memory peak:    640 bytes

        Operations:     10
        Time elapsed:   0.023813 ms
        Memory peak:    2560 bytes

        Operations:     10
        Time elapsed:   0.022343 ms
        Memory peak:    5120 bytes

        Operations:     10
        Time elapsed:   0.021823 ms
        Memory peak:    10240 bytes

        Operations:     10
        Time elapsed:   0.021507 ms
        Memory peak:    20480 bytes

        Operations:     10
        Time elapsed:   0.021526 ms
        Memory peak:    40960 bytes

    BENCHMARK: ALLOCATION
        Operations:     10
        Time elapsed:   0.023423 ms
        Memory peak:    40960 bytes

STACK
        Operations:     10
        Time elapsed:   0.01127 ms
        Memory peak:    400 bytes

        Operations:     10
        Time elapsed:   0.026964 ms
        Memory peak:    720 bytes

        Operations:     10
        Time elapsed:   0.023693 ms
        Memory peak:    2640 bytes

        Operations:     10
        Time elapsed:   0.023313 ms
        Memory peak:    5200 bytes

        Operations:     10
        Time elapsed:   0.022683 ms
        Memory peak:    10320 bytes

        Operations:     10
        Time elapsed:   0.022577 ms
        Memory peak:    20560 bytes

        Operations:     10
        Time elapsed:   0.022637 ms
        Memory peak:    41040 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       32
    Alignment   8
        Operations:     10
        Time elapsed:   0.023883 ms
        Memory peak:    41040 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       64
    Alignment   8
        Operations:     10
        Time elapsed:   0.03375 ms
        Memory peak:    41040 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       256
    Alignment   8
        Operations:     10
        Time elapsed:   0.02418 ms
        Memory peak:    41040 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       512
    Alignment   8
        Operations:     10
        Time elapsed:   0.024647 ms
        Memory peak:    41040 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       1024
    Alignment   8
        Operations:     10
        Time elapsed:   0.02554 ms
        Memory peak:    41040 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       2048
    Alignment   8
        Operations:     10
        Time elapsed:   0.02827 ms
        Memory peak:    41040 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       4096
    Alignment   8
        Operations:     10
        Time elapsed:   0.03263 ms
        Memory peak:    41040 bytes

    BENCHMARK: ALLOCATION
        Operations:     10
        Time elapsed:   0.02433 ms
        Memory peak:    41040 bytes

    BENCHMARK: ALLOCATION/FREE
        Operations:     10
        Time elapsed:   0.031793 ms
        Memory peak:    41040 bytes

POOL
        Operations:     10
        Time elapsed:   9.2315 ms
        Memory peak:    40960 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       4096
    Alignment   8
        Operations:     10
        Time elapsed:   8.74952 ms
        Memory peak:    40960 bytes

FREE LIST
        Operations:     10
        Time elapsed:   0.015632 ms
        Memory peak:    480 bytes

        Operations:     10
        Time elapsed:   0.026728 ms
        Memory peak:    800 bytes

        Operations:     10
        Time elapsed:   0.015414 ms
        Memory peak:    2720 bytes

        Operations:     10
        Time elapsed:   0.017083 ms
        Memory peak:    5280 bytes

        Operations:     10
        Time elapsed:   0.018994 ms
        Memory peak:    10400 bytes

        Operations:     10
        Time elapsed:   0.025057 ms
        Memory peak:    20640 bytes

        Operations:     10
        Time elapsed:   0.037432 ms
        Memory peak:    41120 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       32
    Alignment   8
        Operations:     10
        Time elapsed:   0.024312 ms
        Memory peak:    480 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       64
    Alignment   8
        Operations:     10
        Time elapsed:   0.016451 ms
        Memory peak:    800 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       256
    Alignment   8
        Operations:     10
        Time elapsed:   0.015763 ms
        Memory peak:    2720 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       512
    Alignment   8
        Operations:     10
        Time elapsed:   0.017469 ms
        Memory peak:    5280 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       1024
    Alignment   8
        Operations:     10
        Time elapsed:   0.019485 ms
        Memory peak:    10400 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       2048
    Alignment   8
        Operations:     10
        Time elapsed:   0.024294 ms
        Memory peak:    20640 bytes

BENCHMARK: ALLOCATION/FREE
    Size:       4096
    Alignment   8
        Operations:     10
        Time elapsed:   0.040874 ms
        Memory peak:    41120 bytes

    BENCHMARK: ALLOCATION
        Operations:     10
        Time elapsed:   0.025363 ms
        Memory peak:    5472 bytes

    BENCHMARK: ALLOCATION/FREE
        Operations:     10
        Time elapsed:   0.018909 ms
        Memory peak:    5472 bytes

test environment:

another question, what does 'N operations(100001,etc)' means at this chart? Finally , thanks again for your reply.

mtrebi commented 6 years ago

Hey chenbowalker, that's totally correct! To clarify your doubt, let's stay by the end. N operations is the number of allocations and deallocations that are performed during the test, i.e.: if N operations equals 1, only one allocation and one deallocation will be performed. Now that this is clarified, let's focus on why the C allocator is faster. As I explain in this readme section C allocator turns to be drastically slower when it has to ask the OS for more memory. Before reaching this point, the performance of malloc is pretty nice. The problem is that you can't know in advance when this gonna happens and when it happens, it slow downs the process. In the previous example, since the number of operations (allocations and deallocations) is pretty low, just 10, it is very likely that the benchmark execution on the C allocator is performed before reaching the previously mentioned point. Or, if it is reaching this point, since the memory required to do the allocations is small, it doesn't take the OS too much time. Thus, the performance of the C allocator is better than the Freelist allocator. Nevertheless, I would like to point out that the performance of the Freelist is much more stable as the block size is increased in comparision to the C allocator. Probably because as we need more memory, we need to ask the OS several times for bigger chunks of memory. Take a look at the next table:

Blocksize / Allocator C Allocator Freelist Allocator
32 bytes 0.002520 ms 0.015632 ms
4096 bytes 0.025704 ms 0.037432 ms

As I told you before, you're right when you say that the C allocator is faster. However, look how related are the numbers. While the Freelist allocator takes almost 2.5x more time to allocate a block of 4096 bytes than to allocate a 32 bytes block, the C allocator takes 10x more time. As the number of operations and the block size grows, you can expect this difference to get bigger and bigger.

chenbowalker commented 6 years ago

Thank you very much for your detailed reply. You have completely clarified my confusion. So I am going to do more tests about 'N operations' to discover further secrets. Thanks again!