don't allocate large block with MEM_TOP_DOWN by default

gabr42 commented 5 years ago

As it turns out (analysis here: https://www.thedelphigeek.com/2019/04/fastmm4-large-memory.html), allocating memory blocks with MEM_TOP_DOWN can incur a 5-15% performance penalty.

I propose adding a conditional symbol, AllocateLargeBlocksTopDown. If it is defined, large blocks are allocated with MEM_TOP_DOWN; otherwise they are not.

By default, AllocateLargeBlocksTopDown should not be defined.

pleriche commented 5 years ago

Hi Primoz,

Apologies for the long delay in responding. Things have been crazy busy.

With regard to MEM_TOP_DOWN being used for large blocks: The performance penalty is a known factor, but there are benefits to allocating large blocks from the top down that outweigh the performance cost - at least in the tests I have done.

Segregating large and medium block pools at the two ends of the address space reduces address space fragmentation: Medium block pools are of a fixed size, so when a medium block pool is freed the space it occupied will most likely be reusable for the next medium block pool that is allocated. If large blocks were allowed to intersperse with medium blocks then fragmentation would increase. Additionally, the "segmented large block" feature would not work well if large blocks were interspersed with medium block pools.
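The fragmentation argument above can be illustrated with a toy model. The sketch below is not FastMM4 code and does not call the Windows API: it treats the address space as a small array of units, uses a first-fit search, and picks arbitrary sizes (`POOL_SIZE`, `space`, `rounds`, and `large_size` are all assumptions made for illustration). Medium block pools are always placed bottom-up; large blocks are placed either bottom-up or top-down. After all large blocks are freed, it reports the largest contiguous free run.

```python
POOL_SIZE = 4  # hypothetical fixed size of a medium block pool, in units


def find_gap(used, size, top_down):
    """Return the start of a free run of `size` units, or None if none fits."""
    starts = range(len(used) - size + 1)
    if top_down:
        starts = reversed(starts)  # search from the highest address down
    for start in starts:
        if not any(used[start:start + size]):
            return start
    return None


def allocate(used, size, top_down=False):
    """Mark `size` units as used and return their start, or None on failure."""
    start = find_gap(used, size, top_down)
    if start is not None:
        used[start:start + size] = [True] * size
    return start


def release(used, start, size):
    """Mark a previously allocated range as free again."""
    used[start:start + size] = [False] * size


def largest_free_run(used):
    """Length of the longest contiguous run of free units."""
    best = run = 0
    for unit in used:
        run = 0 if unit else run + 1
        best = max(best, run)
    return best


def simulate(large_top_down, space=32, rounds=3, large_size=6):
    """Alternate medium-pool and large-block allocations, then free the large blocks."""
    used = [False] * space
    large_starts = []
    for _ in range(rounds):
        allocate(used, POOL_SIZE)  # medium pool, always bottom-up
        large_starts.append(allocate(used, large_size, large_top_down))
    for start in large_starts:
        if start is not None:
            release(used, start, large_size)
    return largest_free_run(used)


print("large blocks bottom-up:", simulate(large_top_down=False))
print("large blocks top-down: ", simulate(large_top_down=True))
```

In this toy scenario the interleaved layout leaves the freed large-block space split into separate holes between the medium pools, while the segregated layout leaves one contiguous region. It says nothing about how large the effect is in a real process, which depends on the actual allocation pattern.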

This design decision was made before the 64-bit version was added (under which address space fragmentation is less of a concern), but even there the segmented large block feature would be less effective if large blocks were not allocated top-down.

Best regards, Pierre

gabr42 commented 5 years ago

Given the "segmented large block" problem, it is probably best to discard this pull request. Thanks for the explanation!

Dave-Novo commented 5 years ago

I think that leaving it as a compiler directive, with a proper explanation of the pros and cons, is still better than leaving it as is. I am not sure how Pierre quantified the penalty induced by memory fragmentation vs. the speed penalty of memory allocation with MEM_TOP_DOWN, but I am sure that it is situationally dependent. For those of us who do a lot of large block allocation, the speed benefit may outweigh any excess fragmentation. Fragmentation needs to be assessed in a typical usage scenario for a specific application, so having the compiler define there at least allows us to do this profiling easily.

baka0815 commented 5 years ago

@gabr42 I would second @Dave-Novo. Please include the compile switch in the include file (see my comment above) and disable it by default. That way one can choose whether the speed benefit outweighs the possible fragmentation.

pleriche / FastMM4

don't allocate large block with MEM_TOP_DOWN by default #75