espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0
13.78k stars 7.31k forks source link

ESP32s3 heap capability allocator leads to memory fragmentation. (IDFGH-12587) #13588

Open PaulMartinsen opened 7 months ago

PaulMartinsen commented 7 months ago

Answers checklist.

General issue report

Please rethink the capability heap's allocation strategy.

Currently malloc appears to allocate from the largest heap first. This led to a fragmentation related allocation fault. It would seem better to place smaller objects in smaller heaps first by default. I think the largest heap is just the piece of contiguous memory left after statically scoped variables are placed by the complier.

Details

Yesterday I ran out of heap memory. More precisely, I had about 100k free of MALLOC_CAP_8BIT capable memory, needed 36k but the largest free block was only 34k. I was sad.

It appears the ESP32s3 (and probably other devices):

  1. has five heaps: 1 in PSRAM, 3 in internal RAM (two of which appear equivalent) and 1 in RTC fast RAM.
  2. the internal RAM heaps were roughly 200k, 20k and 32k. The 200k and 20k are in SRAM1 while the 32k one is in SRAM2.
  3. malloc fills up the 200k ram first. This is why I was unable to obtain a 36k block of RAM.

For reference, here's a summary of my memory disposition: image

Questions

  1. Is it possible, using heap_caps_malloc to place an object in heap 3 or 4 in the diagram above? I can't see any combination of MALLOC_CAP_* flags that would let me do this before heap 2 was too small for the allocation.
  2. Is there any documentation on replacing the default heap strategy with a custom one? I haven't been able to find it yet, unfortunately.
  3. Is it possible for heap 2 and heap 3 to be combined into a single heap? Its not clear to me why these show up as two separate heaps. I saw some kind of post-init cleanup/heap gathering in the source. Is it related to this?
SoucheSouche commented 7 months ago

Hi @PaulMartinsen, thanks for reaching out to us concerning this problem. Let me first address your questions.

  1. It should be possible to allocate from either heaap2/3 or heap4 based on the CAPS from the table you shared. But unfortunately choosing between heap 2 and heap 3 is not.
  2. you can't currently replace the heap allocation strategy but you can create a new heap at runtime in e.g., heap2 and then allocate in your newly created heap region. For that you can use heap_caps_add_region_with_caps() in esp_heap_caps_init.h. That can be a workaround that gives you the choice of where to allocate.
  3. Upon startup a part of the memory has to be reserved because it is used as a stack. After the system has started, this reserved area is no longer needed and can be then used as heap. The currently implemented solution is to create heaps where possible upon startup and leave out those reserved area. When those areas are no longer needed as stacks, the heap component creates a new heap in place of those reserved areas. That is why you can observe that the RAM memory region is split into 2 different heap even though it could technically be covered by 1 heap only.

I am currently working on an alternative approach to the startup stack problem that will result in creating a single heap per memory region. That will solve the issue you are facing at the moment.

PaulMartinsen commented 7 months ago

Thanks @SoucheSouche for your quick and helpful comments. Particularly for clarifying where heap 3 comes from. Your new approach to join heaps 2 & 3 sounds great and a lot better than allocating into heap 3 explicitly (which would be gone with the new approach). I think this will be a big step forward for us.

When I enabled support for malloc to use PSRAM (possibly a temporary work-around), I noticed there's an option to reserve a section of internal ram that can be used by the capability allocator and is not used by malloc. This does go into heap 2/3. I don't think we will need this, but I did wonder if having an option to place this in "heap 4" would be straight forward and worthwhile.

Thanks again for your comments and work on the memory management.

SoucheSouche commented 7 months ago

@PaulMartinsen, no problem. we can leave this ticket open and I will link it to the merge request for this heap initialization update so it gets closed when the new implementation gets merged in master.

The modifications are quite big so it might take a little while before I can merge.

JamieDriver commented 5 months ago

Also seeing similar issues - unable to start camera because it wants to make a ~30kb dma-capable allocation, but largest single block of dram is only ~28kb, even though >80kb free dram.

esp32 classic does not have this problem afaics because the smaller heaps are first, so get used first, leaving most of the final (largest) heap free.
esp32s3 has the largest heap first, so that gets (partially) used, leaving part of it, and the smaller heaps, free - but no single large block available.

Reordering the [equivalent DRAM] heaps such that the smaller ones are searched/used before the larger would be a huge win, I'm sure.

AxelLin commented 4 months ago

@SoucheSouche How is the status of this fix? https://github.com/espressif/tlsf/pull/2

AxelLin commented 2 months ago

@SoucheSouche How is the status of this fix? espressif/tlsf#2

@SoucheSouche When will esp-idf update tlsf submodule to get above fix? Will you update tlsf for release branches as well?

SoucheSouche commented 2 months ago

@AxelLin, the MR to update the tlsf submodule is created and under review now. No backport can be done for https://github.com/espressif/tlsf/pull/2 because backporting implies patching the ROM version of the TLSF and since this PR updates static inline functions, it simply can't be done.

@PaulMartinsen the new implementation of the heap init process is stalled because of ROM TLSF backport issue as well. In the mean time, I will update the heap init to sort the heaps by increasing size to fix your issue and backport this fix.

SoucheSouche commented 1 month ago

@PaulMartinsen, I created a fix that sorts the heap by increasing size. It is currently under review.

AxelLin commented 1 month ago

@AxelLin, the MR to update the tlsf submodule is created and under review now. No backport can be done for espressif/tlsf#2 because backporting implies patching the ROM version of the TLSF and since this PR updates static inline functions, it simply can't be done.

@SoucheSouche I don't quite understand this.

I notice the v5.4 branch includes the fix espressif/tlsf#2 . Now I'm using v5.2 branch. If rebuild my application using v5.4 branch and OTA firmware on the device, will it work?

AxelLin commented 2 weeks ago

@PaulMartinsen, I created a fix that sorts the heap by increasing size. It is currently under review.

@SoucheSouche How is the status now? Could you share the commit for this fix?

SoucheSouche commented 2 weeks ago

@AxelLin, the fix has been reviewed and will be merged soon. The commit will be made available when the work is merged and synced with GH.