File buffer on heap - Githubissues

khassounah commented 5 years ago

Hello. Please consider merging this pull request.

When serving very large files, or a large number of files, the current implementation experiences stack overflow failures. This change allocates the needed buffer on the heap rather than on the stack, and only allocates the buffer if/when needed right before the read and send operation. The change made it possible for me to serve significantly larger files (tested up to 15 2MB files being served simultaneously, which consistently failed before the change).

chmorgan commented 5 years ago

@khassounah how is this fixing a stack overflow issue? The byte array is on the stack but only allocated once in the get call. Can you provide more details?

khassounah commented 5 years ago

@chmorgan the stack space available is much smaller than heap. Depending on the call stack and what has been allocated in the task, this was failing quite consistently. While allocating 1k on the stack might work in certain cases, it is close to 25% of the heap (and 50% if the default esp-idf configuration is not changed).

tidklaas commented 5 years ago

Allocating 2kB on the stack (1kB for data buffer, 1kB for file path buffer) does seem to be a bit excessive. Maybe these should be changed to more reasonable values, say 512 and 128 bytes.

Dynamic allocation, on the other hand, introduces some additional problems. How do you handle a malloc() failure? Retry? How many times? Just close the connection? Then how will the receiver know that the file transfer has been truncated?

Besides, constantly allocating and releasing memory might lead to heap fragmentation, although that depends mainly on the allocation algorithm used by the system.

chmorgan commented 5 years ago

Let me review the code. I spent a lot of time, several days, going through all memory allocations in the system to remove extra ones, shift some to heap allocated structures etc.

I’ll try to check today sometime.

Chris

On Thu, Jun 20, 2019 at 5:21 AM Tido Klaassen notifications@github.com wrote:

Allocating 2kB on the stack (1kB for data buffer, 1kB for file path buffer) does seem to be a bit excessive. Maybe these should be changed to more reasonable values, say 512 and 128 bytes.

Dynamic allocation, on the other hand, introduces some additional problems. How do you handle a malloc() failure? Retry? How many times? Just close the connection? Then how will the receiver know that the file transfer has been truncated?

Besides, constantly allocating and releasing memory might lead to heap fragmentation, although that depends mainly on the allocation algorithm used by the system.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/chmorgan/libesphttpd/pull/69?email_source=notifications&email_token=AAJH4AFUQQJYR4356GGVZLLP3NDYNA5CNFSM4HZGC3W2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYE2UFQ#issuecomment-503949846, or mute the thread https://github.com/notifications/unsubscribe-auth/AAJH4ACLOF2UXDZFX3ST4QLP3NDYNANCNFSM4HZGC3WQ .

khassounah commented 5 years ago

@tidklaas You make a good point. The challenge with making it smaller is that it multiplies the number of flash reads/network sends by the same factor, and from my testing it will get much slower for small files and extremely slow for large files (also increases the number of open file descriptors and connections that causes other issues).

But you make a good point. Recovering from a failed malloc can be done by simply closing the connection (which will free up resources) and might save the system. A stack overflow is practically unrecoverable and currently causes the esp32 to crash (panic). I am happy to revise the code to add malloc failure detection and recover properly.

@chmorgan I could see that, and increasing the task stack size also did seem to help. The challenge is with larger files (e.g. 1MB that requires a thousand cycles) or when simultaneous files to be sent (typical web application).

I can keep the change in my fork, just thought it did make libesphttpd more robust.

chmorgan commented 5 years ago

@khassounah, I'm not following how larger files affects this situation at all. Consider that there is a single thread servicing all open connections. If a very large file is requested you'll end with a number of calls to cgiEspVfsGet() but no more or less than if that buffer was allocated on the heap vs. the stack. At the same time because we have a single thread you'll only have a single call to cgiEspVfsGet so at most you'll only add that 1000 bytes of stack space when you enter that function and it is popped off when that function returns.

Imo you'll not only have to deal with heap fragmentation and the time it takes to allocate from the heap, but you'll also have to deal with handling an out of memory case. The design of the library is such that given a sufficient stack space and allocating buffers statically that the library doesn't do much (maybe no) dynamic memory allocation. So I'm guessing you'll be faster (fewer cpu cycles and less time) in the case of using the stack vs. heap for large files.

Are you seeing different behavior or am I misinterpreting the code?

khassounah commented 5 years ago

@chmorgan, you make a good point, the size of the file should not affect stack usage. So I did a little bit of digging, and here's what I found: in the httpd task, the stack high water mark reaches 20 bytes fairly consistently, even with 3KB available for the task's stack. A solution could be to increase the stack size, say by 1KB (even beyond what you've done in your example project, which is already my configuration), the risk there is that with 10+ tasks that we're allocating 10+ of additional memory to stacks to meet 1KB need of a single task. I felt that allocating the memory on the heap might be a good compromise, especially given that fragmentation is less likely with the tight allocate-use-free logic (feel free to challenge me on that, it's not fully thought through).

You're the upstream owner, so it's your call at the end of the day. But I am serving a relatively large web application (angularJS) and have not been able to get the esp32 to serve it reliably without moving the allocation to the heap. Whatever your decision, I remain amazed and grateful for the effort you put into this lib.

chmorgan commented 5 years ago

@khassounah why would you increase the stack size of all of your stacks when only the libesphttpd stack needs the additional memory? Imo if you added 2k to your libesphttpd stack you'd be in great shape and could both avoid forking AND avoid any performance loss due to heap allocation.

If there is a good reason to heap allocate it there isn't any resistance to doing so. As I mentioned before I've rewritten almost all of the libesphttpd memory allocation to use static buffers or stack allocation to ensure that libesphttpd won't run out of memory at run-time and that you can determine the memory footprint at compile time. Of course you can heap allocate the connection pool and the stack (if you aren't using spiram).

chmorgan commented 5 years ago

I'm closing this one out because while I can appreciate the issue it doesn't make sense. @khassounah if you'd like to provide more info and reopen that works for me.

khassounah commented 5 years ago

@chmorgan I think this pull request was born out of my inexperience with FreeRTOS and not knowing that I can control stack size on the task level. Closing it is the right decision. Thank you.

chmorgan / libesphttpd

File buffer on heap #69