`store_item` can result in excessive stack allocations

tweedegolf / sequential-storage

A crate for storing data in flash memory with minimal need for erasing pages

Apache License 2.0

97 stars 9 forks source link

`store_item` can result in excessive stack allocations #4

Closed ryan-summers closed 9 months ago

ryan-summers commented 10 months ago

Because of the algorithm used for buffering a to-be-erased block, this means that an entire flash block is buffered on the stack. For the STM32H7 series, each flash block is 128KB (8 sectors total, 1MB flash), which is a prohibitively large stack allocation.

Some possible work-arounds:

Accept a user-provided buffer for storage instead, allowing the buffer to either be statically allocated (i.e. from some memory location not on the stack)
Modify the algorithm used to accept an N-sized buffer, where we scan the to-be-erased block for elements that only exist in it, and then store them in the buffer. That way, we likely do not need to store the full block. If the user-provided buffer is insufficient, return an error.

Thoughts here @diondokter ?

diondokter commented 10 months ago

1 is pretty easy to do. And yeah 128kb is excessive! It was meant for chips with like 4k blocks. Still doesn't really solve the problem that it uses a lot of memory, it only moves it off the stack.

2 would be nicer, but there's a concern. There's a reason the algo is recursive. The erased page could be full of unique items that take up the full page. So if you erase it, and then add everything back again, you're in the same situation as before except that it's now no longer the oldest page, but the newest. But you're still left with wanting to add the new item and so you have to continue with the next page. (which can have that problem again).

2 could work, but it'd be unclear how much smaller than a page you could make it. That would really depend on how many different keys there are (and how big their corresponding values are) and some randomness.

diondokter commented 10 months ago

I guess there's an option 3 as well. And additional page could be left open at all times as well. That way you'd just transfer items to the new page directly one by one instead of buffering it all first...

ryan-summers commented 10 months ago

I guess there's an option 3 as well. And additional page could be left open at all times as well. That way you'd just transfer items to the new page directly one by one instead of buffering it all first...

This is an interesting one as well. I.e. use a spare flash block as the working space when erasing the oldest buffer - I like this idea personally, as I think it's the simplest and creates a non-breaking API change.

diondokter commented 10 months ago

3 does effectively require a minimum page count of 3 and loses you some storage space. I'd consider that a breaking change. But it'd be easy to upgrade to.

ryan-summers commented 10 months ago

Having 3 flash pages would already imply that your flash pages are massive, which means that the existing architecture likely wouldn't support it anyways (since it wouldn't fit on the stack).

diondokter commented 10 months ago

That's true yeah

diondokter commented 9 months ago

I implemented option 3 in the linked PR. Now released as 0.5.0

ryan-summers commented 9 months ago

Wonderful, thanks! I'll give this a try now, as I think that was the major blocker for me. I tried rearranging our memory usage, but it was causing perfomance issues in other areas because the new stack ram was slower.