tweedegolf / sequential-storage

A crate for storing data in flash memory with minimal need for erasing pages
Apache License 2.0

Add an optional cache #5

Closed diondokter closed 8 months ago

diondokter commented 11 months ago

Performance can be poor in some circumstances.

One of the reasons is that every call needs to determine the current state of the flash.

This includes:

This could be sped up quite a bit if some information was kept in a cache in RAM.

avsaase commented 9 months ago

Do you have an API in mind that would support caching? I was thinking of creating a StorageController struct that holds the cache. push, peek, etc. would then be implemented on this struct, taking &mut self. The struct could also hold the mutable reference to the flash and the flash range so they don't need to be passed to every function. Would these changes make sense to you? And would you be interested in a PR? It would be a pretty big change to the API so I wanted to check before putting in too much work.

diondokter commented 9 months ago

Hey, thanks for showing interest! Yeah I'd be open for a PR. But, I am going to be strict about it! So be warned, I won't accept if I don't like it.

As for the design, I really like the freestanding functions right now. A recent addition for peek_many and pop_many kinda goes against the statelessness of the code. Up until then, all state existed in the flash only. But that's still mostly the case and I want to stick to that.

So what does that mean for cache?

  1. I want the cache to be optional.
  2. I want the freestanding functions to remain.
  3. All state at any point must be in flash (or recoverable from flash). This isn't strictly true right now while in an operation, but it is true between operations. This is something I'm working on improving so that future task cancellation or random power-off can be handled safely.

This has some consequences.

So how do we add cache that isn't a burden to people who don't want it (e.g. for ram size reasons)?

We make a trait. The trait can be used to query information from the flash. For example (feel free to think of better names):

trait StateQuery {
    fn youngest_page(&self, flash: ..., flash_range: ..., ...) -> Result<..., ...>;
    fn last_item_of_page(&self, flash: ..., flash_range: ..., page_index: usize) -> Result<(ItemHeader, u32), ...>;
}

This trait would have functions for common/slow high-level queries. An implementation of this trait can be made that doesn't do any caching and just queries the flash, just like the code does right now. An additional implementation can be made on top of the direct-flash implementation that actually tries to cache things. Instead of reading the flash, it intercepts the query and returns some of its cached state if it has any.
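The two implementations described above could be sketched roughly like this. All names, types, and signatures here are illustrative stand-ins (the real trait would also take the flash and range parameters shown earlier), not the crate's actual API:

```rust
/// Simplified stand-in for a page index in the flash region.
type PageIndex = usize;

/// Hypothetical query trait: callers ask the cache first, and fall back
/// to scanning flash when the cache has no answer.
trait StateQuery {
    /// The youngest (most recently written) page, if known without a scan.
    fn youngest_page(&self) -> Option<PageIndex>;
    /// Record a freshly determined youngest page so later calls can skip the scan.
    fn note_youngest_page(&mut self, page: PageIndex);
}

/// Implementation that caches nothing: every query forces a flash scan,
/// matching the crate's current behavior.
struct NoCache;

impl StateQuery for NoCache {
    fn youngest_page(&self) -> Option<PageIndex> {
        None // never has an answer; the caller must scan flash
    }
    fn note_youngest_page(&mut self, _page: PageIndex) {}
}

/// Implementation that intercepts the query and remembers the answer.
struct PageCache {
    youngest: Option<PageIndex>,
}

impl StateQuery for PageCache {
    fn youngest_page(&self) -> Option<PageIndex> {
        self.youngest
    }
    fn note_youngest_page(&mut self, page: PageIndex) {
        self.youngest = Some(page);
    }
}
```

Because both implementations satisfy the same trait, the freestanding functions can stay generic over it and callers choose how much (or whether) to cache.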

How we should inform the cache of a state change is something I don't know. Maybe that should flow from what feels good from the required refactor.

@avsaase does this all make sense a bit?

avsaase commented 9 months ago

Thanks for the detailed description of how you want this to work. I have a few questions:

The requirement to keep the freestanding functions makes it necessary to use some kind of global mutable state for the cache. Do you have a preference for how to handle this in this crate?

What are your thoughts on the expected performance vs RAM usage tradeoff when adding caching? Maybe I'm overlooking something because I don't know that code base in detail but the amount of data that needs to be cached seems minimal whereas the time spent to search for pages with a given state can be substantial. I guess this question boils down to, why should caching be optional? If you want to keep caching optional then that's totally fine of course.

How about putting the optional caching behind a feature flag instead of using traits? This would reduce the amount of code that needs to be changed and reduce compile time. It wouldn't be possible to both use and not use caching in the same code base, but I feel that would be a very niche use case.

How we should inform the cache of a state change is something I don't know. Maybe that should flow from what feels good from the required refactor.

I'm not sure what you mean by this. Do you want the caching system to be resilient to changes to the flash outside of this crate's flash interactions?

I need some time to get familiar with the code base before I can commit to making a PR but I think it would be an interesting challenge.

diondokter commented 9 months ago

The RAM requirements really depend on how much we want to cache. Maybe that should be up to the user to decide.

For example, the best performance could be reached if the entire flash content was replicated in RAM together with a ledger of all item locations. That's not really realistic I would say, but just the ledger of the item locations is. But even that might be too much for some people. Say you've got 32 KB of flash and an average item length of 20 bytes. If we store the address of each item in a u32, that gives us ~6.5 KB of RAM required for the cache. Not everyone has that.
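The arithmetic behind that estimate (using the assumed numbers from the comment: 32 KB of flash, a 20-byte average item, and one u32 address per item) works out like this:

```rust
/// Back-of-the-envelope RAM cost of a full item-location ledger:
/// (flash size / average item length) items, times bytes per cached address.
fn ledger_ram_bytes(flash_bytes: u32, avg_item_len: u32, addr_bytes: u32) -> u32 {
    (flash_bytes / avg_item_len) * addr_bytes
}

fn main() {
    // 32 KB / 20 B ≈ 1638 items, each needing a 4-byte address:
    println!("{} bytes", ledger_ram_bytes(32 * 1024, 20, 4)); // 6552 bytes ≈ 6.5 KB
}
```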

So I like building systems that are extensible, and IMO that can be achieved well using the sort of trait I proposed. This isn't as good if it's behind a feature flag. I guess the compile time will be a little longer, but if the user is only using one type of caching, then there's only one instantiation of all the functions. So that should be fine.

If you rely too much on them, feature flags become very annoying. A trait makes sure all functionality always works; there's no code path that's less well tested. Also, I want to maintain the relative simplicity of the crate. A required cache makes everything much more complex because of all the extra state that then has to be kept, even if it's just a little bit. Later there's gonna be a bug somewhere and I will want to be able to get to the bottom of it as fast as possible. If the cache is disabled and the bug is still there, then I know the bug stems from the state on the flash itself or some logic around it. If the bug is only there when the cache is enabled, then I'll know I have to look at the cache.

TLDR

I want to use the type system for extensibility and maintainability. I see a future where I want to provide multiple amounts of caching, including none.

avsaase commented 9 months ago

I added this to my previous post while you were typing your reply so you probably missed it.

The requirement to keep the freestanding functions makes it necessary to use some kind of global mutable state for the cache. Do you have a preference for how to handle this in this crate?

diondokter commented 9 months ago

Ah thanks, I did miss that yes.

The user keeps the cache. We do not keep it for them. This means that, just like the flash, the user gives a reference to the cache in the function call.

So we keep no global state. That wouldn't work anyways, because the crate might be used for multiple regions of flash or even in different flash chips at the same time.
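In other words, the cache travels with the call, just like the flash and its range do. A minimal sketch of that shape, with hypothetical `Flash`, `Cache`, and `peek` stand-ins (not the crate's real types), might look like this:

```rust
use core::ops::Range;

/// Hypothetical stand-ins for the crate's flash driver and cache state.
struct Flash;
struct Cache {
    youngest_page: Option<usize>,
}

/// The caller passes their own cache by reference on every call,
/// mirroring how the flash itself is already passed in.
fn peek(_flash: &mut Flash, _range: Range<u32>, cache: &mut Cache) -> Option<usize> {
    // A real implementation would consult the cache before scanning flash.
    cache.youngest_page
}

fn main() {
    let mut flash = Flash;
    // Two regions on the same chip, each with its own independent cache.
    // Nothing is global, so different regions (or chips) can't collide.
    let mut cache_a = Cache { youngest_page: None };
    let mut cache_b = Cache { youngest_page: Some(1) };
    let _ = peek(&mut flash, 0x0000..0x1000, &mut cache_a);
    let _ = peek(&mut flash, 0x1000..0x2000, &mut cache_b);
}
```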

chrysn commented 9 months ago

The global state may not necessarily be global-as-in-linking global, but something the application keeps (per region, if it is using some kind of partitioning). The underlying question to me is how to deal with cache invalidation (yeah, one of those) in the face of having the freestanding functions.

(How) Can we forbid using the freestanding data structures on flash on which a cache is used? Do we need to do this in the type system? (We probably can't, because right now the freestanding functions take a NorFlash and Range, and nothing keeps anyone from mixing those with the wrong kind of storage.) Can and should we state that if there is any cache, it needs to be passed in to all of the functions that interact with that flash region, under the same penalties (arbitrary data or even panics?) you get for mixing a map with a queue in the same flash?

Doing some enforcing on the API level may be a good idea, but is likely outside of this issue's scope. (Maybe even outside of this crate's scope, as it'd require a split_at_mut style partitioning from embedded_storage).

It's certainly doable for a cache to always perform some kind of cache validation (probably checking the checksum of the oldest page, and whether anything has been written on the partially open one), but that's a trade-off I'd rather not make (and instead demand that the user be consistent).

diondokter commented 9 months ago

@chrysn Good questions!

I don't think we can enforce it in the type system. Mainly for these reasons:

So far the current way has been fine IMO. It's pretty clear you shouldn't mix different datastructures in the same location. But enforcing that cache usage must be consistent is less clear.

In principle the user should be able to go from cache -> no cache and from no cache -> new empty cache. What we cannot realistically support is cache -> new/no/different cache -> old cache. I don't think we can stop that at the API level. Having the freestanding functions does make it easier to do wrong, I do agree with that, but only a little bit.

Reflecting on this problem, I think the best course of action is to simply document it well and require the user to always specify their cache. That means no separate peek and cached_peek. Just peek, where you can pass in a NoCache instance. By forcing the user to pass a cache argument every time, you force them to think about it.
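That "one entry point, cache always passed" shape could look roughly like the following. The trait and function names here are illustrative guesses, not the crate's real signatures:

```rust
/// Hypothetical cache trait that every call site must provide.
trait CacheImpl {
    fn youngest_page(&self) -> Option<usize>;
}

/// Zero-sized type for callers who opt out of caching entirely.
struct NoCache;

impl CacheImpl for NoCache {
    fn youngest_page(&self) -> Option<usize> {
        None // always forces the fallback path, i.e. a flash scan
    }
}

/// A single `peek` for everyone; there is no separate `cached_peek`.
/// Taking the cache as a required parameter makes the choice explicit.
fn peek<C: CacheImpl>(cache: &mut C) -> usize {
    // Fall back to a (pretend) flash scan when the cache has no answer.
    cache.youngest_page().unwrap_or(0)
}

fn main() {
    // Opting out still compiles and costs nothing at runtime:
    let scanned = peek(&mut NoCache);
    println!("{scanned}");
}
```

Since `NoCache` is zero-sized and the function is generic, callers who don't want caching pay no RAM or runtime cost, while cache-backed implementations of the same trait slot into the identical call shape.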