orium / rpds

Rust persistent data structures
Mozilla Public License 2.0

How can a data structure be persisted to a file efficiently? #92

Closed. tqwewe closed this issue 4 months ago

tqwewe commented 4 months ago

Thanks for this amazing library.

When I first read "Persistent Data Structures", I assumed it meant the structures could be backed by a file, so that the data persists when the app restarts. Based on the Wikipedia article linked in the docs, however, the term seems to have a different meaning.

How could I use something like a HashTrieMap which is backed by a file for persistence?

In my particular use case, I have segment files using a crate called commitlog, and I'd like to store some index data alongside each segment. I was thinking a HashTrieMap serialized to a file alongside each segment might work well. However, serializing or deserializing the entire index file in one go would take quite a while, especially for larger amounts of data. Should I be implementing this in a different way? A simple persistent HashMap seemed perfect to me, but I'm not sure if there's a better way. Thanks.
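
For readers who hit the same confusion: in the Wikipedia sense, "persistent" means that an update returns a new version of the structure while older versions stay valid, usually by sharing unchanged parts between versions. It says nothing about files. A minimal sketch of that idea, using only the standard library, might look like the following. The `Stack` and `Node` types are hypothetical illustrations and are not how rpds implements its types.

```rust
use std::rc::Rc;

/// Hypothetical minimal persistent stack (illustration only, not rpds code).
enum Node<T> {
    Nil,
    Cons(T, Rc<Node<T>>),
}

struct Stack<T> {
    head: Rc<Node<T>>,
}

impl<T> Stack<T> {
    fn new() -> Self {
        Stack { head: Rc::new(Node::Nil) }
    }

    /// Returns a new stack with `value` on top. `self` is left untouched,
    /// and the new stack shares all of the old stack's nodes.
    fn push(&self, value: T) -> Self {
        Stack { head: Rc::new(Node::Cons(value, Rc::clone(&self.head))) }
    }

    /// Returns the top element, if any.
    fn peek(&self) -> Option<&T> {
        match *self.head {
            Node::Nil => None,
            Node::Cons(ref v, _) => Some(v),
        }
    }

    /// Counts the elements by walking the shared nodes.
    fn len(&self) -> usize {
        let mut n = 0;
        let mut cur = &self.head;
        while let Node::Cons(_, ref next) = **cur {
            n += 1;
            cur = next;
        }
        n
    }
}
```

Pushing onto an existing stack yields a second version while the first remains usable, which is the sense in which the structure is persistent. Both versions live only in memory.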

orium commented 4 months ago

Hi. If you need to frequently store the state of the data structure on disk, rpds is not the crate to use. rpds supports serialization with serde, but that requires serializing the entire data structure every time you store it.

What you want for that is an on-disk data structure or an on-disk database such as LevelDB or SQLite.
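
As a rough illustration of what a small on-disk data structure could look like for a per-segment index, here is a minimal sketch using only the Rust standard library. Everything in it is an assumption made for illustration: the `SegmentIndex` name, the 16-byte record layout (an 8-byte key followed by an 8-byte position), and the requirement that keys are appended in increasing order, as log offsets typically are. It is not part of commitlog or rpds, and it ignores crash recovery and partially written records.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::Path;

/// Size in bytes of one record: 8-byte key + 8-byte position, little-endian.
const RECORD_LEN: u64 = 16;

/// Hypothetical append-only index file stored next to a segment file.
struct SegmentIndex {
    file: File,
    len: u64, // number of complete records in the file
}

impl SegmentIndex {
    /// Opens (or creates) the index file without reading its contents.
    fn open(path: &Path) -> io::Result<Self> {
        let file = OpenOptions::new()
            .read(true)
            .append(true)
            .create(true)
            .open(path)?;
        let len = file.metadata()?.len() / RECORD_LEN;
        Ok(SegmentIndex { file, len })
    }

    /// Appends one record. Assumes `key` is larger than every earlier key.
    /// Only 16 bytes are written, regardless of how large the index is.
    fn append(&mut self, key: u64, pos: u64) -> io::Result<()> {
        let mut buf = [0u8; 16];
        buf[..8].copy_from_slice(&key.to_le_bytes());
        buf[8..].copy_from_slice(&pos.to_le_bytes());
        self.file.write_all(&buf)?;
        self.len += 1;
        Ok(())
    }

    /// Reads the record at index `i` by seeking directly to it.
    fn read_record(&mut self, i: u64) -> io::Result<(u64, u64)> {
        let mut buf = [0u8; 16];
        self.file.seek(SeekFrom::Start(i * RECORD_LEN))?;
        self.file.read_exact(&mut buf)?;
        let mut k = [0u8; 8];
        let mut p = [0u8; 8];
        k.copy_from_slice(&buf[..8]);
        p.copy_from_slice(&buf[8..]);
        Ok((u64::from_le_bytes(k), u64::from_le_bytes(p)))
    }

    /// Binary search over the sorted records; reads O(log n) records.
    fn get(&mut self, key: u64) -> io::Result<Option<u64>> {
        let (mut lo, mut hi) = (0u64, self.len);
        while lo < hi {
            let mid = lo + (hi - lo) / 2;
            let (k, pos) = self.read_record(mid)?;
            if k == key {
                return Ok(Some(pos));
            }
            if k < key {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        Ok(None)
    }
}
```

The point of the sketch is that each update writes a fixed number of bytes and each lookup touches only a few records, so nothing is serialized or deserialized in one go. If keys are not monotonic, or if updates and deletes are needed, an embedded database such as the ones mentioned above is likely the better fit.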