arximboldi / immer

Postmodern immutable and persistent data structures for C++ — value semantics at scale
https://sinusoid.es/immer
Boost Software License 1.0
2.49k stars, 179 forks

Serialization #44

Closed stefan-pdx closed 3 years ago

stefan-pdx commented 6 years ago

Hi @arximboldi!

Thanks for authoring this library! I was curious if you've thought of any approaches or patterns for serialization. I have a use case where I would love to build an immutable data structure that is serializable via a library like Cap'n Proto (👋 @kentonv) so that I can access larger datasets via memory-mapped files. Do you have any immediate thoughts or ideas on how immer could serialize its data efficiently?

Cheers!

arximboldi commented 6 years ago

I have never used Cap'n Proto, but I guess that in order to be able to store the data-structures in memory-mapped files, a special allocator strategy would be needed. However, I am not sure if the way pointers are stored in the data-structure suffices. To be sure, I'd need to fully implement the std::allocator API, something that is already suggested in other issues (#14) and is discussed in the questions of the CppCon talk (https://www.youtube.com/watch?v=sPhpelUfu8Q).

I think another interesting thing would be an API that allows serializing a set of objects while maintaining the structural sharing between them. This would allow one to implement persistent undo using the trivial undo (just a collection of state objects) that I advocate in the aforementioned CppCon talk.

Are there any other ideas that you may have in mind?

darkforestzero commented 5 years ago

Hi, my company is considering using your library for storing game state, but we'd need to be able to serialize while maintaining the undo/redo stack. We are curious if you have thought about possible implementations and if you have any suggestions. Thanks

arximboldi commented 5 years ago

Hi @darkforestzero!

I find it interesting that you are using it for games! Depending on the type of game this might or might not be the best approach. For highly dynamic games (like FPS), immutable data-structures provide limited benefits because there is little structural sharing between consecutive states. However, it can work great for turn-based games, table games, and the like, where changes are discrete and localized.

This being said, if you wanna persist an infinite undo history, there are two options.

The first one, the hardest one, would be to persist a full set of values in a single file whose layout mirrors the sharing inside the data-structure, so that nodes shared between values are stored only once. This would require modifying immer itself.
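To illustrate what "a file matching the internal sharing" could mean, here is a minimal sketch on a toy persistent list, not on immer's actual node types. The names `Node`, `Archive`, `save` and `load` are all hypothetical. Each distinct node is written once and links become indices, so two versions that share a tail still share it after loading.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy persistent list (NOT immer's internals): versions share tails.
struct Node {
    int value;
    std::shared_ptr<const Node> next;
};
using List = std::shared_ptr<const Node>;

List cons(int value, List tail) {
    return std::make_shared<Node>(Node{value, std::move(tail)});
}

// Pointer-free form: every distinct node is stored once, and links
// are indices into `nodes` (-1 marks the end of a list).
struct Record { int value; long next; };
struct Archive {
    std::vector<Record> nodes;
    std::vector<long> roots; // one entry per saved version
};

Archive save(const std::vector<List>& versions) {
    Archive out;
    std::unordered_map<const Node*, long> ids; // node address -> index
    for (const List& head : versions) {
        // Walk until we hit a node that was already written.
        std::vector<const Node*> fresh;
        const Node* n = head.get();
        for (; n && !ids.count(n); n = n->next.get()) fresh.push_back(n);
        long next = n ? ids.at(n) : -1;
        // Emit tail-first so each link points at an existing record.
        for (auto it = fresh.rbegin(); it != fresh.rend(); ++it) {
            out.nodes.push_back({(*it)->value, next});
            next = static_cast<long>(out.nodes.size()) - 1;
            ids.emplace(*it, next);
        }
        out.roots.push_back(next);
    }
    return out;
}

std::vector<List> load(const Archive& in) {
    std::vector<List> built;
    built.reserve(in.nodes.size());
    for (const Record& r : in.nodes) {
        // Records only ever refer to earlier records.
        assert(r.next < static_cast<long>(built.size()));
        List tail = r.next < 0 ? List{} : built[static_cast<std::size_t>(r.next)];
        built.push_back(cons(r.value, std::move(tail)));
    }
    std::vector<List> versions;
    for (long root : in.roots)
        versions.push_back(root < 0 ? List{} : built[static_cast<std::size_t>(root)]);
    return versions;
}
```

Writing `Archive` to disk is then a matter of dumping two flat arrays. Doing the same for immer's tries would need access to their internal nodes, which is why this option means changing the library.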

The second one is simpler, and might actually have other benefits, but it depends on how you architect your software. If you use a redux-like architecture (see my talks on the topic [1][2]) you have an action value type and a pure model update(model, action) function, right? Since the function is pure, you can quickly re-evaluate it to get an updated state. So instead of storing every single state, you store just a few states from your undo history (you serialize them through the external API the way you normally would), and in between snapshots you serialize every action that happened. (Normally actions are small.) When you want to go back to a state, you check whether you have a snapshot for it. If you don't, you read the closest earlier snapshot and regenerate the in-between states by applying the actions. This is a similar principle to the "event sourcing" architecture.
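A minimal sketch of that second option, assuming a trivial counter model. `Model`, `Action`, `History` and the `snapshot_every` parameter are hypothetical names made up for the illustration, and actual file I/O is left out: only `actions_` and `snapshots_` would need to be written to disk.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical model and action types for illustration.
struct Model { int counter = 0; };
struct Action { int delta; };

// Pure update function: same inputs always give the same result.
Model update(Model m, Action a) {
    m.counter += a.delta;
    return m;
}

class History {
public:
    explicit History(Model initial, std::size_t snapshot_every = 4)
        : every_(snapshot_every), current_(initial) {
        assert(every_ > 0);
        snapshots_.emplace(0, initial); // state before any action
    }

    void dispatch(Action a) {
        current_ = update(current_, a);
        actions_.push_back(a); // every action is recorded
        if (actions_.size() % every_ == 0)
            snapshots_.emplace(actions_.size(), current_); // occasional snapshot
    }

    // Rebuild the state after the first `n` actions.
    Model state_at(std::size_t n) const {
        assert(n <= actions_.size());
        auto it = snapshots_.upper_bound(n); // first snapshot strictly after n
        --it;                                // closest snapshot at or before n
        Model m = it->second;
        for (std::size_t i = it->first; i < n; ++i)
            m = update(m, actions_[i]);      // replay the in-between actions
        return m;
    }

    std::size_t snapshot_count() const { return snapshots_.size(); }

private:
    std::size_t every_;
    Model current_;
    std::vector<Action> actions_;
    std::map<std::size_t, Model> snapshots_; // keyed by number of actions applied
};
```

The snapshot interval trades file size against replay cost: a larger interval stores fewer full states but replays more actions per lookup.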

Is this clear, does this help?

[1] https://www.youtube.com/watch?v=y_m0ce1rzRI [2] https://www.youtube.com/watch?v=_oBx_NbLghY

ruler501 commented 4 years ago

Sorry to bring up an old issue, but a way to serialize to and parse from JSON with JSON Pointers (RFC 6901) would be really helpful. I'm looking at building a document format for a few applications on top of something like persistent data structures, to ensure versioning and undo history are persisted. Ideally I'd like to have an optional human-readable header with the current version (a straight JSON representation of the "mutable view" of the object) while keeping the ability to version and potentially handle things like branches down the line. If there is a way to do something similar currently, or if it could be easily added, I'd greatly appreciate it. Otherwise, if I could get some pointers on how to start integrating something like that, I should be able to put some time into it.

arximboldi commented 4 years ago

@ruler501 that is indeed an interesting feature, but there are lots of questions from the API point of view. If this is for a commercial project I'd be happy to offer some consulting!

stefan-pdx commented 3 years ago

Hello!

It's been a while since I originally opened this issue. Looking back, I've realized that my understanding of the problem has evolved quite a bit, along with the C++ language. Now that the standard library has std::pmr::monotonic_buffer_resource (added in C++17, though not implemented in Clang's libc++ yet), I think the original problem of serialization can be solved by leveraging Immer's policy-based design to allocate memory inside of a monotonic_buffer_resource. This would solve my original use-case.
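A rough sketch of that idea, using only the standard library. `arena_heap` is a hypothetical type, and its static `allocate`/`deallocate` shape is an assumption about what immer's heap policies expect. Wiring it into an immer `memory_policy` is not shown and has not been verified here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>

// Hypothetical heap that forwards to a monotonic_buffer_resource.
// Assumption: immer heaps expose static allocate/deallocate like this.
struct arena_heap {
    // The arena must be installed before any allocation happens.
    static std::pmr::monotonic_buffer_resource*& resource() {
        static std::pmr::monotonic_buffer_resource* r = nullptr;
        return r;
    }

    template <typename... Tags>
    static void* allocate(std::size_t size, Tags...) {
        assert(resource() != nullptr);
        return resource()->allocate(size, alignof(std::max_align_t));
    }

    static void deallocate(std::size_t size, void* p) {
        // A monotonic resource ignores this; memory is released
        // only when the resource itself is released or destroyed.
        resource()->deallocate(p, size, alignof(std::max_align_t));
    }
};
```

With the arena pointing at a fixed buffer, every node ends up in one contiguous region. The nodes would still hold absolute addresses, though, so the buffer is only valid when mapped back at the same address, which is the pointer concern raised earlier in this thread.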

I'm happy to close this issue to keep the project's issue count low. :)

Thanks for this great software library, the engaging CppCon talk, and thoughtful paper!