Open dmaretskyi opened 4 years ago
I think this is a typo:
Those caches other peers.
In such a case the model either needs to be able to consume snapshots of all of its previous versions
I think this is a reasonable approach for now.
Nit: "cache" is a term with a specific meaning implying locality of temporal reference. I don't think it really applies to what we're doing here.
persistent cache
The methods should be optional and the stack must only perform snapshots if they exist on the model class.
I'm not sure this is going to be practical: if any model in the party is not snapshot-capable, then no snapshots can be done, which seems problematic.
I'm also thinking about enforcing snapshot method for all models, but an alternative would be to just store the array of mutations for that specific model and replay them on load
TL;DR: good proposal: let's spec the minimal next step experiment...
BUT FIRST DESIGN THE TEST SO THAT WE KNOW WHAT THE IMPACT IS.
PartyManager.open
should discover existing snapshots and use them before processing feed messages
Why?
Model snapshots will allow us to store the model state in a serializable format and then restore the runtime model from that saved state. This should significantly improve startup performance for large databases, because the stack no longer has to replay every feed message.
Current plan
Snapshotting will be performed by serializing the party state as a whole and saving it into a persistent cache. Those caches can be shared with other peers. Upon app restart the stack will discover the snapshots and create models that already contain the deserialized state. All of the mutations that were included in the snapshot will be skipped during processing.
Model API
Add two additional methods for models:
createSnapshot(): Json
returns a JSON-serializable object with the current model state. This snapshot must include all of the data necessary to fully restore the model state in the future.
restoreFromSnapshot(snapshot: Json)
replaces the current state with the one from the provided snapshot.
The methods should be optional, and the stack must only perform snapshots if they exist on the model class. When a model doesn't support snapshots, the stack will fall back to storing an array of mutations for that model and replaying them on load.
For the first iteration we serialize snapshots as JSON data. In the future we might consider using protobuf encoding.
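A minimal sketch of how the optional methods and the mutation-array fallback could fit together. Everything except `createSnapshot` and `restoreFromSnapshot` is an assumption made for illustration: the `Model` interface, `processMutation`, `SavedModelState`, `saveModel`, `loadModel`, and the two example models are hypothetical names, not the actual stack API.

```typescript
// Hypothetical types for illustration only; not the actual stack API.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface Model {
  processMutation(mutation: Json): void;
  // Both snapshot methods are optional, as proposed above.
  createSnapshot?(): Json;
  restoreFromSnapshot?(snapshot: Json): void;
}

// What gets persisted per model: a real snapshot, or the raw mutation array.
type SavedModelState =
  | { kind: 'snapshot'; snapshot: Json }
  | { kind: 'mutations'; mutations: Json[] };

function saveModel(model: Model, mutationLog: Json[]): SavedModelState {
  if (model.createSnapshot && model.restoreFromSnapshot) {
    return { kind: 'snapshot', snapshot: model.createSnapshot() };
  }
  // Fallback: the model is not snapshot-capable, so keep its mutations.
  return { kind: 'mutations', mutations: [...mutationLog] };
}

function loadModel(model: Model, saved: SavedModelState): void {
  if (saved.kind === 'snapshot') {
    if (!model.restoreFromSnapshot) {
      throw new Error('Model cannot restore from a snapshot');
    }
    model.restoreFromSnapshot(saved.snapshot);
    return;
  }
  // Replay the stored mutations in order.
  for (const mutation of saved.mutations) {
    model.processMutation(mutation);
  }
}

// Example of a snapshot-capable model.
class CounterModel implements Model {
  count = 0;

  processMutation(mutation: Json): void {
    if (typeof mutation === 'number') {
      this.count += mutation;
    }
  }

  createSnapshot(): Json {
    return { count: this.count };
  }

  restoreFromSnapshot(snapshot: Json): void {
    if (
      snapshot !== null &&
      typeof snapshot === 'object' &&
      !Array.isArray(snapshot) &&
      typeof snapshot.count === 'number'
    ) {
      this.count = snapshot.count;
    }
  }
}

// Example of a model without snapshot support.
class LogModel implements Model {
  entries: Json[] = [];

  processMutation(mutation: Json): void {
    this.entries.push(mutation);
  }
}
```

With this shape, a party that contains a model like `LogModel` can still be snapshotted as a whole: only that one model pays the replay cost on load.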
Versioning
Model versioning might be required if the snapshot structure changes between versions. In such a case the model either needs to be able to consume snapshots of all of its previous versions, or there must be a fallback mechanism so the stack can perform a full recalculation of the model state based on feed messages.
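One way the two options could be combined is to tag each snapshot with the model version that produced it. The sketch below is illustrative only: `VersionedSnapshot`, `modelVersion`, `supportedSnapshotVersions`, `restoreOrReplay`, and `TextModel` are hypothetical names, and the replay branch simply processes every feed message passed in.

```typescript
// Hypothetical types for illustration only; not the actual stack API.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface VersionedSnapshot {
  modelVersion: number; // version of the model that wrote the snapshot
  state: Json;
}

interface VersionedModel {
  // Snapshot versions this model knows how to read.
  readonly supportedSnapshotVersions: readonly number[];
  restoreFromSnapshot(snapshot: VersionedSnapshot): void;
  processMutation(mutation: Json): void;
}

// Uses the snapshot when the model understands its version; otherwise falls
// back to a full recalculation from feed messages.
function restoreOrReplay(
  model: VersionedModel,
  snapshot: VersionedSnapshot | undefined,
  feedMessages: Json[]
): 'snapshot' | 'replay' {
  if (
    snapshot !== undefined &&
    model.supportedSnapshotVersions.indexOf(snapshot.modelVersion) !== -1
  ) {
    model.restoreFromSnapshot(snapshot);
    return 'snapshot';
  }
  for (const message of feedMessages) {
    model.processMutation(message);
  }
  return 'replay';
}

// Example model: version 1 stored a bare string, version 2 stores { text }.
class TextModel implements VersionedModel {
  readonly supportedSnapshotVersions = [1, 2];
  text = '';

  processMutation(mutation: Json): void {
    if (typeof mutation === 'string') {
      this.text += mutation;
    }
  }

  createSnapshot(): VersionedSnapshot {
    return { modelVersion: 2, state: { text: this.text } };
  }

  restoreFromSnapshot(snapshot: VersionedSnapshot): void {
    const state = snapshot.state;
    if (snapshot.modelVersion === 1 && typeof state === 'string') {
      this.text = state;
    } else if (
      snapshot.modelVersion === 2 &&
      state !== null &&
      typeof state === 'object' &&
      !Array.isArray(state) &&
      typeof state.text === 'string'
    ) {
      this.text = state.text;
    } else {
      throw new Error('Unsupported snapshot format');
    }
  }
}
```

A model that cannot read an older format simply leaves that version out of its supported list, and the stack takes the replay path.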
TODO
Snapshot timestamping
Each snapshot will be assigned a timeframe-timestamp (mapping of feed key => seq number). That timeframe will signify which feed messages are already included in the snapshot. Upon refresh those mutations will be skipped.
The stack will automatically determine the time intervals to perform snapshots (every 1000 feed messages maybe?).
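The timeframe check and the snapshot interval can be sketched as follows. This is a sketch under stated assumptions: the `Timeframe` and `FeedMessage` shapes, the helper names, and `SnapshotScheduler` are hypothetical, and the sequence number stored per feed is assumed to be the last message included in the snapshot (inclusive).

```typescript
// Hypothetical shapes for illustration only; not the actual stack API.
// Maps feed key => seq number of the last message included in the snapshot.
type Timeframe = Record<string, number>;

interface FeedMessage {
  feedKey: string;
  seq: number;
}

// True when the snapshot already contains this message.
function isIncluded(timeframe: Timeframe, message: FeedMessage): boolean {
  if (!Object.prototype.hasOwnProperty.call(timeframe, message.feedKey)) {
    return false; // feed unknown to the snapshot
  }
  return message.seq <= timeframe[message.feedKey];
}

// Keeps only the messages that still need processing after a restore.
function messagesToProcess(timeframe: Timeframe, messages: FeedMessage[]): FeedMessage[] {
  return messages.filter((message) => !isIncluded(timeframe, message));
}

// Returns a timeframe that also covers the given message.
function advanceTimeframe(timeframe: Timeframe, message: FeedMessage): Timeframe {
  if (isIncluded(timeframe, message)) {
    return timeframe;
  }
  return { ...timeframe, [message.feedKey]: message.seq };
}

// Signals that a snapshot is due after every `interval` processed messages.
class SnapshotScheduler {
  private processedSinceSnapshot = 0;

  constructor(private readonly interval: number = 1000) {}

  onMessageProcessed(): boolean {
    this.processedSinceSnapshot += 1;
    if (this.processedSinceSnapshot >= this.interval) {
      this.processedSinceSnapshot = 0;
      return true;
    }
    return false;
  }
}
```

The default of 1000 mirrors the tentative figure above and is a placeholder, not a measured value.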
Performance considerations
The most notable performance improvement will come from allowing the stack to skip reading feed messages that are already included in the snapshot. This way only the most recent mutations will need to be processed.
Also, when a new peer joins a party, it might receive a full snapshot of the party state from other peers.
We should also consider the time it takes to save the full party state, because message processing has to pause while saving so that the snapshot is not corrupted. With large databases such a pause in processing will be a UX issue.
For that reason we should consider doing incremental snapshots where only the changed objects will be serialized.
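A rough sketch of what incremental snapshots could look like, assuming objects are addressed by a string id and each object is serialized independently. `IncrementalSnapshotter` and its methods are hypothetical names; an in-memory record stands in for the persistent cache.

```typescript
// Hypothetical types for illustration only; not the actual stack API.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

class IncrementalSnapshotter {
  // Live object state, keyed by object id.
  private readonly objects: Record<string, Json> = {};
  // Ids changed since the last save.
  private dirty: Record<string, true> = {};
  // Stand-in for the persistent cache: id => serialized object.
  private readonly saved: Record<string, string> = {};

  // Records a new state for an object and marks it as changed.
  update(id: string, state: Json): void {
    this.objects[id] = state;
    this.dirty[id] = true;
  }

  // Serializes only the changed objects and returns the ids that were written.
  saveIncrement(): string[] {
    const ids = Object.keys(this.dirty);
    for (const id of ids) {
      this.saved[id] = JSON.stringify(this.objects[id]);
    }
    this.dirty = {};
    return ids;
  }

  // Rebuilds the full state from everything saved so far.
  loadAll(): Record<string, Json> {
    const result: Record<string, Json> = {};
    for (const id of Object.keys(this.saved)) {
      result[id] = JSON.parse(this.saved[id]);
    }
    return result;
  }
}
```

Each save then costs time proportional to the number of changed objects rather than to the size of the whole party, which is the property that would shorten the processing pause.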