hickey / meshchat

MeshChat for AREDN (in Lua)
GNU General Public License v3.0

[Notes] Message synchronization improvements #17

Open hickey opened 11 months ago

hickey commented 11 months ago

As shown by #16, the synchronization of the message DB needs improvement. Most specifically, there needs to be a way to delete/purge a message from all nodes in a zone.

Another improvement would be to transfer only the messages that have been added to the remote message DB, instead of transferring the entire message DB on every synchronization.

gerner commented 8 months ago

What's the use case for deleting a message? I mean what's the motivation for wanting to spend time on it?

Maybe you've already thought of it, but a common technique in distributed systems for something like that is to have a tombstone record that references the original message. A node that sees the tombstone should ignore the now-dead message (and perhaps zero out its content, if possible). This solves the problem of nodes re-syncing the dead message and it popping up again.
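The tombstone idea above can be sketched roughly as follows. This is a minimal illustration in Python rather than MeshChat's Lua, and the `Record` shape, the `deletes` field, and the `merge` helper are all hypothetical names, not anything in the current code base.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Record:
    id: str                        # hypothetical unique message ID
    text: str = ""
    deletes: Optional[str] = None  # set only on a tombstone: ID of the dead message


def merge(db: Dict[str, Record], incoming: Record) -> None:
    """Merge one synced record into the local DB, honoring tombstones."""
    dead = {r.deletes for r in db.values() if r.deletes}
    if incoming.id in dead:
        return  # a tombstone already covers this message; do not resurrect it
    db[incoming.id] = incoming
    if incoming.deletes:
        # Drop the dead message but keep the tombstone itself, so that
        # later syncs from stale nodes are still rejected.
        db.pop(incoming.deletes, None)


def visible(db: Dict[str, Record]) -> List[str]:
    """IDs of the messages a user should still see."""
    return sorted(r.id for r in db.values() if r.deletes is None)
```

The point of keeping the tombstone in the DB is that a node which still holds the deleted message can offer it again at any time, and the tombstone is what lets the receiver refuse it. How long tombstones need to be retained is a separate question that this sketch does not address.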

In terms of partial syncs, I was thinking we could have a time range for syncing messages. So when a node first comes up, it asks other nodes for the most recent messages only, and then iteratively asks for older and older messages until it gets everything. Maybe it's a request like, "give me the 32 most recent messages older than time x1", and then it asks for the next 32 messages using the timestamp of its oldest message as the cutoff, and so on.
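A rough sketch of that backwards paging loop, in Python for illustration only. The `(timestamp, text)` message shape and the `serve_page` / `backfill` names are assumptions; the page size of 32 is taken from the example request above.

```python
from typing import Callable, List, Optional, Tuple

Message = Tuple[float, str]  # (timestamp, text); hypothetical shape

PAGE_SIZE = 32


def serve_page(messages: List[Message], older_than: Optional[float],
               limit: int = PAGE_SIZE) -> List[Message]:
    """Remote side: the `limit` most recent messages older than `older_than`."""
    pool = [m for m in messages if older_than is None or m[0] < older_than]
    return sorted(pool, reverse=True)[:limit]


def backfill(fetch: Callable[[Optional[float]], List[Message]]) -> List[Message]:
    """Local side: walk backwards one page at a time until nothing is left."""
    got: List[Message] = []
    cutoff: Optional[float] = None  # None means "start from the newest"
    while True:
        page = fetch(cutoff)
        if not page:
            return got
        got.extend(page)
        # The oldest timestamp received becomes the next cutoff.
        cutoff = min(ts for ts, _ in page)
```

One caveat with a strict "older than" cutoff: if several messages share the exact timestamp at a page boundary, some of them would be skipped. A tie-breaker such as a unique message ID alongside the timestamp would avoid that.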

When asking for the latest messages, it can do something similar: if the oldest "new" message it receives is newer than its previous newest message, there might be a gap, and it can ask for the missing messages in between.
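The gap check for an already-synced node might look something like this. It is a Python sketch under assumptions: messages are reduced to bare timestamps, and `fetch(cutoff, page_size)` is a hypothetical call that returns the `page_size` most recent timestamps older than `cutoff` (or the newest ones when `cutoff` is `None`).

```python
from typing import Callable, List, Optional


def catch_up(local_newest: float,
             fetch: Callable[[Optional[float], int], List[float]],
             page_size: int = 32) -> List[float]:
    """Fetch pages newest-first until one reaches back to what we already hold."""
    new: List[float] = []
    cutoff: Optional[float] = None
    while True:
        page = fetch(cutoff, page_size)
        fresh = [ts for ts in page if ts > local_newest]
        new.extend(fresh)
        # Stop when the page overlaps what we already have, or when the
        # remote ran out of messages (a short or empty page).
        if len(fresh) < len(page) or len(page) < page_size:
            return sorted(new)
        # A full page of entirely new messages: there may be a gap between
        # its oldest entry and local_newest, so keep asking further back.
        cutoff = min(page)
```

A full page in which every message is newer than the local newest one is exactly the "there might be a gap" case, so the loop keeps paging backwards from the oldest message received until a page overlaps the local DB.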

I've built a few systems that scrape content from APIs that use a similar strategy to get everything in an idempotent, incremental fashion with some success.

Those are my thoughts. I know I haven't been too involved to date, but I'd like to get more involved. Especially if you've solved the dev problem of running on linux outside of a wifi router :)

hickey commented 8 months ago

My preliminary thoughts are that each message needs to have an ID that is unique across the entire message DB. Whether that is a "timestamp" or serial number + hostname or a UUID, I am not sure (there is some sort of simple message ID currently that I need to look at to see if it is sufficient). This way a list of message IDs can be easily transmitted and compared, and the missing messages can then be transferred (either singly or batched up). This would also make deleting a message achievable.
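As a minimal sketch of the ID-list comparison, in Python for illustration: a node compares its own IDs with a remote node's list and requests only the missing ones, optionally in batches. The function names are hypothetical, and a random UUID is used only as one of the candidate ID schemes mentioned above, not as a decision.

```python
import uuid
from typing import Iterable, List


def new_message_id() -> str:
    """One candidate scheme: a random UUID. Other schemes would work too."""
    return str(uuid.uuid4())


def missing_ids(local_ids: Iterable[str], remote_ids: Iterable[str]) -> List[str]:
    """IDs the remote node holds that the local DB lacks, in a stable order."""
    return sorted(set(remote_ids) - set(local_ids))


def batches(ids: List[str], size: int) -> List[List[str]]:
    """Split the list of wanted IDs into request-sized batches."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]
```

For example, `missing_ids(["a", "b"], ["b", "c", "d"])` yields `["c", "d"]`, which could then be requested one at a time or through `batches(...)`. Only the ID list crosses the network up front, not the message bodies.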

This sort of change will end up being a major version bump. Depending on how this is implemented, I am hoping that compatibility with v2.x will be maintained for syncing of messages. This way not all the servers need to be upgraded at the same time.

There is an older sync standard called SyncML that I want to revisit and see if there are any advantages to implementing an existing protocol.