ssbc / epidemic-broadcast-trees

bandwidth efficient broadcast gossip
MIT License

Is there a way to reset a remote clock #34

Closed arj03 closed 3 years ago

arj03 commented 4 years ago

In the case where one nukes the local db and needs to resync.

cryptix commented 3 years ago

+1. Another reason would be unblocking a feed, and it would be nice not to break the protocol too much.

Maybe we could just pick a special integer value? -1 already means "don't replicate", so there is precedent.

arj03 commented 3 years ago

I think the way to go about this would be, as you say, to just define a value, write tests, and fix the implementation for that.

mtalexan commented 3 years ago

@arj03

I found another paper on the same topic to be more explanatory with respect to the vector clock (2 sections from the same paper https://github.com/dominictarr/scalable-secure-scuttlebutt/blob/master/paper.md#append-only-gossip-scuttlebutt and https://github.com/dominictarr/scalable-secure-scuttlebutt/blob/master/paper.md#append-only-gossip-with-request-skipping).

The purpose of the vector clock is to advertise what you already have, and then get a response from the remote about what it has from the list of IDs you advertised. If you cache what they responded with, and they cache what you advertised, both sides can assume the list of IDs is unchanged if parts of the list are later left out of future advertisements.

So if you lose your database, you would no longer have a cached state for the remote. When you look at your current internal vector clock (your absolute state), you would have a list of IDs that all list sequence number 0, and no cached state for the remote. Diffing against the empty remote cache would result in you sending your entire list of IDs in the advertised vector clock.

When the remote receives your list of IDs set to sequence number 0, it updates its cached state about you: it filters the list of requested IDs down to only those IDs it also tracks, then updates the sequence numbers in its cached state of you for those IDs. Notably, if you somehow dropped an ID during the database reset, the remote will believe your current state still includes your last reported state from before the database reset for that ID [1].

The remote responds to your request by replying with: the IGNORED value for all the IDs you requested that it ignores (and therefore knows nothing about), the current state it has for all IDs in your request, and the current state of all IDs in its cached state of you that weren't in your request but that it has newer info about.

When you receive this state info from the remote, you now know what not to ask it about again (IGNORED), and what it has newer/any info about. Both sides can then follow up with requests for the messages they want in order to update their internal state to the latest of the IDs they both care about. For you it would be all the messages for all the IDs you requested that the remote has any info about, plus any IDs you didn't request but have previously requested that were updated since your pre-database-wipe request to the same remote.
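
The exchange described above can be sketched roughly as follows. This is a minimal Python model, not code from epidemic-broadcast-trees or ssb-ebt: the `Peer` class, the `IGNORED` sentinel string, and the feed names are all made up for illustration, and a feed counts as tracked simply when it has an entry in the peer's clock.

```python
IGNORED = "IGNORED"  # hypothetical sentinel; the real wire value is not given in this thread


class Peer:
    def __init__(self, clock):
        self.clock = dict(clock)  # feed id -> latest sequence this peer holds
        self.remote_cache = {}    # peer name -> {feed id: last sequence that peer advertised}

    def handle_advertisement(self, sender, advertised):
        """Update the cached state for `sender` and build the reply clock."""
        cache = self.remote_cache.setdefault(sender, {})
        reply = {}
        for feed, seq in advertised.items():
            if feed not in self.clock:
                reply[feed] = IGNORED  # not tracked here: tell the sender not to ask again
                continue
            cache[feed] = seq  # remember what the sender says it has
            reply[feed] = self.clock[feed]
        # Feeds left out of the advertisement are assumed unchanged, so only
        # mention them when this peer holds something newer than the cached value.
        for feed, cached_seq in cache.items():
            if feed not in advertised and self.clock.get(feed, 0) > cached_seq:
                reply[feed] = self.clock[feed]
        return reply


# The remote still caches "you" from before the wipe.
remote = Peer({"alice": 10, "bob": 5, "carol": 8})
remote.remote_cache["you"] = {"alice": 10, "bob": 5, "carol": 7}

# After the wipe you advertise sequence 0 for everything you still know about.
# "carol" was lost in the reset; "dave" is a feed the remote does not track.
reply = remote.handle_advertisement("you", {"alice": 0, "bob": 0, "dave": 0})
```

Here `reply` carries the remote's state for alice and bob, `IGNORED` for dave, and carol only because the remote holds something newer than the cached 7. The cached entry for carol stays at 7, which is the stale-state situation that footnote [1] points out.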

Am I misunderstanding your question here, or do you think I'm misunderstanding the protocol?

[1] The vector clock mechanism is not how you recover your list of tracked IDs; those are rebuilt using the history of your own messages, which include follow and ignore/block events for everyone you ever followed or ignored. Normally you therefore wouldn't have a case where you've lost some IDs from your vector clock, except for the corner case where you're attempting to rebuild your own message history from your peers.

mtalexan commented 3 years ago

@cryptix You make a good point about changing to ignore an ID you previously followed. Because the IGNORED value is only allowed in the vector clock response, it seems like there's no way to avoid being told about the state of IDs you once requested but now don't care about. There's no way to indicate a negative cache request for an ID in the advertisement or a cache flush in the request. So if you ever asked about an ID in your advertisement to a remote, it's in the remote's cache of your state, and subsequent failures to include it are assumed to be for efficiency, not a negative cache request.
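
To make the ambiguity concrete, here is a small Python sketch of request skipping. The function names are hypothetical and this is not how any client is known to implement it; the second function only illustrates the idea raised earlier in the thread of reusing -1 ("don't replicate") as an explicit signal, assuming the remote would accept it in an advertisement.

```python
DONT_REPLICATE = -1  # value mentioned in the thread as meaning "don't replicate"


def skipping_advertisement(local_clock, last_sent):
    """Advertise only feeds whose sequence differs from what was sent before."""
    return {f: s for f, s in local_clock.items() if last_sent.get(f) != s}


def explicit_advertisement(local_clock, last_sent):
    """Same as above, but flag feeds that were advertised before and are now dropped."""
    advert = skipping_advertisement(local_clock, last_sent)
    for feed in last_sent:
        if feed not in local_clock:
            advert[feed] = DONT_REPLICATE
    return advert


last_sent = {"alice": 3, "bob": 9}
local_clock = {"alice": 3}  # bob has since been blocked and removed locally

silent = skipping_advertisement(local_clock, last_sent)    # {}: looks like "nothing changed"
explicit = explicit_advertisement(local_clock, last_sent)  # {"bob": -1}
```

With plain skipping, dropping bob produces an empty advertisement, which the remote cannot tell apart from "no updates". An explicit value removes that ambiguity at the cost of one extra entry.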

Caveat: I haven't examined the code for any clients to know if there's some deviation from the documentation, or if there's some other message/message flag to request a full cache replacement.

arj03 commented 3 years ago

@mtalexan thanks for digging up that paper. Yes, it probably should work that way, but I don't think the current implementation does :) We need to add a test for that first of all.

arj03 commented 3 years ago

This now works. The idea is that you make sure you request your own feed, then the remote peer will send you all your messages and from there you can request the other feeds. It also works in the case where you start from a backup instead of from an empty database.
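
As a rough illustration of that order of operations, the sketch below rebuilds the list of feeds to request from your own messages once the remote has sent them back. It is Python rather than the project's JavaScript, the function name is hypothetical, and the message shape (`type: contact` with `contact`, `following`, and `blocking` fields) is an assumption about how follow and block events are stored, not something stated in this thread.

```python
def rebuild_requests(own_messages, known=None):
    """Return {feed id: sequence to advertise} derived from your own feed.

    own_messages: your own messages in sequence order (assumed shape, see above).
    known: sequences already present locally, e.g. restored from a backup.
    """
    known = known or {}
    followed = set()
    for msg in own_messages:
        content = msg.get("content", {})
        if content.get("type") != "contact":
            continue
        target = content["contact"]
        if content.get("following") and not content.get("blocking"):
            followed.add(target)
        else:
            followed.discard(target)  # unfollow or block: stop requesting this feed
    return {feed: known.get(feed, 0) for feed in sorted(followed)}


own_messages = [
    {"content": {"type": "contact", "contact": "@alice", "following": True}},
    {"content": {"type": "contact", "contact": "@bob", "following": True}},
    {"content": {"type": "post", "text": "hello"}},
    {"content": {"type": "contact", "contact": "@bob", "blocking": True}},
    {"content": {"type": "contact", "contact": "@carol", "following": True}},
]

from_empty = rebuild_requests(own_messages)
from_backup = rebuild_requests(own_messages, known={"@alice": 4})
```

Starting from an empty database, every followed feed is requested from sequence 0; starting from a backup, feeds already on disk are requested from the sequence the backup holds.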

cryptix commented 3 years ago

Very nice writeup, thanks @mtalexan! It actually prompted me to double-check and find an issue in my Go implementation.

> Both sides can then follow up with requests for the messages they want in order to update their internal state to the latest of the IDs they both care about.

Small "well, actually", since you were asking about differences between papers and implementation: in ssb-ebt, there are no additional requests. Even though the system could fall back to its single-feed createHistoryStream calls, when in EBT mode it eagerly pushes all of the feed's messages by sending rx:true on the note for that feed in the vector clock updates.
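
For readers unfamiliar with the note values, the sketch below shows one way such a note could be packed into a single integer. It assumes the encoding described in the Scuttlebutt protocol guide (-1 for "don't replicate", otherwise the sequence shifted left by one bit with the lowest bit cleared when rx is true). That encoding is an assumption, not something taken from this thread, the function names are hypothetical, and the details should be checked against the actual implementation.

```python
def encode_note(seq, rx=True, replicate=True):
    """Pack a note into one integer (assumed encoding, see lead-in)."""
    if not replicate:
        return -1
    return (seq << 1) | (0 if rx else 1)


def decode_note(value):
    """Inverse of encode_note for the assumed encoding."""
    if value < 0:
        return {"replicate": False, "rx": False, "seq": None}
    return {"replicate": True, "rx": (value & 1) == 0, "seq": value >> 1}


want_from_start = encode_note(0, rx=True)    # ask the remote to push the feed from the beginning
have_five_no_rx = encode_note(5, rx=False)   # advertise sequence 5 without asking for pushes
dont_replicate = encode_note(0, replicate=False)
```

Under this assumed encoding, a peer that lost its database would send the `want_from_start` value for each feed, and the remote would start pushing messages without a separate request.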