Open staltz opened 1 year ago
The way I see it, there are 4 options:
Can you explain the second option in more detail, and how it would function in the presence of forks?
Sure. The fundamental rule is that timestamps must always be increasing. A fork is any case where you receive a message whose sequence is the same as, or less than, that of your current head. Here I think of forks in the general case, so it could be either recovery or another device that did not have the latest message when it posted. I don't think forks are really that big of a problem. Either your application's conflict model is last writer wins, which is easy because of the timestamps, or it is more complex, in which case you must rely on application-specific tangles anyway. Tangles are even more general than multiple previous because they can handle multiple authors.
You know my point of view, more or less. I think the generic "all is tangle" approach is better. We need to find an efficient way to replicate tangles anyway. So what appears more complex initially may turn out to be easier in the end, because we end up with a single replication logic optimized for tangles instead of two, one for feeds and one for tangles.
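A minimal sketch of the fork rule and the last-writer-wins case described above. The `Msg` shape, the field names and the tie-break on content are assumptions made for illustration; none of them come from a spec.

```typescript
// Hypothetical message shape, for illustration only.
interface Msg {
  author: string;
  sequence: number;
  timestamp: number; // assumed to be strictly increasing along a feed
  content: string;
}

// A message is a fork if its sequence is the same as, or less than,
// the sequence of the current head.
function isFork(head: Msg, incoming: Msg): boolean {
  return incoming.sequence <= head.sequence;
}

// Last writer wins: keep the message with the later timestamp.
// The tie-break on content is an assumption of this sketch, added so that
// every peer deterministically picks the same winner.
function lastWriterWins(a: Msg, b: Msg): Msg {
  if (a.timestamp !== b.timestamp) return a.timestamp > b.timestamp ? a : b;
  return a.content >= b.content ? a : b;
}

const headMsg: Msg = { author: "alice", sequence: 10, timestamp: 1000, content: "from phone" };
const lateMsg: Msg = { author: "alice", sequence: 10, timestamp: 1005, content: "from laptop" };
const nextMsg: Msg = { author: "alice", sequence: 11, timestamp: 1010, content: "normal append" };

console.log(isFork(headMsg, lateMsg)); // same sequence as the head: a fork
console.log(isFork(headMsg, nextMsg)); // higher sequence: a normal append
console.log(lastWriterWins(headMsg, lateMsg).content);
```

Note that this only resolves the conflict by discarding the older message; anything that needs to keep both sides falls into the application-specific tangle case.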
@arj03 I see what you mean, thanks. Forks aren't hard to deal with IF we accept that we will lose some content. E.g. if the losing fork has a long post that I took 30min to write, I might get quite sad if the winning fork "erased" it. So I think some kind of CRDT system that creates merges and preserves content in the forks is better. What you describe is kind of like (pardon me for this comparison) blockchains, because forks can always occur there but the forks are ignored/pruned and the branch with the most "proof of work" wins.
There's still the alternative that if we go with your idea, then we could "copy-paste" the losing fork's content onto new messages on the winning fork. Kind of like a git cherry-pick.
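A rough sketch of what that "cherry-pick" could look like, assuming a hypothetical `FeedMsg` shape and that the copied messages simply get fresh sequence numbers and timestamps on the winning fork. This is an illustration of the idea, not a proposed message format.

```typescript
// Hypothetical message shape, for illustration only.
interface FeedMsg {
  sequence: number;
  timestamp: number;
  content: string;
}

// Re-append the content of every losing-fork message as a new message
// on top of the winning fork, keeping timestamps strictly increasing.
function cherryPick(winning: FeedMsg[], losing: FeedMsg[], now: number): FeedMsg[] {
  const result: FeedMsg[] = winning.slice();
  let lastSeq = 0;
  let lastTs = 0;
  for (const m of result) {
    lastSeq = Math.max(lastSeq, m.sequence);
    lastTs = Math.max(lastTs, m.timestamp);
  }
  for (const m of losing) {
    lastSeq += 1;
    lastTs = Math.max(now, lastTs + 1);
    result.push({ sequence: lastSeq, timestamp: lastTs, content: m.content });
  }
  return result;
}

const winningFork: FeedMsg[] = [
  { sequence: 1, timestamp: 100, content: "hello" },
  { sequence: 2, timestamp: 200, content: "short reply" },
];
// Messages that only exist on the losing side of the fork.
const losingFork: FeedMsg[] = [
  { sequence: 2, timestamp: 150, content: "long post that took 30min to write" },
];

const merged = cherryPick(winningFork, losingFork, 300);
console.log(merged.map((m) => m.sequence + ": " + m.content));
```

The content survives, but the copies are new messages with new IDs, so anything that linked to the original losing-fork messages would still point at the pruned branch.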
That said, I do like the idea of single replication logic for tangles, as @gpicron said.
> last writer wins and this is easy because of the timestamps
I'm thinking about your suggestion @arj03 and it seems there is a way it could break in EBT. That's because multiple peers may have different forks of the feed, and if you compare fork A and fork B, A might win over B, but if you compare fork A and fork C, C might win. So one of the peers chooses to continue appending on C while another peer continues to append on A, and this fork isn't resolved.
I think the problem is that at no point does a peer ever get two forks locally in order to compare the forks. What happens instead is that if you inform you have everything up until sequence 10, nobody will send you an alternative sequence 10, they will just send you 11 and so forth. So you end up never downloading the fork, and thus you can't compare, you don't know if the fork would be a winner.
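To make the problem concrete, here is a toy model of sequence-based replication. It is a deliberately simplified stand-in and not the EBT protocol itself; `LogMsg`, `messagesToSend` and the message IDs are made up for the example.

```typescript
// Hypothetical log entry, for illustration only.
interface LogMsg {
  sequence: number;
  id: string;
}

// The sender only knows "the requester has everything up to `have`",
// so it replies with messages whose sequence is strictly greater.
function messagesToSend(senderLog: LogMsg[], have: number): LogMsg[] {
  return senderLog.filter((m) => m.sequence > have);
}

// The two peers hold different forks starting at sequence 10.
const localLog: LogMsg[] = [
  { sequence: 9, id: "m9" },
  { sequence: 10, id: "m10-A" },
];
const remoteLog: LogMsg[] = [
  { sequence: 9, id: "m9" },
  { sequence: 10, id: "m10-B" },
  { sequence: 11, id: "m11-B" },
];

// The local peer announces its highest sequence (10).
const localHave = Math.max(...localLog.map((m) => m.sequence));
const received = messagesToSend(remoteLog, localHave);

// Only m11-B arrives; the alternative sequence 10 (m10-B) is never sent.
console.log(received.map((m) => m.id));
```

Because the announcement carries only a sequence number, the local peer never sees `m10-B` and so never holds both forks side by side to compare them.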
Unless we implement something other than EBT replication...
Another comment: for SSB2 I think we will need tangle sync anyway, because we're not going to do hops 2+ replication, we're just going to have:
I'm not saying that we should drop your sequence number (and previousless msgs) idea; I think we should consider all the options. But it seems clear that we have to invest in tangle sync, how to implement it, etc.
Yep, for me EBT compatibility was not a goal, because it is complex. For what it's supposed to deal with, replicating feeds from start to finish, it works really well, but it's very hard to do anything else with it. And yes, agreed on tangle sync.
I was close to arguing for the tangle backlinks solution until I considered encrypted messages. With this in mind I'm leaning towards the current solution with a previous array.
Can you explain the problem with encrypted messages?
If you put membership tangle information or just thread tangle info in the outer layer then you leak information. And if you start encrypting parts of the tangle info then it also gets more complicated. So I think the cleanest solution is to just have previous related to author/feed.
@arj03 in the bundle structure proposed in https://github.com/ssbc/minibutt-spec/issues/10#issuecomment-1455207795, there is no problem having one Chaining Block in clear text and one encrypted.
> If you put membership tangle information or just thread tangle info in the outer layer then you leak information. And if you start encrypting parts of the tangle info then it also gets more complicated. So I think the cleanest solution is to just have previous related to author/feed.
I think the subject of private messages and groups needs its own thread. I'll start a new issue.
I said in #8:
I think we should rethink what the validation logic for "previous" should mean in minibutt.
Previous validation in ssb-classic does just ONE thing: it takes the previous message (from db2 base or state), calculates its msg ID, and matches it with `newMsg.value.previous`. Nothing else. So in essence it is just checking whether the new message points to the latest head as its previous. In other words, classic validation wants to rule out forking off of an older message in the feed.
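A minimal sketch of that single check. The `ClassicValue` shape is trimmed down for the example, and `toyMsgId` is a toy stand-in that just serializes the value; real implementations derive the ID from a hash of the serialized message, so treat everything except the comparison itself as an assumption.

```typescript
// Trimmed-down, hypothetical message value for illustration.
interface ClassicValue {
  previous: string | null;
  sequence: number;
  content: string;
}

// Toy stand-in for the real ID computation (which hashes the message).
// Serializing the whole value keeps the example deterministic.
function toyMsgId(value: ClassicValue): string {
  return "%" + JSON.stringify(value);
}

// The classic rule: the new message must point at the current head.
function validatePrevious(head: ClassicValue | null, newValue: ClassicValue): boolean {
  if (head === null) return newValue.previous === null; // first message in a feed
  return newValue.previous === toyMsgId(head);
}

const first: ClassicValue = { previous: null, sequence: 1, content: "a" };
const second: ClassicValue = { previous: toyMsgId(first), sequence: 2, content: "b" };
// Points at `first` even though the head is already `second`.
const forked: ClassicValue = { previous: toyMsgId(first), sequence: 2, content: "b-alt" };

console.log(validatePrevious(null, first));    // accepted: start of the feed
console.log(validatePrevious(first, second));  // accepted: points at the head
console.log(validatePrevious(second, forked)); // rejected: forks off an older message
```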
But forking off of an older message in the feed is a feature in minibutt!
Which leads us to the question: what would constitute an invalid previous field in minibutt? I can think of:
All these are rather unlikely to happen unless there is a buggy implementation. I don't know why a peer would lie about their own feed structure and break it on purpose. Perhaps, to prevent other peers from crashing on invalid types, we should do basic validation that the previous ID "seems" correct.
But due to sliced replication and deletions, we can't actually fetch the previous msg from disk and do a real check.
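A sketch of what such a shape-only check might look like. It assumes `previous` is an array and that IDs follow the ssb-classic `%<base64>.sha256` layout; minibutt's actual ID format may well differ, and `previousSeemsValid` is a made-up name.

```typescript
// Assumption: IDs look like ssb-classic message IDs, i.e. "%" followed by
// the base64 encoding of a 32-byte hash (43 chars plus "=") and ".sha256".
const CLASSIC_MSG_ID = /^%[A-Za-z0-9+/]{43}=\.sha256$/;

// Shape-only validation: no message is fetched from disk, so this cannot
// tell whether the referenced messages actually exist.
function previousSeemsValid(previous: unknown): boolean {
  if (!Array.isArray(previous)) return false;
  // An empty array is accepted here as "first message" (an assumption).
  return previous.every((id) => typeof id === "string" && CLASSIC_MSG_ID.test(id));
}

// A syntactically well-formed but fake ID: "%" + 43 "A"s + "=.sha256".
const fakeId = "%" + new Array(44).join("A") + "=.sha256";

console.log(previousSeemsValid([fakeId]));          // well-formed
console.log(previousSeemsValid([]));                // empty array
console.log(previousSeemsValid("not-an-array"));    // wrong type
console.log(previousSeemsValid([42]));              // wrong element type
console.log(previousSeemsValid(["%short.sha256"])); // malformed ID
```

This guards against crashes on invalid types, which is about as much as can be checked when the previous message may have been sliced away or deleted.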