ssbc / ssb2-discussion-forum

not quite tiny, also not quite large
17 stars 1 forks source link

Bounded everything #19

Open staltz opened 1 year ago

staltz commented 1 year ago

SSB has so far operated with some assumptions that some resources (or metrics) are unbounded:

Let me talk about the second. I want to provide offline tolerance, i.e. the ability of including messages that have been authored completely offline and left un-replicated for months or years (there is no bounding, so this could be decades or centuries!) as always valid and acceptable. But maybe it shouldn't be unbounded.

Unbounded offline tolerance is a tough requirement to satisfy, especially as we consider tangles (same meaning as "DAG").


Take for instance the following scenario:

A tangle always starts with a single root. This is good because that single root acts as a point of consensus. First of all the root's ID identifies the whole tangle, and it is also the point of "no more verification". Once you start verifying messages, and you hit the root, you don't have to verify anything before the root, because there is nothing.

So the root is special.

However, if we want to support sliding-window deletes (a.k.a. sliced replication #18), then we need to delete a large portion of the tangle. And then we will have to discover the "new root", which is a single node that serves as the point of consensus.

graph RL
A["A (root)"]
E-->C & D
C-->B-->A
D-->A
F-->D
G-->F
H-->F
K-->H & G & E
L["L (new root)"]-->K
M & N-->L

A:::rib
L:::rib

classDef default fill:#6df,stroke:#fff0,color:#000
classDef rib fill:#000,stroke:#fff0,color:#fff

So we could delete everything before L. But consider the following problem: someone has been offline since 1993 and they suddenly pop back into and replicate a very old message, Q:

graph RL
A["A (root)"]
Q-->B
E-->C & D
C-->B-->A
D-->A
F-->D
G-->F
H-->F
K-->H & G & E
L["L (new root)"]-->K
M & N-->L

Q:::crit
A:::rib
L:::rib

classDef default fill:#6df,stroke:#fff0,color:#000
classDef rib fill:#000,stroke:#fff0,color:#fff
classDef crit fill:#d00,stroke:#fff0,color:#fff

:fire: We have to find some kind of measurement for how much "offline" Q has been, and based on that measurement compared to a threshold, reject Q as invalid. In other words, offline tolerance should be bounded.

Having the threshold allows us to reasonably (and hopefully deterministically) arrive at a consensus for a new root by construction. Then all peers should disregard these "appendages" (tangle nodes that are not "causally preceding" the new root) as invalid.

The challenge is coming up with this metric, somehow. It makes sense that offline tolerance is measured in days, e.g. "the maximum a peer is allowed to be offline is 6 months", BUT in tangles we can only truly rely on backlinks (causal order), not timestamps.

Some tangles will move faster than others (e.g. a hot-topic discussion thread will grow quicker than a multi-device tangle for your profile updates) so we can't rely on counting the nodes in a tangle.


Zooming out, I think it's important to drop the tough-to-comply "unbounded everything" in SSB and begin to consider "budgets" and "thresholds". E.g. the 64MB budget in memdb allows me to make strong guarantees on query times. And similarly an "offline budget" would allow us to make strong guarantees on the sliding window replication with deletes.