Validation and Verification

AljoschaMeyer commented 1 year ago

Since there is a lot of discussion about including or minimizing backlinks: What does "validation/verification" mean and which purpose does it serve? I've seen "validation of messages", "validation of logs", "validation of tangles", none of which maps to how I think about validation ("Given two messages, was one provably created before the other?"). I feel like clarifying this could help in focussing discussions and in evaluating proposals.

staltz commented 1 year ago

I'm probably more sloppy with these technical terms, so let me try to make it more accurate.

Validation: given a tangle, and a new message that may or may not belong to that tangle, validation is the process of checking that the signature of a message is valid, and that its backlink(s) only refer to messages currently belonging to that tangle. And the message must be provable to have causally happened after the genesis/root message in the tangle. I.e. we don't allow a tangle to be a disconnected graph. There can be multiple tangles, but each tangle is a connected graph. A new message that would cause a disconnection in the graph should be deemed invalid.
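
The validation steps described above can be sketched as follows. This is only an illustration under stated assumptions, not the SSB implementation: `Msg`, `Tangle`, and the `verify_sig` callback are hypothetical names, and the actual signature check is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Msg:
    id: str                      # hypothetical message identifier
    backlinks: Tuple[str, ...]   # ids of the messages this one points back to
    signature: str = ""          # opaque here; checked by the injected verifier


class Tangle:
    """A connected graph of messages grown from a single root (sketch)."""

    def __init__(self, root: Msg):
        self.msgs: Dict[str, Msg] = {root.id: root}

    def is_valid(self, msg: Msg, verify_sig: Callable[[Msg], bool]) -> bool:
        # 1. The signature must be valid (verification is an assumption,
        #    delegated to the caller).
        if not verify_sig(msg):
            return False
        # 2. A non-root message without backlinks would be disconnected.
        if not msg.backlinks:
            return False
        # 3. Every backlink must point at a message already in this tangle.
        #    Stored messages are all connected to the root, so the new
        #    message is then causally after the root as well.
        return all(link in self.msgs for link in msg.backlinks)

    def add(self, msg: Msg, verify_sig: Callable[[Msg], bool]) -> bool:
        if not self.is_valid(msg, verify_sig):
            return False
        self.msgs[msg.id] = msg
        return True
```

Because a message is only stored once all of its backlinks are already stored, connectivity to the root holds by construction and no graph traversal is needed at insert time.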

gpicron commented 1 year ago

I propose 2 additional rules

the timestamp of a message MUST be strictly larger than the timestamp of the "backlinked" messages, whatever the relation (previous, reply, etc.)
a peer should not accept a message which has a timestamp larger than its own clock. It can either ignore the message when it receives it, or store it temporarily until its clock reaches the timestamp and then consider it valid and propagate it.

This has the consequence of creating a kind of local consensus on elapsed time between peers that interact regularly, and it can serve multiple purposes for higher-level protocols.
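
The two proposed rules could be expressed roughly as below. This is a sketch, assuming timestamps are integers in one consistent unit; the function name and the three result strings are hypothetical and not part of any SSB library.

```python
from typing import Iterable


def check_timestamps(msg_ts: int, backlink_ts: Iterable[int], local_clock: int) -> str:
    """Classify a message timestamp as 'invalid', 'defer', or 'accept'."""
    # Rule 1 (MUST): strictly later than every backlinked message.
    if any(msg_ts <= ts for ts in backlink_ts):
        return "invalid"
    # Rule 2 (SHOULD): a timestamp ahead of the local clock is not accepted
    # yet. The peer may drop the message or hold it until its clock catches up.
    if msg_ts > local_clock:
        return "defer"
    return "accept"
```

For example, `check_timestamps(400, [100], 300)` returns `"defer"`: the message is consistent with its backlink but lies in the receiver's future.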

erikmav commented 1 year ago

timestamp of a message MUST be strictly larger

@gpicron what is the current or proposed clock tick interval (resolution)? 100ns Windows high performance counter (encoded as int64 ticks since some root time)? Unix time (1 sec resolution)? I can see the greater-than rule making sense for something in the <=1ms range, as that's below typical rate of human activities, whereas 1 sec resolution makes more sense as a "greater-than-or-equal" operator.

It's worth expecting that some feeds will be bots and may issue relatively fast messages, faster than 1 sec interval.
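
A small sketch of why the clock resolution matters for the choice of operator. The numbers and helper names are made up for illustration: two messages from a fast bot, 300 ms apart, compared at 1 ms and at 1 s resolution.

```python
def to_ticks(time_ms: int, resolution_ms: int) -> int:
    """Quantize a millisecond time to clock ticks of the given resolution."""
    return time_ms // resolution_ms


def strictly_later(a_ms: int, b_ms: int, resolution_ms: int) -> bool:
    """True if b gets a strictly larger tick value than a."""
    return to_ticks(b_ms, resolution_ms) > to_ticks(a_ms, resolution_ms)


# Hypothetical bot posting twice within the same second.
first_ms, second_ms = 1_000_100, 1_000_400

ok_at_1ms = strictly_later(first_ms, second_ms, 1)     # distinct ticks
ok_at_1s = strictly_later(first_ms, second_ms, 1000)   # same tick
```

At 1 s resolution both messages land on the same tick, so a strict greater-than rule would reject the second one, while greater-than-or-equal would accept it.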

ahdinosaur commented 1 year ago

my mental model of verification:

signature verification: does this message have a valid signature that matches the author's public key?
previous (backlink) verification: did this message come after the previous messages that are referenced?
root (log) verification: does this message have a causal link to the root message of the feed or tangle?
- https://github.com/ssbc/ssb2-discussion-forum/issues/13#issuecomment-1459004212
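
The third check in this model, root verification, amounts to a graph search along backlinks. Below is a minimal sketch, assuming the backlinks are available as a plain mapping from message id to the ids it references. The function and parameter names are hypothetical, and the signature and previous checks are left out.

```python
from collections import deque
from typing import Dict, Iterable


def reaches_root(msg_id: str, root_id: str, backlinks: Dict[str, Iterable[str]]) -> bool:
    """Walk backlinks breadth-first and report whether the root is reachable."""
    seen = set()
    queue = deque([msg_id])
    while queue:
        current = queue.popleft()
        if current == root_id:
            return True
        if current in seen:
            continue  # already explored; also guards against malformed cycles
        seen.add(current)
        queue.extend(backlinks.get(current, ()))
    return False
```

A message whose backlinks never lead to the root would belong to a disconnected component and fail this check.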

ahdinosaur commented 1 year ago

@gpicron: i disagree about monotonic (only-increasing) timestamp (clock) consensus. seems contrary to Scuttlebutt's "against consensus" design principle. i think monotonic timestamps only make sense for a single author (clock source). for example would cause issues if you have two disconnected social islands meet and their clocks are out of sync, or if you want to socialize across interplanetary (relativistic) clocks. what's the benefit of timestamp consensus, what higher level protocols need timestamp consensus that isn't handled with tangle causality?
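
To make the out-of-sync case concrete, here is a small sketch with made-up numbers. The 5-minute skew and all variable names are assumptions for illustration only.

```python
SKEW_MS = 5 * 60 * 1000  # hypothetical: island B's clock runs 5 minutes behind

true_time_post = 1_700_000_000_000          # when author A actually posted
true_time_reply = true_time_post + 60_000   # author B replies one minute later

post_ts = true_time_post                    # A's clock is accurate
reply_ts = true_time_reply - SKEW_MS        # B stamps the reply with a slow clock

causally_after = true_time_reply > true_time_post   # the reply really came later
passes_strict_rule = reply_ts > post_ts              # but its timestamp is smaller
```

Here the reply is causally after the post and links back to it, yet a strict timestamp rule would reject it because of B's clock skew.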

gpicron commented 1 year ago

Unix time (1 sec resolution)? I can see the greater-than rule making sense for something in the <=1ms range, as that's below typical rate of human activities, whereas 1 sec resolution makes more sense as a "greater-than-or-equal" operator.

Currently, timestamps in the classic SSB format are Unix milliseconds.

I have no strong opinion on the minimum. This rule already exists today (the JS implementation performs that check) and the granularity is 1 ms. I don't see any good reason to go lower.

gpicron commented 1 year ago

i disagree about monotonic (only-increasing) timestamp (clock) consensus. seems contrary to Scuttlebutt's "against consensus" design principle.

The "against consensus" principle is against GLOBAL consensus and refuses any form of centralisation. What is proposed here is local, between people who are interacting at some point. It just maps the tangle backlinks onto a relative and possibly subjective measure of the time elapsed since the tangle root(s) of a given message.

gpicron commented 1 year ago

i think monotonic timestamps only make sense for a single author (clock source).

Once you start to think about partial replication to keep only "recent" information locally, everything being a tangle, and expiration of keys (and renewal), you need some consistency of timestamps from the point of view of an observer looking at several sources interacting with each other. Otherwise it is better to remove the timestamps completely. But if we do so, how do we say that a post is more than 1 year old? How can one peer tell another "I'm only interested in posts from the last 2 years"?

for example would cause issues if you have two disconnected social islands meet and their clocks are out of sync, or if you want to socialize across interplanetary (relativistic) clocks. what's the benefit of timestamp consensus, what higher level protocols need timestamp consensus that isn't handled with tangle causality?

Re-read the proposed rules. Only the first rule is strict, for consistency across the participants of a tangle. The second one is a "should", and the receiver is free to define its own notion of "local clock". For instance, a device with no real-time clock can use messages received from several sources to get a notion of elapsed time and define its own local clock movement. That permits even such a device to have a sense of what is "old" information and what is "recent" information without having to replicate all feeds or all sources.
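
One way such a receiver-defined clock could look is sketched below. It assumes integer timestamps. The `DerivedClock` class and the choice of a median over the latest timestamp per source are hypothetical design choices for illustration, not something specified in this thread.

```python
from statistics import median
from typing import Dict


class DerivedClock:
    """Local clock estimated from message timestamps (illustrative only)."""

    def __init__(self) -> None:
        self.latest: Dict[str, int] = {}  # newest timestamp seen per source
        self.now = 0

    def observe(self, source: str, ts: int) -> int:
        self.latest[source] = max(ts, self.latest.get(source, 0))
        # The median keeps one source with a wild clock from dragging the
        # estimate, and max() ensures the local clock never moves backward.
        self.now = max(self.now, int(median(self.latest.values())))
        return self.now

    def is_recent(self, ts: int, window: int) -> bool:
        """True if ts is at most `window` units behind the derived clock."""
        return self.now - ts <= window
```

With such an estimate, a device could answer "is this post older than the window I care about?" without a real-time clock and without replicating every feed.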

I propose 2 additional rules

the timestamp of a message MUST be strictly larger than the timestamp of the "backlinked" messages, whatever the relation (previous, reply, etc.)

a peer should not accept a message which has a timestamp larger than its own clock. It can either ignore the message when it receives it, or store it temporarily until its clock reaches the timestamp and then consider it valid and propagate it.

I should probably have avoided the word "consensus", which is wrongly associated with blockchains. Consensus is something else and doesn't prevent subjectivity. It just refers to an agreement among a set of people who freely choose to interact with each other: the basis of social interaction. Otherwise everybody could choose to speak their own language (I will write in Dutch or in French, or I will invent my own language that only I speak) without caring about the others' capacity to read it.

ssbc / ssb2-discussion-forum

Validation and Verification #21