Closed staltz closed 11 months ago
Yes this could work. Notice that "domain" is kinda becoming the named tangle in this proposed solution. In ssb "type" was always a quick indexable point in a message by which you could do a sort of bloom filter first pass to get those messages (and then do tighter validations, ideally by schema)
Remember p2panda has feeds defined by schema? Like the domain is the hash of the msg schema (or something like that), then you add that as a validation step in replication. Could be nice to only validate the messages once (on write) instead of on every read. Is that relevant here? Not sure. I think I'm feeling a tweak like "uhhh, what is the domain for again... and how do you have epochs with same domain?"
Sympathetic replication comes in here too. I'm thinking about the private group challenge ... does it intersect? I need a checklist of behaviours I think. Values and then expected behaviours which support those, then spec which support those
The case I was trying to raise the other day was not "how do you know if it's one document or two" (though that's a great question).
It was - in a scenario where we have a message which has two chunks of unnamed tangle data in it... what do I do with that? It would indicate the message is a (non root) message in two (non-account) tangles right?
Example from tribes2: a "group/add-member" message in the second epoch of a group would in ppppp need:
If I am looking at a message, 1's special position makes it obvious, but how do I distinguish what 2/3 are? I would have to look the tangle root messages up and look at their domains? Maybe the previous message domains too? Do I ever need to solve this problem in real apps, or do I always start queries coming from higher context? This is where I want a worked example to calm my paranoia
Uhhh I'm getting weird flashbacks of willow's capability spec draft. everything is multi author documents... with capabilities. Your triple range for replication is strangely similar to the 3d product in that spec
We only have the special "account" tangle because we expect to have a density of desire to get many things by the same person right?
> Yes this could work. Notice that "domain" is kinda becoming the named tangle in this proposed solution.
Thanks! This is the main answer I needed to get from you. :)
> (and then do tighter validations, ideally by schema)
> Remember p2panda has feeds defined by schema?
I know, and it has been in my mind, but at this point I don't see a need for very tight validations (and as a pre-requisite, a canonical way of representing schemas and hashing them). The "first pass filter" based on a domain string is well enough.
> It was - in a scenario where we have a message which has two chunks of unnamed tangle data in it... what do I do with that? It would indicate the message is a (non root) message in two (non-account) tangles right?
I think it will be helpful to set aside private groups (and exclusions and epochs) for now because it's clouding your/my ability to see PPPPP tangles for what they are, which are different to SSB tangles. Our group epochs design is using SSB primitives (feeds/metafeeds/tangles) to achieve goals, and those same primitives don't translate 1:1 to PPPPP. Tangles are not the same concept, and maybe we should just (in this conversation) call them tongles just to emphasize the difference. Let me explain.
In SSB you have feed as the primary primitive, and tangles as a secondary concept built on top of feeds.
In PPPPP you have tongles as the primary primitive.
So let's talk about what's a "primary primitive". I see it as being "the thing you replicate among friends", or more mathematically, "an evolving set of messages that have a stable and collision-free ID for replication". In SSB we replicate feeds to each other, and tangles are just extra metadata added to weave together a partial ordering. In PPPPP we replicate tongles to each other. The "collision-free" part is important because while it would be nice for my SSB feed to be called `@staltz.ed25519`, if we would use that for replication, any rando could call themselves `@staltz.ed25519` and mess up the replication of my feed. That's why SSB feeds are identified with a hash. Similarly, that's why PPPPP tongles are identified by a hash, not by a human-friendly name.
So in this light, the main difference between a PPPPP tongle and an SSB feed is that tongles are DAGs, thus allowed to fork (and merge), while a feed is just a linear sequence of msgs. While an SSB msg uses sequence numbers and a previous field to determine "what comes after what", a PPPPP msg uses backlinks instead. Notice that the only purpose of a backlink in a PPPPP tongle is to provide partial order (we have no sequence numbers), and especially notice how lipmaa links point back to stuff just based on graph distance, NOT based on "related content" (unlike SSB `msg.content.fork` and `msg.content.mentions`). I.e. the backlinks in a PPPPP tongle are not very semantic, they are just fields that help with replication.
PPPPP "External feeds" (which are tongles!) are the closest thing to SSB "feeds". But now let's talk about an outlier: PPPPP "cross-tongle tongles". Again, a tongle is just an evolving set of messages that have a stable ID for replication, so in case you want to replicate a set of msgs from various authors, we want to support that too. That's the use case for replicating a discussion thread without having to follow/replicate all authors involved in that discussion. So we allow msgs to declare that they belong to a tongle (an evolving set of messages with a stable ID for replication), via `msg.metadata.tangles`.
Let's resist the temptation to 1:1 map this to the "epoch tangles", "membership tangles" and instead try to look at PPPPP tongles for what they are, and then think how would we design a new groups system that allows for excluding members. It might end up like what we had, or it might look different.
I chose that example because it is a rich one we had shared context on
I think I haven't quite grokked the difference between these tangles (ssb, ppppp) because I'm seeing them as similar. I likely need to do some worked examples/ diagrams
Easy suggestion: in the context of PPPPP, forget about "tangles", just think "forkable/mergeable feeds" instead. Much easier to understand what it's about.
I'm considering renaming some concepts such that "tangles" disappears from our vocabulary (it has been confusing enough for SSB veterans, and it's more important that the SSB community doesn't get confused by this, compared to other audiences).
I'm considering calling the "DAG" a "feed" instead, since it's the closest in purpose to the SSB feed, and it has a cryptographic ID which is used during replication protocols, etc. Here's the glossary I got so far:
`msg.metadata.account = "self"` and msgs only add or remove cryptographic keys
So the biggest thing is this hierarchy:
Each kind will have their own validation rules (with most rules in common, however):
`msg.metadata.account` in the moot msg defines who can publish on this meal, subsequent msgs must have the same `msg.metadata.account`
Examples of meals:
(`account=ALICE_ID` and `domain=post`)
(`account=BOB_ID` and `domain=follow`)
(`account=GROUP_ID` and `domain=members`)
(`account="any"` and `domain=hubs`)
Examples of weaves:
I like the naming moot a lot, because it rhymes with root and the dictionary says (bold is my emphasis)
moot debatable; undecided: a moot point; disputable, unsettled
Which is quite accurate, because anyone can recreate a moot msg, so it's "unsettled" and there's no guarantee that the moot's `msg.metadata.account` author intended to publish this. It only becomes a real thing once that `account` actually publishes a msg referring to the moot ID.
I am not sure if I like the naming meal, it would be quite new in this realm, and we'll be using this word a lot since meal feeds are "actual content" feeds. I'm open to suggestions, but I'm looking for a short and simple word, not an acronym, not a composite word like "moot feed". We're going to be talking about this kind of feed a lot, so the best would be 1 syllable or 2.
I like weave, it has a connotation of blending different strings together, which is apt.
Curious about your thoughts @ahdinosaur @mixmix
Considering "meed" as a portmanteau of "feed" and "moot", and it actually has a dictionary meaning:
A meed is a well-deserved compensation or reward. At a birthday party, every guest hopes to gather his or her meed of candy from the piñata they've worked so hard to smash open.
The noun meed is a very old fashioned way to talk about a payment or share of something. You're most likely to come across it in older books, but you might want to use it to describe the way your grandmother manages to give each of her twelve grandchildren a meed of her attention and love. Meed comes from the Old English root mēd, which has a Proto-Indo-European root in common with the Greek misthos, or "reward."
It's a bit exotic, but has the advantage of being a portmanteau and recalling both feed and moot concepts.
i for one really like the name "tangle", to me is a fun way to describe a DAG.
in my view, there's a very very limited number of people who actually understand SSB tangles. i think if you are clear about what a PPPPP tangle is, i personally don't see an issue with the SSB overlap.
an alternative glossary if i understand this right:
Msg = published data that is signed and intended for gossip replication
Msg ID = hash(msg.metadata)
Tangle = any single-root DAG of msgs that can be replicated by peers
Root = the origin msg of a tangle
Tangle Tips = tangle msgs that are not yet referenced by any other msg in the tangle
Tangle ID = msg ID of the tangle's root msg
Account = a kind of tangle, where `msg.metadata.account = "self"` and msgs only add or remove cryptographic keys
Account ID = tangle ID of the account tangle
Moot = a root that is deterministically predictable and empty, so to allow others to pre-know its msg ID
Feed = a kind of tangle whose root is a moot
Weave = any tangle which is not a feed nor an account
hierarchy:
💎 Tangle (DAG of msgs)
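To make the glossary above concrete, here is a minimal sketch of how a tangle could be classified from its root msg alone. The msg shape (`data`, `metadata.account`, `metadata.domain`), the helper names `isMoot` and `tangleKind`, and the reading of "empty" as `data === null` are all assumptions made for illustration, not something this thread specifies.

```javascript
// Sketch only: the msg shape below is an assumption, not the PPPPP spec.
// Assumed root msg shape: { data, metadata: { account, domain } }

// A moot is described as "deterministically predictable and empty".
// Here "empty" is assumed to mean that the msg carries no data.
function isMoot(rootMsg) {
  return rootMsg.data === null && typeof rootMsg.metadata.domain === 'string';
}

// Hypothetical helper: map a root msg to one of the three kinds of tangle.
function tangleKind(rootMsg) {
  if (rootMsg.metadata.account === 'self') return 'account';
  if (isMoot(rootMsg)) return 'feed';
  return 'weave'; // any tangle which is not a feed nor an account
}

const accountRoot = { data: { action: 'add' }, metadata: { account: 'self', domain: 'account' } };
const feedRoot = { data: null, metadata: { account: 'ALICE_ID', domain: 'post' } };
const threadRoot = { data: { text: 'hello' }, metadata: { account: 'ALICE_ID', domain: 'post' } };

console.log(tangleKind(accountRoot)); // account
console.log(tangleKind(feedRoot)); // feed
console.log(tangleKind(threadRoot)); // weave
```

Note that the third example is an ordinary post acting as the root of a discussion thread: it is not empty, so the tangle rooted at it counts as a weave.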
@ahdinosaur yeah you might be right, I shouldn't go too exotic with "meed". I can already foresee people joking about "weed" and "mood" too.
I like Tangle+Account+Feed+Root+Moot.
The main problem is helping Mix understand how PPPPP Tangles are not SSB Tangles. 😅
how about "doot" instead of "root" - deterministic root
droot
@staltz is there any constraint about who can add messages to different tangles?
e.g. an account tangle can only be validly appended to by author keys that have been given permission to, right?
@mixmix
An account tangle can only be appended to (i.e. msgs published on it are only considered valid) if the msg's pubkey is authorized to do that action on the account, e.g. `add` power: is the pubkey authorized to add other pubkeys?
On feed tangles, only the feed's account is authorized to publish on it. Note how the moot is deterministically defined by two pieces of data: `account` and `domain`, so the `account` there determines who can publish. Also note that for "commons feeds" you can set `account='any'` in the moot and that's a special case that means anyone can publish on this feed.
And on weave tangles, anyone is allowed to publish on it.
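The three publishing rules above could be sketched roughly like this. Everything here is hypothetical: the function name `canPublish`, the `accountPowers` map from pubkey to a set of powers, and the idea that `msg.data.action` names the power being exercised are assumptions for illustration, not the actual validation code.

```javascript
// Sketch of the rules described above; all shapes are assumptions.
//   kind: 'account' | 'feed' | 'weave'
//   rootMsg.metadata.account: owner recorded in the root (or 'any')
//   accountPowers: Map<pubkey, Set<power>> of keys authorized on the account
function canPublish(kind, rootMsg, msg, accountPowers = new Map()) {
  if (kind === 'account') {
    // The msg's pubkey must hold the power for the action it performs
    const powers = accountPowers.get(msg.pubkey);
    return powers !== undefined && powers.has(msg.data.action);
  }
  if (kind === 'feed') {
    // Only the account named in the moot may publish, unless it is 'any'
    const owner = rootMsg.metadata.account;
    return owner === 'any' || msg.metadata.account === owner;
  }
  if (kind === 'weave') return true; // anyone may publish on a weave
  return false;
}

const moot = { data: null, metadata: { account: 'ALICE_ID', domain: 'post' } };
const byAlice = { pubkey: 'KEY_A', data: { text: 'hi' }, metadata: { account: 'ALICE_ID' } };
const byBob = { pubkey: 'KEY_B', data: { text: 'yo' }, metadata: { account: 'BOB_ID' } };

console.log(canPublish('feed', moot, byAlice)); // true
console.log(canPublish('feed', moot, byBob)); // false
console.log(canPublish('weave', moot, byBob)); // true
```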
Closing this issue because both things were resolved: the renaming and the proposal to do rootMsgId × range × domain during replication.
One PPPPP design choice that has confused @mixmix in particular (but I bet others would raise the same concern sooner or later) is how the tangles a msg belongs to are referred to by the tangle root ID. This is distinct from SSB tangles, which are referred to by human-readable names, e.g. SIP009 or group exclusion tangles for a more colorful example.
Msg format recap
Msgs in PPPPP are so far like this:
Question
From this design choice, an important question comes up:
Mix has also once asked this same question on SSB, with different words:
And yet another time today:
To visually illustrate the above, see the diagram below (red are lipmaa links):
The document ABJKLM is disjoint from ACDEFGHI (except for A).
Answer: it depends on the nature of these two disjoint documents. If this whole tangle is a discussion thread, and all these msgs are `domain="post"`, then yes it would make sense to replicate everything here, topologically sort both documents and render them blended together in one place. This would be a prime example of a Reddit-like discussion.
🔥 However, if the two documents are vastly different in nature, then it may make sense to replicate only the one you want. A real world example would be: ABJKLM are msgs with `domain="post"` while ACDEFGHI are msgs with `domain="factcheck"`, where fact-checking is a meta-discussion that may be suitable to replicate separately from the actual posts.
In that case, the question is very relevant and the current design of PPPPP does not solve for this. (Keep on reading because I have a proposal)
Tangle sync recap
To give a recap on how replication works as a black box (that is, just considering its contractual inputs and outputs, but not going into the details of how it is accomplished):
Input: `rootMsgId` × `range`
Output: the msgs in the tangle `rootMsgId`, and which are located in the given `range`
A "range" is an array `[minDepth, maxDepth]`.
The algorithm itself does a lot more magic with bloom filters to efficiently exchange msgs between two peers, but at the heart of it there is a method that looks like this:
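As a rough illustration of that contract (a sketch under assumptions, not the actual tangle sync code), the range lookup could be written as below. The names `depthIn` and `sliceTangle`, the `{ id, msg }` entry shape, and the assumption that each msg records its depth under `msg.metadata.tangles[rootMsgId].depth` are hypothetical.

```javascript
// Assumed entry shape: { id, msg: { metadata: { tangles: { [rootId]: { depth, prev } } } } }
// The root msg is assumed to have depth 0 and no entry for its own tangle.
function depthIn(rootMsgId, entry) {
  if (entry.id === rootMsgId) return 0;
  const tangle = entry.msg.metadata.tangles[rootMsgId];
  return tangle ? tangle.depth : null; // null: msg is not in this tangle
}

// Hypothetical helper: all msgs in the tangle that fall within [minDepth, maxDepth]
function sliceTangle(entries, rootMsgId, [minDepth, maxDepth]) {
  return entries.filter((entry) => {
    const depth = depthIn(rootMsgId, entry);
    return depth !== null && depth >= minDepth && depth <= maxDepth;
  });
}

const log = [
  { id: 'A', msg: { metadata: { tangles: {} } } },
  { id: 'B', msg: { metadata: { tangles: { A: { depth: 1, prev: ['A'] } } } } },
  { id: 'C', msg: { metadata: { tangles: { A: { depth: 2, prev: ['B'] } } } } },
  { id: 'X', msg: { metadata: { tangles: { Z: { depth: 1, prev: ['Z'] } } } } },
];

console.log(sliceTangle(log, 'A', [1, 2]).map((e) => e.id)); // [ 'B', 'C' ]
```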
By comparison, in SSB there is only one dimension for the input, which is the `feedId`. In PPPPP tangle sync, we added another dimension, `range`, to allow sliced (partial) replication. Thanks to lipmaa links, we can still validate whatever msgs we get, all the way to the root.
Proposal
I think we could solve the case raised by the question by simply including one more dimension: the `domain`. So that the tangle sync contract would be:
Input: `rootMsgId` × `range` × `domain` ✨
Output: the msgs in the tangle `rootMsgId`, which are located in the given `range` ✨ and match the given `domain`
This would be a simple matter of updating that JS code to be:
Note how the `post` + `factcheck` example would be replicated: now I can independently ask for `rootMsgId` × `range=[0,Infinity]` × `domain="post"` to get all the posts in that thread without getting the `factcheck` messages.
Optionally, you could pass `domain=null` to signal that you want messages with any domain.
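The proposed `rootMsgId` × `range` × `domain` contract might be sketched as follows, using the `post` + `factcheck` thread as the example. This is only an illustration: `sliceTangleByDomain`, the `{ id, msg }` entry shape, and the `metadata.tangles[rootMsgId].depth` field are assumptions, and the bloom filter exchange between peers is left out entirely.

```javascript
// Assumed entry shape: { id, msg: { metadata: { domain, tangles } } }
// The root msg is assumed to sit at depth 0 of its own tangle.
function sliceTangleByDomain(entries, rootMsgId, [minDepth, maxDepth], domain = null) {
  return entries.filter(({ id, msg }) => {
    const tangle = msg.metadata.tangles[rootMsgId];
    const depth = id === rootMsgId ? 0 : tangle ? tangle.depth : null;
    if (depth === null) return false; // not part of this tangle
    if (depth < minDepth || depth > maxDepth) return false; // outside the range
    return domain === null || msg.metadata.domain === domain; // null: any domain
  });
}

// Hypothetical thread: posts and fact-checks sharing the same root A
const mk = (id, domain, depth) => ({
  id,
  msg: { metadata: { domain, tangles: depth === 0 ? {} : { A: { depth, prev: [] } } } },
});
const thread = [
  mk('A', 'post', 0),
  mk('B', 'post', 1),
  mk('C', 'factcheck', 1),
  mk('J', 'post', 2),
  mk('D', 'factcheck', 2),
];

const posts = sliceTangleByDomain(thread, 'A', [0, Infinity], 'post');
console.log(posts.map((e) => e.id)); // [ 'A', 'B', 'J' ]

const everything = sliceTangleByDomain(thread, 'A', [0, Infinity], null);
console.log(everything.length); // 5
```

Filtering by domain on top of the depth range keeps the request a single lookup per tangle, so a peer interested only in posts never has to receive the fact-check msgs first and discard them afterwards.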