Open staltz opened 1 year ago
Some initial thoughts:
Yeah this is a hard problem. One that I never found a good solution for. Good defined as something that is orthogonal to other constraints and thus can be solved once and you can build on top of that. I think this mostly comes from the fact that you are working on distributed state and that has a bunch of edge cases.
One thought that came to mind is that in a tangle, once someone extends another key's tip, that in a way confirms that that message was fine in this context. Maybe it is possible to use that information somehow to build a simpler model?
Related work:
It is indeed a hard problem. I've been sitting here looking at the ceiling trying to think WHY it's hard.
I came down to a trilemma:
Choose 2. :smiling_face_with_tear:
Actually, there is one way to have all three, but it's not pragmatic: your main keypair (whose pubkey identifies you) would live in your brain alone, and you would sign messages by hand by running the elliptic curve operations in your brain. Then we can have all 3 properties. :upside_down_face:
I would love for the trilemma to be false, and for there to be some magical way out of it.
Assuming there isn't, we have to make the tough call of dropping one of those properties. Let's take a deep dive into each of those worlds, and try to play out the far future in that world.
i.e. drop "I am known everywhere by the same identifier".
In this world, you would have a proliferation of keypairs, and public keys would be nearly useless as identifiers. We would have to come up with some other system of identifying people. Maybe piggybacking on DNS?
We need to figure out who you are, either when validating msgs that come from "you", or when trying to connect via SHS.
But we would have good security, and all devices and apps would be treated equally.
Account recovery would be weird. Say you had only one device/app and it exploded, then you buy a new device, install the app, and there's no backup recovery phrase to insert, since the new device/app would start its own keypair, and because you don't have the old device anymore, it would be impossible to "link" together the new and the old. You would have to re-onboard to the network.
Multi-device SSB is almost like this already, and we have just used manual linking in the bio to link identities together. This world could also work for PPPPP, and it would tie well with the principle of identities being relative to adjacent peers in the social web, as opposed to identities being globally unique/known (Twitter) or self-determined (via cryptographic keys). Identity would be tied to your surrounding community, and that could be a good thing.
On the other hand, in this world, feed tangles wouldn't make much sense and we could just go back to append-only feeds.
i.e. drop "Keys are not shared"
This would compromise security, a lot. But for the sake of argument, let's pretend that wouldn't be a problem.
PPPPP is already designed for this in mind, and tangles would work fine, and the story around account login and recovery would be simple. The main problem would be handling SHS in a scenario where two devices connect to each other although they use the same keypair. Fun fact: SHS already allows you to connect like that! So the other challenges would be discoverability and disambiguation in rooms and LANs.
It could work, but it's hard to imagine it could survive the test of time.
i.e. drop "All devices and apps are equal".
Okay, we get a stable ID everywhere, and we're not sharing keys.
But we're introducing a clear hierarchy, and one of your devices/apps would be the king. That main device would be a single point of failure, and you would have to treat it more carefully. What comes to mind is how Signal does identity too, and there are downsides there. There's also the downside of losing some decentralization to this main device.
That said, let's think about it more positively. We don't need decentralization of your identity. We can have decentralization in other aspects of the network, such as server topology (and ephemerality). Having a stable ID is really nice. Oh, we wouldn't have the network identity problem of the same keypair being used by two different peers. We would still have to "prove" that we are the same person as the main ID, but that's doable.
Another positive thing is that centralizing your main ID in one device makes for a simple model for users to understand. It's also possible to move the main ID to another device, if you need to. Account recovery is simple, and you can do it on any device.
Heck, you could even have the main ID on two different devices simultaneously, without fork dangers. Of course, you shouldn't do that, and I think it'll be hard to tell people why they shouldn't, but at least if something goes wrong it's your own fault. I just hope 3rd party apps don't start to ask your main ID recovery phrase.
I'm trying to make up my mind between Identity Limbo and Main Device, with a slight preference for Main Device, because identity is such a murky and weird idea in Identity Limbo. I feel like that holds some of its own monsters and complexity that we're not aware of, while Main Device is a simple idea without hidden monsters, and we're aware of the cost it takes on decentralization principles. In other words, I could be fine with Main Device.
yes please tangle permissions, i want this to be possible, this is very exciting. :pray:
maybe some silly questions:
Account recovery would be weird. Say you had only one device/app and it exploded, then you buy a new device, install the app, and there's no backup recovery phrase to insert, since the new device/app would start its own keypair, and because you don't have the old device anymore, it would be impossible to "link" together the new and the old. You would have to re-onboard to the network.
i do believe Keybase uses something like this, because they are very clear that if you lose access to all your added devices, you lose access to your account. so they recommend adding multiple devices.
Yay, more feedback!
yes please tangle permissions, i want this to be possible, this is very exciting. :pray:
I agree it's exciting and opens up a lot of experimentation.
wouldn't the tangle identifier be the hash of the first message published in the tangle?
This is already true in feed-v1 today. But:
why can't this be the identity you are known by? (and could be given an alias with DNS, etc)
Because you have many feed tangles. You have a feed tangle for `post`, another one for `profile` (a.k.a. `about`), another one for `reaction` (a.k.a. `vote`), and each of those has a different "first message published in the tangle", a.k.a. root msg.
then wouldn't devices need to still identify in SHS and such as their keypair?
Yes, and during (room and LAN) discovery, peers would have to prove that they are equivalent to their main keypair, by showing some signature signed by the main keypair.
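A minimal sketch of what "showing some signature signed by the main keypair" could look like, assuming the main keypair simply signs the device's public key. The helper names (`endorseDevice`, `isEndorsed`) and the choice of payload are hypothetical, not part of SHS or any PPPPP spec; only Node's built-in `crypto` module is used.

```javascript
const crypto = require('node:crypto');

// Hypothetical main keypair and device keypair (ed25519).
const main = crypto.generateKeyPairSync('ed25519');
const device = crypto.generateKeyPairSync('ed25519');
const stranger = crypto.generateKeyPairSync('ed25519');

// Serialized public key, used as the payload that gets endorsed.
function pubkeyBytes(publicKey) {
  return publicKey.export({ type: 'spki', format: 'der' });
}

// The main keypair endorses a device by signing the device's public key.
function endorseDevice(mainPrivateKey, devicePublicKey) {
  return crypto.sign(null, pubkeyBytes(devicePublicKey), mainPrivateKey);
}

// A remote peer checks the endorsement against the main public key it already knows.
function isEndorsed(mainPublicKey, devicePublicKey, endorsement) {
  return crypto.verify(null, pubkeyBytes(devicePublicKey), mainPublicKey, endorsement);
}

const proof = endorseDevice(main.privateKey, device.publicKey);
console.log(isEndorsed(main.publicKey, device.publicKey, proof));   // the endorsed device
console.log(isEndorsed(main.publicKey, stranger.publicKey, proof)); // some other key reusing the proof
```

During discovery, the device would present `proof` alongside its own pubkey, so the peer can tie the SHS identity back to the main ID without the main private key ever leaving the main device.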
Keybase
Interesting, I'll take a look at what they do, and think about this
Note to self: read this and see if there's something to be inspired with https://github.com/dxos/dxos/blob/main/docs/docs/design/halo-spec.md
why can't this be the identity you are known by? (and could be given an alias with DNS, etc)
Because you have many feed tangles. You have a feed tangle for `post`, another one for `profile` (a.k.a. `about`), another one for `reaction` (a.k.a. `vote`), and each of those has a different "first message published in the tangle", a.k.a. root msg.
oh i see! then what if we had a separate "identity" tangle, and every other feed tangle must refer to the latest tip of the "identity" tangle they are participating as.
then wouldn't devices need to still identify in SHS and such as their keypair?
Yes, and during (room and LAN) discovery, peers would have to prove that they are equivalent to their main keypair, by showing some signature signed by the main keypair.
but in a tangle world without main keypairs, wouldn't the identity tangle have a message saying "we added this keypair to the tangle", and then peers just need to present one of the keypairs added to the tangle?
i'll try to wander with some messages...
One thought that came to mind is that in a tangle, once someone extends another key's tip, that in a way confirms that that message was fine in this context. Maybe it is possible to use that information somehow to build a simpler model?
i notice that when you refer to a identity tangle, you are referring to the state of the identity that you accept. if there's a new identity message adding a new device, when you make a new post message referring to that identity tip you are accepting the change in the permission state.
The tricky part is removal. Let's say you have 2 devices and one gets compromised: which device wins and is allowed to continue using the ID? If you want a stable ID, then you need some kind of tie-breaker: either a complicated protocol or, as staltz said, a main device. This could also be a master key that you keep offline for rare situations like this. You can't really do that for groups, which is why we needed https://github.com/ssbc/ssb-group-exclusion-spec.
Let's say you have 2 devices and one gets compromised
in PGP / GPG when you generate a key you also generate a certificate to verifiably revoke that key. :key:
for cases where your key is compromised, could you publish an `identity/revoke` message (from the compromised key) and point to the last "good" `identity` message from that key? would subvert if someone tried to change identity permissions after gaining access to your key. but probably has more edge cases than i can imagine.
(I'm back! Had to focus on life and work for a while)
Identity tangle
@ahdinosaur Yes, this is a good idea! It also crossed my mind that the identity tangle could be a ppppp-set, so it could have identities added and removed, and msgs pruned over time.
But I'm not sure how we're going to use identity tangles in practice. Does the identity tangle root msg become your "feed ID"? Okay, let's run with that idea for a moment.
(:bomb: Part of me is trying to avoid "looking up feed IDs" due to my negative experience with ssb-meta-feeds, so the identity feed isn't immediately exciting to me, but it could have some purpose if designed well. Let me set aside that feeling for now.)
---
title: Alice's identity tangle
---
graph RL
R[desktop pubkey adds<br />desktop pubkey]
A[desktop pubkey adds<br />phone pubkey]
Rid[Hash of the tangle<br />root is Alice's ID]:::weak
A--->R
Rid-.->R
classDef default fill:#bbb,stroke:#fff0,color:#000
classDef weak fill:#fff0,stroke:#fff0,color:#000
Important note at this point: these msgs would have pubkeys in the `who` field. Like usual.
Now, suppose the identity root msg hash is `MTYQM89hvHuiVKaw8Ze7kc`. This is "Alice's ID". How does this become useful?
Maybe Alice could start `post` feeds or things like that, where `who` is `MTYQM89hvHuiVKaw8Ze7kc` and `type` is `post`. This is a special case, though, because formerly the `who` was always a pubkey, and now it's a msg hash.
(:bomb: I don't like the heterogeneous type for `who`, but let's handwave that out of the way for now.)
When Alice publishes a `post` msg on her post feed, it'll be signed and stuff, but peers who get that msg won't be able to immediately validate it. They will have to first fetch Alice's identity tangle `MTYQM89hvHuiVKaw8Ze7kc`, collect all the pubkeys there, and then try each pubkey on the `sig`. If it's validated, then great.
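As a rough illustration of the "try each pubkey on the sig" step, here is a sketch using Node's built-in `crypto`. It assumes (hypothetically) that each identity msg carries the added key as `data.add` and that the signature covers the JSON-serialized `metadata`; the real feed format may sign something different.

```javascript
const crypto = require('node:crypto');

// Collect every pubkey that the identity tangle has added.
// Assumed msg shape: { data: { add: <KeyObject> } }.
function collectPubkeys(identityTangle) {
  return identityTangle.map((msg) => msg.data.add);
}

// Try each known pubkey against the signature; the msg is valid if any one matches.
function validateAgainstIdentity(msg, identityTangle) {
  const payload = Buffer.from(JSON.stringify(msg.metadata));
  return collectPubkeys(identityTangle).some((pubkey) =>
    crypto.verify(null, payload, pubkey, msg.sig)
  );
}

// Alice's two devices, both listed in her identity tangle.
const desktop = crypto.generateKeyPairSync('ed25519');
const phone = crypto.generateKeyPairSync('ed25519');
const identityTangle = [
  { data: { add: desktop.publicKey } },
  { data: { add: phone.publicKey } },
];

// A post msg signed by the phone, where `who` is the identity root msg hash.
const metadata = { who: 'MTYQM89hvHuiVKaw8Ze7kc', type: 'post' };
const postMsg = {
  metadata,
  sig: crypto.sign(null, Buffer.from(JSON.stringify(metadata)), phone.privateKey),
};

// The same metadata signed by a key that is not in the identity tangle.
const mallory = crypto.generateKeyPairSync('ed25519');
const forgedMsg = {
  metadata,
  sig: crypto.sign(null, Buffer.from(JSON.stringify(metadata)), mallory.privateKey),
};

console.log(validateAgainstIdentity(postMsg, identityTangle));   // signed by a listed device
console.log(validateAgainstIdentity(forgedMsg, identityTangle)); // signed by an unknown key
```

The cost is one signature verification per pubkey in the worst case, which is where the performance concern later in this comment comes from.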
Let's test this design against the problems mentioned in the original comment in this issue.
Well, we don't have tangle auth as originally described, because the `post` feed doesn't need to have special "write access" messages. It's all externalized in the identity tangle.
In that sense, the identity tangle can be used for purposes other than defining a group of devices. It can define a group of people AND their devices. So we can get tangle "write access" verification after all, if the pubkeys are externalized in some identity tangle.
:bomb: The downside is that if an identity tangle has a lot of pubkeys, say 100 of them, then you have to test each of those 100 when validating `msg.sig`.
This seems okay! It's just a matter of replicating the identity tangle, discovering all the pubkeys, and then you allow all these pubkeys to be used in SHS whenever you're talking with "MTYQM89hvHuiVKaw8Ze7kc".
:bomb: We gotta be careful with chicken-and-egg situations though, like if you are trying to connect with `MTYQM89hvHuiVKaw8Ze7kc` for the first time ever, how do you replicate that identity tangle if it's only available by connecting to `MTYQM89hvHuiVKaw8Ze7kc`?
Doesn't seem like an issue here because the write access info is not in the interesting feeds, it's externalized in the identity tangle.
Doesn't seem like a problem either, because any pubkey in the identity tangle can create a new feed.
Signature validation performance seems to be a problem in this design. We might have to find ways of making it faster, like perhaps having both `msg.metadata.tangleIdentityRootMsgHash` (ignore the name choice) and `msg.metadata.pubkey`, so you could just use `msg.metadata.pubkey` (formerly `who`) to validate the `msg.sig` and then use `msg.metadata.tangleIdentityRootMsgHash` to check that the `pubkey` belongs to the identity tangle.
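A sketch of that faster path: one signature check with the embedded pubkey, then a set lookup against the identity tangle. The `knownIdentities` map (identity root hash to a set of member pubkeys), the base64 DER key encoding, and signing over JSON-serialized metadata are all assumptions made for the example.

```javascript
const crypto = require('node:crypto');

// Validate with a single signature check, then confirm group membership.
// knownIdentities: Map<identityRootHash, Set<base64 pubkey>> built from replicated identity tangles.
function validateFast(msg, knownIdentities) {
  const { pubkey, tangleIdentityRootMsgHash } = msg.metadata;
  const key = crypto.createPublicKey({
    key: Buffer.from(pubkey, 'base64'),
    format: 'der',
    type: 'spki',
  });
  const payload = Buffer.from(JSON.stringify(msg.metadata));
  if (!crypto.verify(null, payload, key, msg.sig)) return false;
  const members = knownIdentities.get(tangleIdentityRootMsgHash);
  return members !== undefined && members.has(pubkey);
}

const device = crypto.generateKeyPairSync('ed25519');
const pubkey = device.publicKey.export({ type: 'spki', format: 'der' }).toString('base64');
const knownIdentities = new Map([['MTYQM89hvHuiVKaw8Ze7kc', new Set([pubkey])]]);

const metadata = { pubkey, tangleIdentityRootMsgHash: 'MTYQM89hvHuiVKaw8Ze7kc', type: 'post' };
const msg = {
  metadata,
  sig: crypto.sign(null, Buffer.from(JSON.stringify(metadata)), device.privateKey),
};

console.log(validateFast(msg, knownIdentities)); // pubkey is a member of the identity
console.log(validateFast(msg, new Map()));       // identity tangle not known locally
```

This turns the 100-pubkey worst case into one verification plus a lookup, at the price of carrying two identifiers in every msg.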
@staltz, did you look at my example messages in https://github.com/ssbc/ssb2-discussion-forum/issues/24#issuecomment-1545730562? i'm curious what you think, because you proposed a slightly different approach and i'm wondering if that was intentional. i feel each message should be signed with a specific device pubkey (i rename `who` to `device`), rather than the `who` be the generalized identity tangle hash. the other important thing is for each non-identity message to reference the identity tangle, so they point to the latest "identity state" message.
@ahdinosaur i did see it, just didn't have time to comment everything. The example was useful!
I see what you were going for, but the `msg.metadata.tangles` doesn't work like that. Semantically, having that tangle field with the identity tangle in a `post` msg means that the `post` msg belongs to the identity tangle, which is not what we want. But indeed, you're right that it would have to somehow reference the identity tangle root and the known tips so as to inform what is the state of the identity tangle when the `post` msg was created.
We can come up with a new syntax for that. I feel like we are stretching the meanings of these fields, which means we are probably going to need a slightly different feed format design. For now I think it's better to think about all these things in the abstract and then look for a feed format design that supports it.
I don't have much time currently, so I won't be able to go into details or participate actively in the discussion. I have already laid out the main ideas of what I'm working on in previous threads, and I will try to summarize them here. If it fits your needs, we can see how to join our efforts.
The plan is to have a separate app for managing identities and to use SSB as an opportunistic synchronization framework. Think of it a bit like a password manager app, but one that acts as a decentralized authority for your own identities. The main goals are: a user can have several disconnected identities, it is hard to link two identities, and it has offline capabilities.
For instance, I would like it to offer an alternative that is as user-friendly as Google, GitHub, etc. single sign-on, but that prevents linking a user's activity across websites and has the level of safety of multi-factor authentication.
For the general approach, the user has a master password that they choose themselves, but it must pass a minimum level of entropy (I use zxcvbn from Dropbox with some tweaks to evaluate the password strength).
This password is never stored anywhere. It is used to generate a 512-bit "master seed" with argon2d using "sensitive" settings (256 MB of memory and 3 rounds) to make it hard to brute-force (it takes about 60 seconds to generate the seed on a powerful computer). That seed is a kind of TOTP that changes every year.
From that master seed, I derive identities (using a context string, the year of creation, and random padding generated from the master seed). An identity is a stable identifier over time; it is a byte string. (I use encryption here because I need to be able to get the context string back from the identifier.) The encryption key is derived from the master seed.
The identity byte string is used to generate an identity seed (again with argon).
So the DB that is shared by all my devices is only the list of {identityByteString, {field, counter}} entries, encrypted with a key derived from the master seed. The purpose is that even if it is decrypted, it does not reveal much.
Then that seed is used for generating various data and passwords.
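To make the derivation chain above concrete, here is a simplified sketch. It is not the tool described in this comment: Node has no long-standing built-in argon2, so `scrypt` is swapped in for the password stretching and HKDF for the per-identity step, and the encrypted identity byte string and random padding are left out. The function names, salt format, and sample password are hypothetical.

```javascript
const crypto = require('node:crypto');

// Stretch the master password into a 512-bit master seed.
// scrypt stands in for argon2d; the year is mixed into the salt so the seed changes yearly.
function masterSeed(masterPassword, year) {
  return crypto.scryptSync(masterPassword, `master-seed-${year}`, 64);
}

// Derive a per-identity seed from the master seed, a context string, and the year of creation.
// HKDF stands in for the second argon pass.
function identitySeed(seed, context, year) {
  const derived = crypto.hkdfSync('sha512', seed, Buffer.alloc(0), `${context}:${year}`, 64);
  return Buffer.from(derived);
}

const seed = masterSeed('correct horse battery staple', 2023);
const serverA = identitySeed(seed, 'server-a.example', 2023);
const serverB = identitySeed(seed, 'server-b.example', 2023);

console.log(seed.length);             // 64 bytes = 512 bits
console.log(serverA.equals(serverB)); // different contexts give unrelated seeds
```

Because every step is deterministic, remembering the master password, the context string, and roughly the year is enough to regenerate a seed without any stored database, which matches the recovery property described in this comment.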
For instance, I personally use it as a password manager so far. It is pure command line and lacks a lot of features (sync is not using SSB yet).
For servers that I manage, I have the following pattern:
For all these, I mainly use argon and blake.
My tool tells me to rotate account passwords and keys when I use it (every month).
One interesting aspect is that even without the DB, I can recover a password in a few trials if I remember my master password, the context string (the server name in my DB), and roughly the last year and month I refreshed my passwords/keys.
So I install the tool on most of my servers without any DB. (The DB is actually only on my personal computer and office computer.)
I have something similar for websites, but currently it is not such a good solution: most of the time they require an email address for verification, so the link between identities is there, and the effort to create a new address each time is large. For that, I'm thinking about something using a P2P collaboration scheme to anonymize email addresses (think Tor, but bridging SMTP to SSB private messages). That would give services what they legitimately need (multi-factor authentication, a privileged push communication channel to their users) but prevent identity linking with minimal impact on their current processes. Another option is simply that services run an SMTP/SSB bridge in their infrastructure, but that requires their collaboration and a wish to protect their users from identity-linking practices.
The main point is the master password. If someone gets access to it... there is no magic. I'm thinking of a scheme using a one-time revocation key, kept on paper in a safe, that would let someone who knows both the master password and that key inform all servers/services via SSB that the identities were stolen and must be revoked (and then proceed to recreate them and recover accounts). I'm currently digging through papers to find a commitment scheme that could permit that.
@ahdinosaur I started sketching "v2" of the feed format in light of the ideas we had here. (And as a good moment to rename some fields)
Seems like we could make a "group" a primitive in the protocol. An "account" (or "identity") is just a group of devices (or a group of public keys). Similarly, a "community" / "club" / "team" is just a group of public keys. This should also pave the way for private groups, since (in theory) this is just a matter of a group's tangle msgs being encrypted. Groups could be a built-in PPPPP feature!
So to make it a bit more clear:
With this, the main realization I had is that we could replace the `msg.metadata.who` with `msg.metadata.group`. Because e.g. to make it possible that any of your devices can start a feed, we need the feed root to be predictable. Other peers should need to know only the Group ID, not the pubkey which started the feed.
The rough structure would be:
const msg = {
  data,
  metadata,
  pubkey, // former `who`
  sig
}
Notice that the `pubkey` is outside of the metadata! Instead, we have the `msg.metadata.group` as the identifier. This has some very interesting implications. It means two of your devices can author (accidentally or not) the same msg, and it'll have the same Msg ID! For instance, two of your devices could have simultaneously started the `post` feed, but this should still yield the same `msg.data` (which is empty) and `msg.metadata`, hence the same Msg ID! When replicating, it doesn't matter which pubkey I got the message from. Any one of those pubkeys is equally authoritative!
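A small sketch of why this works, assuming the Msg ID is a hash over `msg.metadata` only. The hash function (sha256 here, purely for illustration), the serialization, and the placeholder `sig` values are assumptions, not the actual feed format.

```javascript
const crypto = require('node:crypto');

// Hypothetical Msg ID: a hash over the metadata only, so pubkey and sig do not affect it.
function msgId(msg) {
  return crypto.createHash('sha256').update(JSON.stringify(msg.metadata)).digest('hex');
}

// Build a feed root for a given group and type, authored by some device pubkey.
function feedRoot(group, type, pubkey) {
  const metadata = { dataHash: null, dataSize: 0, group, tangles: {}, type };
  return { data: null, metadata, pubkey, sig: `placeholder-sig-by-${pubkey}` };
}

const fromDesktop = feedRoot('MTYQM89hvHuiVKaw8Ze7kc', 'post', 'DESKTOP_PUBKEY');
const fromPhone = feedRoot('MTYQM89hvHuiVKaw8Ze7kc', 'post', 'PHONE_PUBKEY');
const profileRoot = feedRoot('MTYQM89hvHuiVKaw8Ze7kc', 'profile', 'DESKTOP_PUBKEY');

console.log(msgId(fromDesktop) === msgId(fromPhone));   // same feed root from either device
console.log(msgId(fromDesktop) === msgId(profileRoot)); // a different type is a different feed
```

A remote peer that knows only the Group ID and the type can therefore compute the feed root's ID without knowing which device started the feed.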
:bomb: However. My main dilemma right now is: Does a msg include just the group ID, or does it include references to the group's current "state"?
thanks for all this @staltz, looks great. :relaxed:
If including the group's current state:
- CON: remote peers cannot derive the tangle ID for Alice's posts
does the feed root need to reference the identity state? seems that could be a special case.
i will admit, the "deterministically predictable feed root" and "`pubkey` outside the metadata" feels wrong to me, but also i'm susceptible to wanting things to be done the "correct" way and terrible at cutting corners. i'm happy to try to accept these as-is and see how things go.
so in pursuit of the "correct" way i'm partial to including the group's current state in every message (except a feed root message). i do agree that identity tangles make sense to be replicated first, before replicating content tangles.
but of course my mind wonders what would happen if we flipped the script, what if the identity tangle "announced" the creation of feed tangles, where those msg IDs became the tangle IDs. no more need for deterministically predictable feed roots. the CON is the identity tangle is larger, also i'm not sure if this is the same as the meta-feeds approach that rubbed you the wrong way.
I see what you were going for, but the `msg.metadata.tangles` doesn't work like that. Semantically, having that tangle field with the identity tangle in a `post` msg means that the `post` msg belongs to the identity tangle, which is not what we want. But indeed, you're right that it would have to somehow reference the identity tangle root and the known tips so as to inform what is the state of the identity tangle when the `post` msg was created.
i had a wonder about our tangles object and was wondering if they should be more semantic.
for example:
{
  "data": {
    "text": "yo i heard you like tangles!"
  },
  "metadata": {
    "data_hash": "K8tzhL8Sewr3mVr1dpaYa2",
    "data_size": 206,
    "data_type": "post",
    "tangles": {
      "identity": {
        "root": "QwrP7DAMHhHe71Qf87tXBf",
        "depth": 2,
        "prev": [
          "72MK9ETRNGKm7Jh8ryJswM"
        ]
      },
      "feed": {
        "root": "SRGfAAnxTzjN6mEDJ542hf",
        "depth": 2,
        "prev": [
          "NWmZDGa64kiY2cDaM3u8c2"
        ]
      },
      "thread": {
        "root": "RG3uXAiKFiwGjYQs6s4Adr",
        "depth": 2,
        "prev": [
          "otYDrKTZZ1ZgDVgTBeBZ6v",
          "5jdPWyyniKoeukdVoZVxUA"
        ]
      }
    },
    "pubkey": "8kBhDXpZajdBRFLq8zophqCzbFsFzvuwBGoWj7TU9Loe",
    "v": 1337
  },
  "sig": "5abJdD6RRCsWXKJLaEKRhUb1HKh4aKPFteFRgUBfyJD4cFzo5MVaMdWbwM2CfpNRFSjR9NkczRL2LcSyQVThYnRr"
}
then we could treat a reference to the `identity` tangle differently.
i had a chat with @mixmix and he mentioned semantics being important for how he uses tangles. i notice other specs (like meta feeds and group exclusion) using tangles have semantic references.
that being said, you could certainly "derive" the "meaning" of the tangle with enough traversing, but is there a benefit to not including the "meaning" in the message?
i will admit, the "deterministically predictable feed root" and "`pubkey` outside the metadata" feels wrong to me
It is possible that pubkey outside metadata would lead to new problems, so it's important to be careful with this new design.
but of course my mind wonders what would happen if we flipped the script, what if the identity tangle "announced" the creation of feed tangles, where those msg IDs became the tangle IDs.
Yeah, I wouldn't rule out that tactic. It might not be as bad as ssb-meta-feeds. I'll consider it.
i had a chat with @mixmix and he mentioned semantics being important for how he uses tangles.
Yes Mix had raised the same concern with me when I showed him this tangles design. But PPPPP tangles are doing something different to SSB tangles, and as soon as you introduce semantic names, it makes it possible to have two different roots for the same name, and this violates one of the tangle constraints, which is: only one root per tangle. This constraint is actually very important for backlink validation and sliced replication. The semantic names also make it impossible for a msg to be part of two different "threads" or two different "feeds" because there is just one name.
In short, PPPPP tangles are a way of grouping messages for the purpose of replication of a shared data structure (well, might as well just call this a CRDT because it's a replicated data type, RDT). In SSB, tangles are not replicable, they are just ways of logically grouping msgs so as to have consensus on their causal order. This is why PPPPP tangles need "machine-friendly" identifiers rather than human-friendly identifiers. Similar to how in SSB you replicate `@FCX/tsDLpubCPKKfIrw4gc+SQkHcaD17s7GI6i/ziWY=.ed25519` (machine-friendly) not `"alice"` (human-friendly), and the system would have severe bugs if you changed replication to be centered around human-friendly names.
does the feed root need to reference the identity state? seems that could be a special case.
Actually, this was a great suggestion, and it unblocked me with this design. So here I present what seems like it will work quite well for multi-device use cases (and ... multi-person feeds!):
// Identity tangle, root msg: the desktop adds itself
const identityMsg0 = {
  data: {
    add: DESKTOP_PUBKEY,
  },
  metadata: {
    dataHash: '1800a9st',
    dataSize: 32,
    group: null,
    groupTips: null,
    tangles: {},
    type: 'identity',
    v: 2,
  },
  pubkey: DESKTOP_PUBKEY,
  sig,
}
// Identity tangle, second msg: the desktop adds the phone
const identityMsg1 = {
  data: {
    add: PHONE_PUBKEY,
  },
  metadata: {
    dataHash: 'Dhc810cI1',
    dataSize: 32,
    group: null,
    groupTips: null,
    tangles: {
      [IDENTITY_MSG0_HASH]: {
        depth: 1,
        prev: [IDENTITY_MSG0_HASH],
      }
    },
    type: 'identity',
    v: 2,
  },
  pubkey: DESKTOP_PUBKEY,
  sig,
}
// Post feed, root msg: no data and no groupTips
const postMsg0 = {
  data: null,
  metadata: {
    dataHash: null,
    dataSize: 0,
    group: IDENTITY_MSG0_HASH,
    groupTips: null,
    tangles: {},
    type: 'post',
    v: 2,
  },
  pubkey: DESKTOP_PUBKEY,
  sig,
}
// Post feed, first real post: refers to the identity tangle's current tips
const postMsg1 = {
  data: {
    text: 'Hello world',
  },
  metadata: {
    dataHash: 'Cfo91ico5',
    dataSize: 10,
    group: IDENTITY_MSG0_HASH,
    groupTips: [IDENTITY_MSG1_HASH],
    tangles: {
      [POST_MSG0_HASH]: {
        depth: 1,
        prev: [POST_MSG0_HASH],
      }
    },
    type: 'post',
    v: 2,
  },
  pubkey: DESKTOP_PUBKEY,
  sig,
}
- `group` is the Group ID, and should be the identity tangle's first msg hash
- `groupTips` points to the latest state of the identity tangle, and in the special case of a feed root, we don't need it (thus it's null). Essentially anyone in the group (even if they will eventually be removed from the group) has authority to start a feed, but this is a rather boring thing because you can't publish any interesting data in the feed root.
- `group` and `groupTips` are always null in identity tangle msgs. I don't know, should these fields be omitted in identity tangle msgs? It isn't pretty.
This new feed design is not the prettiest syntactically (`group` and `groupTips` are basically defining a tangle, but not in the `tangles` field), but prettiness is not that important. The reason why `group` shouldn't be inside `tangles` is that we don't need the `depth` (depth only helps when creating a new msg in the tangle, and in this case we can't and should not add msgs to the identity tangle while we are publishing on a common feed), and we are not declaring this msg to belong to the identity tangle. We are only referring to the state of the identity tangle.
Perhaps with more bikeshedding we can make this feed format pretty to the human eyes, but at present this v2 design seems like green light for me to build some prototypes and see how it works in the wild.
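For prototyping, the `group` / `groupTips` check could be sketched as follows: walk the identity tangle backwards from `groupTips`, collect the pubkeys added so far, and see whether the author is among them. The simplified msg shape (`{ add, prev }`), the short IDs, and the function names are hypothetical, and `remove` msgs are ignored here.

```javascript
// Identity (group) tangle as a map from msg hash to a simplified msg.
const identityTangle = new Map([
  ['ID0', { add: 'DESKTOP_PUBKEY', prev: [] }],
  ['ID1', { add: 'PHONE_PUBKEY', prev: ['ID0'] }],
  ['ID2', { add: 'TABLET_PUBKEY', prev: ['ID1'] }],
]);

// Collect the pubkeys added by the given tips and all of their ancestors.
function pubkeysAt(tangle, tips) {
  const seen = new Set();
  const pubkeys = new Set();
  const stack = [...tips];
  while (stack.length > 0) {
    const id = stack.pop();
    if (seen.has(id)) continue;
    seen.add(id);
    const msg = tangle.get(id);
    if (!msg) continue; // not replicated yet, nothing to learn from it
    pubkeys.add(msg.add);
    stack.push(...msg.prev);
  }
  return pubkeys;
}

// A feed root has groupTips === null, so any pubkey ever added may author it.
function isAuthorized(tangle, msg) {
  const { groupTips } = msg.metadata;
  const tips = groupTips === null ? [...tangle.keys()] : groupTips;
  return pubkeysAt(tangle, tips).has(msg.pubkey);
}

const byPhone = { metadata: { group: 'ID0', groupTips: ['ID1'] }, pubkey: 'PHONE_PUBKEY' };
const byTablet = { metadata: { group: 'ID0', groupTips: ['ID1'] }, pubkey: 'TABLET_PUBKEY' };
const rootByTablet = { metadata: { group: 'ID0', groupTips: null }, pubkey: 'TABLET_PUBKEY' };

console.log(isAuthorized(identityTangle, byPhone));      // phone was added at ID1
console.log(isAuthorized(identityTangle, byTablet));     // tablet is only added at ID2
console.log(isAuthorized(identityTangle, rootByTablet)); // feed roots skip the groupTips check
```

Signature verification would still happen separately; this only answers whether the signing pubkey was part of the group at the referenced state.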
Here's how v2 fares with the originally mentioned problems:
- `group` fields and the identity tangle for that group
- `remove` msgs in the identity tangle. There are more details here regarding conflict resolution, but in theory this area is green-lit
- `type`. Feed roots are still employed

:tada:
Small problem came up during implementation:
As per the current design, a pubkey can only start a group once. If they start another group, then it's going to end up with the same group ID. So we need to add some kind of nonce to the group tangle's (I'm renaming it from identity tangle to group tangle) root msg.
UPDATE: oh this is easy, just add `msg.data.nonce`. That should cause the `msg.metadata.dataHash` to always be unique when starting a group.
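A quick sketch of the nonce idea, assuming `dataHash` is a hash over the serialized `msg.data`. The hash function (sha256), the nonce length, and the helper names are assumptions for illustration only.

```javascript
const crypto = require('node:crypto');

// Hypothetical dataHash: a hash over the JSON-serialized msg.data.
function dataHash(data) {
  return crypto.createHash('sha256').update(JSON.stringify(data)).digest('hex');
}

// Group root data without a nonce: identical every time the same pubkey starts a group.
function groupRootDataWithoutNonce(pubkey) {
  return { add: pubkey };
}

// Group root data with a random nonce: unique per group, even for the same pubkey.
function groupRootData(pubkey) {
  return { add: pubkey, nonce: crypto.randomBytes(32).toString('base64') };
}

const plainA = groupRootDataWithoutNonce('DESKTOP_PUBKEY');
const plainB = groupRootDataWithoutNonce('DESKTOP_PUBKEY');
const noncedA = groupRootData('DESKTOP_PUBKEY');
const noncedB = groupRootData('DESKTOP_PUBKEY');

console.log(dataHash(plainA) === dataHash(plainB));   // the collision described above
console.log(dataHash(noncedA) === dataHash(noncedB)); // the nonce keeps the two groups apart
```

Since the group ID is derived from the root msg, a different `dataHash` is enough to give each new group its own ID.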
How would encryption work for "groups"?
Say you follow someone known by the group ID `XKKmEBmqKGa5twQ2HNSk7t`, how do you encrypt a private message so only that person (i.e. that group of devices) can decrypt it?
:bulb: Thought: rename `msg.metadata.tangles[tangleId].prev` to `.....tips`. This would align the nomenclature with `groupTips` (i considered `groupPrev` as a name) but I think overall "tips" is easier to understand than "prev" from an implementation perspective. You're supposed to just put the "tips" (the extremities) of the DAG into this field, you're not supposed to think about what came "previously" (since technically the whole DAG came previously).
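To illustrate what "the extremities of the DAG" means in practice, here is a small sketch that computes the tips of a tangle: the msgs that no other msg points to. The map-based tangle representation and the short IDs are made up for the example.

```javascript
// Tips are the msgs that no other msg in the tangle references.
function findTips(msgs) {
  const referenced = new Set();
  for (const msg of msgs.values()) {
    for (const id of msg.prev) referenced.add(id);
  }
  return [...msgs.keys()].filter((id) => !referenced.has(id));
}

// A thread where 'a' and 'b' both reply to the root, and 'c' replies to 'a'.
const thread = new Map([
  ['root', { prev: [] }],
  ['a', { prev: ['root'] }],
  ['b', { prev: ['root'] }],
  ['c', { prev: ['a'] }],
]);

console.log(findTips(thread)); // [ 'b', 'c' ]
```

A new msg appended to this tangle would list `b` and `c` in its tips field, which merges the two branches.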
Open question
How would encryption work for "groups"?
Say you follow someone known by the group ID `XKKmEBmqKGa5twQ2HNSk7t`, how do you encrypt a private message so only that person (i.e. that group of devices) can decrypt it?
in my mind, the identity tangle (`XKKmEBmqKGa5twQ2HNSk7t`) would include a message that advertises a public key i can use for encrypting private messages. then the same "if a message is encrypted, try to decrypt with your available keys" as SSB.
i'm no expert but i reckon the SSB group specs have explored where we'd need to go:
in some of those specs i think private keys are sent to those added to the group using good ol' secret box encryption. so we'd need to either find a new way to distribute or derive shared keys amongst the group members, or we use our own form of secret box for this specific purpose.
:bulb: Thought: rename `msg.metadata.tangles[tangleId].prev` to `.....tips`. This would align the nomenclature with `groupTips` (i considered `groupPrev` as a name) but I think overall "tips" is easier to understand than "prev" from an implementation perspective. You're supposed to just put the "tips" (the extremities) of the DAG into this field, you're not supposed to think about what came "previously" (since technically the whole DAG came previously).
i was going to suggest `groupTips` be renamed to `groupPrev` so the vocab aligned, so yeah i support `prev` being renamed to `tips` :+1:
in my mind, the identity tangle (`XKKmEBmqKGa5twQ2HNSk7t`) would include a message that advertises a public key i can use for encrypting private messages. then the same "if a message is encrypted, try to decrypt with your available keys" as SSB.
Yeah, you're right, this isn't after all that hard. A device in the group tangle can announce a public key exclusively used for private messaging (not used for signing, etc) and then this device will share the keypair with other devices in the group. Yes, this has the vuln that whichever device leaks the private messaging keypair will compromise all the private messages, but this is actually the same threat model as SSB private groups (where there is a symmetric key shared to all members). I think it could work.
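One way this could look, sketched with Node's built-in `crypto`: the group announces an X25519 public key, senders encrypt to it with an ephemeral keypair, and any device holding the shared private key can decrypt. This is an illustrative construction (ephemeral Diffie-Hellman, sha256 as a stand-in key derivation, AES-256-GCM), not SSB's box format or anything specified for PPPPP, and all names are hypothetical.

```javascript
const crypto = require('node:crypto');

// Turn the Diffie-Hellman shared secret into a 256-bit symmetric key (stand-in KDF).
function deriveKey(sharedSecret) {
  return crypto.createHash('sha256').update(sharedSecret).digest();
}

// Encrypt to the group's announced private-messaging public key.
function encryptFor(recipientPublicKey, plaintext) {
  const ephemeral = crypto.generateKeyPairSync('x25519');
  const shared = crypto.diffieHellman({
    privateKey: ephemeral.privateKey,
    publicKey: recipientPublicKey,
  });
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', deriveKey(shared), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return { ephemeralPublicKey: ephemeral.publicKey, iv, ciphertext, tag: cipher.getAuthTag() };
}

// Any device that holds the shared private-messaging key can decrypt.
function decryptWith(recipientPrivateKey, box) {
  const shared = crypto.diffieHellman({
    privateKey: recipientPrivateKey,
    publicKey: box.ephemeralPublicKey,
  });
  const decipher = crypto.createDecipheriv('aes-256-gcm', deriveKey(shared), box.iv);
  decipher.setAuthTag(box.tag);
  return Buffer.concat([decipher.update(box.ciphertext), decipher.final()]).toString('utf8');
}

// The keypair announced in the group tangle and shared among the group's devices.
const groupDm = crypto.generateKeyPairSync('x25519');
const outsider = crypto.generateKeyPairSync('x25519');

const box = encryptFor(groupDm.publicKey, 'hello group');
console.log(decryptWith(groupDm.privateKey, box)); // readable by any device holding the key
```

Decrypting with `outsider.privateKey` throws at `decipher.final()` because the authentication tag does not match, which is the "try to decrypt with your available keys" behavior: a failure just means the msg was not for you.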
The only (minor) problem is that the group tangle will now publish two different kinds of information: a "Set" of pubkeys recognized for signing messages, and another "Set" of pubkeys recognized for private messaging. Having two in one tangle makes it a bit harder to perform PREDSL pruning, but maybe there is a way out of this. And maybe the group tangle won't need a lot of pruning after all (think, 2 devices is the most common case).
Having two in one tangle makes it a bit harder to perform PREDSL pruning, but maybe there is a way out of this.
okay, then what if these were two separate tangles? a tangle for the "Set" of pubkeys recognized for signing messages, and another tangle for the "Set" of pubkeys recognized for private messaging.
no reason comes to mind of why they can't be separate, the identity (pubkeys for signing messages) tangle is special because it affects permissions (the capability to write a message as the group), a slide-into-my-DMs (pubkeys for private messaging) tangle seems not special. the same edge cases that would apply (e.g. you were removed from a group but continue to publish new messages that point to the groupTips when you were still in the group) are true of any other tangle.
:woman_shrugging:
@staltz It's great you folks are finally tackling the multidevice issue... once ready, it opens up for a new era of app experimentation. it's so worth it.
I do have a security issue that arises with successful multidevice. it's probably solved somewhere, but critical. but first, some conceptual tools to map the problem. we have:
For now, identity and device are one and the same. multidevice goal is to decouple them. Correct?
Security issue: each device has of course identity attached to it. the more devices, the more vulnerable for attacks (another self claiming your identity, with no authority to clear the issue). Since we're serverless, we can't rely on passcodes, passwords, verifications to prove who is whom. Any attempt just moves problem up, not solving it.
Suggestions:
user scenario: successful lawyer is active user in many communities... they bless devices with their identity with abandon. they have many old androids lying around in drawers on their many properties... to ask them to do an inventory of all their devices is... almost impossible.
user scenario, moar: lawyer is admin in many groups. having his identity stolen spells disaster to them and many others. how to help them?
okay, then what if these were two separate tangles? a tangle for the "Set" of pubkeys recognized for signing messages, and another tangle for the "Set" of pubkeys recognized for private messaging.
@ahdinosaur You're right, it seems that we can just publish these separately. Can still bikeshed the naming of it, but it would most likely just be a "subfeed".
groupTips versus prev
I just realized that semantically we shouldn't rename these. prev (in tangles) contains tips plus lipmaa references, so we can't name it tips because lipmaa references aren't tips. We could rename groupTips => groupPrev, but the semantics here isn't right, because groupTips should not contain lipmaa references.
I guess they are different after all. :shrug:
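To make the tips vs prev distinction concrete, here is a minimal sketch. It assumes a hypothetical `DagMsg` shape (just a hash plus its `prev` list) and treats "tips" as the msgs that nothing else points to; none of these names come from the actual implementation.

```typescript
// Hypothetical minimal msg shape, for illustration only.
type MsgHash = string;
interface DagMsg {
  hash: MsgHash;
  prev: MsgHash[]; // may hold tips AND lipmaa (skip) references
}

// Tips are the msgs that no other msg in the tangle references.
function computeTips(msgs: DagMsg[]): MsgHash[] {
  const referenced = new Set<MsgHash>();
  for (const msg of msgs) {
    for (const h of msg.prev) referenced.add(h);
  }
  return msgs
    .map((m) => m.hash)
    .filter((h) => !referenced.has(h))
    .sort();
}

const tangle: DagMsg[] = [
  { hash: "A", prev: [] },
  { hash: "B", prev: ["A"] },
  { hash: "C", prev: ["A"] },
  // "A" here plays the role of a lipmaa-style back reference: it is in prev,
  // but it was not a tip when D was written.
  { hash: "D", prev: ["B", "A"] },
];

const tips = computeTips(tangle);
```

In this toy tangle the tips are C and D, while D's own `prev` contains A, which was never a tip at that point. That is the reason a field that must hold only tips can't share semantics with `prev`.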
For now, identity and device are one and the same. multidevice goal is to decouple them. Correct?
@nonlinear Correct :)
Security issue: each device has of course identity attached to it. the more devices, the more vulnerable for attacks (another self claiming your identity, with no authority to clear the issue).
We started this thread stating this security issue, and now we solved it. Devices will not share keys with each other, so if one device is compromised, only that device's keys are compromised. Your "self" is a group of devices, so the other devices could "downvote" or "remove" (whatever mechanism we come up with) the compromised device, meaning that the compromised device would be kicked out of your "self".
Alright, here's the code!
This is looking really good. It seems that using the tips it should be possible to reason about having a group of devices (self) being part of a multi-person group.
@arj03 thanks!
You mean making a nested group? (A group that contains another group)
I guess one way to implement that is to add a group ID in msg.data.add in the identity tangle. The data part could tell what is the type of the thing being added. msg.data.type = 'ed25519' for the normal case and msg.data.type = 'group' for nested groups. But we'd also have to encode the groupTips here...
@staltz selfie thanks
Maybe you can encode the removal policy in the identity init. That way an identity for me could have the master key removal policy, while a group of our two multi-device identities could have the one we came up with for groups.
@arj03 Yes, something along those lines. For me this use case is not super important to design right now, but it's good to have an idea of how it could be designed.
I also think there could be removal policy in the identity root.
Another shower thought is: while previously we discussed with @ahdinosaur about having one "identity tangle" and another "private messaging keys tangle", I think we could revert to having a single tangle that defines everything about this "identity"/person/entity and then have custom prune algorithms for that (if needed at all! I think in the short term we might not even need pruning for this tangle, since it's going to be small and not often changing).
So the identity tangle msgs would have msg.data as either:
{add: $PUBKEY, type: 'sign-ed25519', nonce?: string}
{add: $GROUP, type: 'group'}
{add: $PUBKEY, type: 'encrypt-ed25519'}
{del: $PUBKEY, type: 'sign-ed25519'}
{del: $GROUP, type: 'group'}
{del: $PUBKEY, type: 'encrypt-ed25519'}
This should also neatly allow for more key schemes in the future. Also, may consider if the network identity is the same as the signing pubkey or whether we should have separate network identity keys. I would default to reusing a keypair for both signing and network identity purposes, just need to come up with a convention for type that makes that explicit, perhaps {add: $PUBKEY, type: 'sign-and-shs-ed25519', nonce?: string}.
PS: another thing in my mind is trying to settle on a name, either "identity" or "group", but having both names is going to be confusing. I think I prefer "identity" because it's less likely to be ambiguous with other concepts. And the "identity tangle" makes more sense than a "group tangle".
PS2: one thing I realized that this current design allows is public tangles. This is basically a feed tangle where the feed root has group = null. This creates a root where the only non-null value is msg.metadata.type. One application is to produce a "pubs registry" tangle where anyone can publish to, and anyone could know this tangle's ID and replicate it.
(I hope this isn't adding noise.) Thanks for working on this format BTW. It overlaps a little bit with work I'm doing.
either "identity" or "group", but having both names is going to be confusing.
This is something that has confused me honestly. My naïve understanding is that an identity is a single person (with multiple devices), and a group is a collection of people. So they are neatly supersets: group is a collection of identities, and identity is a collection of devices.
another thing in my mind is trying to settle on a name, either "identity" or "group", but having both names is going to be confusing
i'm :+1: on "identity", but another option is "agent", which is what Value Flows uses to describe an individual or a group.
So the identity tangle msgs would have msg.data as either:
i love what you're thinking, and i'm sorry but i have to give some bikesheddy feedback: can we design msg.data types such that unions (when there are multiple different types of valid msg.data contents for a single msg.metadata.type) are tagged unions? is much easier to reason about and implement in TypeScript, Rust, etc. :purple_heart:
(how to parse enums using Rust's serde)
so for example:
{action: 'add', add: { key: $PUBKEY, type: 'sign-ed25519', nonce?: string} }
{action: 'add', add: { key: $GROUP, type: 'group'} }
{action: 'add', add: { key: $PUBKEY, type: 'encrypt-ed25519'} }
{action: 'del', del: { key: $PUBKEY, type: 'sign-ed25519'} }
{action: 'del', del: { key: $GROUP, type: 'group'} }
{action: 'del', del: { key: $PUBKEY, type: 'encrypt-ed25519'} }
in this case there are 2 layers of union: action types -> key types.
anyways, rant over, cheers :blush:
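For what it's worth, a rough sketch of why the tagged form is pleasant on the TypeScript side. It uses the shapes from the examples above; `KeyRef` and `applyIdentityData` are made-up names, and it assumes the msgs are replayed in some already-decided linear order, which sidesteps the concurrent add/del cases a real tangle would have to handle.

```typescript
// Shapes follow the tagged-union examples above; the names are hypothetical.
type KeyType = "sign-ed25519" | "encrypt-ed25519" | "group";
type KeyRef = { key: string; type: KeyType; nonce?: string };

type IdentityData =
  | { action: "add"; add: KeyRef }
  | { action: "del"; del: KeyRef };

// Apply one msg's data to the current key set and return the new set.
// The `action` tag lets the compiler narrow to the right branch.
function applyIdentityData(
  state: Map<string, KeyType>,
  data: IdentityData
): Map<string, KeyType> {
  const next = new Map(state);
  switch (data.action) {
    case "add":
      next.set(data.add.key, data.add.type); // `data.del` does not exist here
      return next;
    case "del":
      next.delete(data.del.key); // `data.add` does not exist here
      return next;
  }
}

const history: IdentityData[] = [
  { action: "add", add: { key: "pk1", type: "sign-ed25519" } },
  { action: "add", add: { key: "pk2", type: "encrypt-ed25519" } },
  { action: "del", del: { key: "pk1", type: "sign-ed25519" } },
];

const keys = history.reduce(applyIdentityData, new Map<string, KeyType>());
```

After replaying `history`, only `pk2` remains. Adding a third action later would make the compiler flag every switch that forgot to handle it.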
How would encryption work for "groups"?
Say you follow someone known by the group ID XKKmEBmqKGa5twQ2HNSk7t, how do you encrypt a private message so only that person (i.e. that group of devices) can decrypt it?
maybe worth looking at how tribes1 po-boxes worked?
@mixmix Suggested for identity tangles: to add a "consent" system similar to https://github.com/ssbc/fusion-identity-spec because you don't want anyone to randomly add your pubkeys to nazi identity tangles. So the new device can sign an attestation that "yes i want to belong to identity tangle known by the ID XYZABC" when the old device wants to publish the msg on the identity tangle.
@ahdinosaur I'm finally at the point where I'm bikeshedding/designing the identity tangle data. What do you think about the following?
msg.data:
{
  "action": "add",
  "add": {
    "key": {
      "purpose": "sig",
      "algorithm": "ed25519",
      "bytes": "3JrJiHEQzRFMzEqWawfBgq2DSZDyihP1NHXshqcL8pB9"
    },
    "nonce": "6GHR1ZFFSB3C5qAGwmSwVH8f7byNo8Cqwn5PcyG3qDvS"
  }
}
{
  "action": "add",
  "add": {
    "key": {
      "purpose": "subidentity",
      "algorithm": "tangle",
      "bytes": "6yqq7iwyJEKdofJ3xpRLEq"
    }
  }
}
{
  "action": "del",
  "del": {
    "key": {
      "purpose": "sig",
      "algorithm": "ed25519",
      "bytes": "3JrJiHEQzRFMzEqWawfBgq2DSZDyihP1NHXshqcL8pB9"
    }
  }
}
interface Msg {
  data: IdentityData
  metadata: {
    dataHash: ContentHash
    dataSize: number
    identity: 'self' // MUST be the string 'self'
    identityTips: null // MUST be null
    tangles: {
      [identityTangleId: string]: {
        depth: number // maximum distance (positive integer) from this msg to the root
        prev: Array<MsgHash> // list of msg hashes of existing msgs, unique set and ordered alphabetically
      }
    }
    domain: string // alphanumeric string, at least 3 chars, max 100 chars
    v: 2
  }
  pubkey: Pubkey
  sig: Signature
}
type IdentityData =
  | { action: 'add'; add: IdentityAdd }
  | { action: 'del'; del: IdentityDel }
type IdentityAdd = {
  key: Key
  nonce?: string // nonce required only on the identity tangle's root
  consent?: string // base58 encoded signature of the string `:identity-add:<ID>` where `<ID>` is the identity's ID, required only on non-root msgs
}
type IdentityDel = {
  key: Key
}
type Key =
  | {
      purpose: 'sig' // digital signatures
      algorithm: 'ed25519' // libsodium crypto_sign_detached
      bytes: string // base58 encoded string for the public key being added
    }
  | {
      purpose: 'subidentity'
      algorithm: 'tangle' // PPPPP tangle
      bytes: string // subidentity ID
    }
  | {
      // WIP!!
      purpose: 'box' // asymmetric encryption
      algorithm: 'x25519-xsalsa20-poly1305' // libsodium crypto_box_easy
      bytes: string // base58 encoded string of the public key
    }
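A small sketch of how two of the constraints written in the metadata comments could be checked. The function names are made up, and "ordered alphabetically" is assumed here to mean plain JavaScript string comparison, which may not be what the real validation ends up using.

```typescript
// `domain`: alphanumeric string, at least 3 chars, max 100 chars.
function isValidDomain(domain: string): boolean {
  return /^[a-zA-Z0-9]{3,100}$/.test(domain);
}

// `prev`: unique set, ordered alphabetically.
// ASSUMPTION: ordering is checked with the built-in `<` on strings.
// Strictly increasing order implies both "sorted" and "no duplicates".
function isValidPrev(prev: string[]): boolean {
  let last: string | null = null;
  for (const hash of prev) {
    if (last !== null && !(last < hash)) return false;
    last = hash;
  }
  return true;
}
```

For example, `isValidPrev(["a", "a"])` is rejected as a duplicate and `isValidPrev(["b", "a"])` as out of order, while an empty `prev` (the root) passes.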
Dang, now I really feel like renaming "identity" to "account". I think it better reflects what it is, and on the UI level we will be talking about "accounts" anyway, not "identities". Further, "identity ID" is just really weird.
maybe relevant for bikeshed https://sunbeam.city/@powersource/110768411610769697
:bomb: :fire: Problem: security
People are raising concerns that sharing your keypair to many devices is not good for security.
:bomb: Problem: unclear sign-in method
It's unclear when should you share the keypair, and when should you use something like broker auth.
When you're "logging in" with a new app, suppose a chess game, you probably don't trust the app developer that much to give your keypair to the app. You should probably use broker auth. But then, what kinds of apps can get the keypair and what kinds of apps should use broker auth? That line is hard to draw, and I bet many apps will end up just asking for your keypair, because it's simpler.
Now instead of spreading your keypair on "Manyverse" on many devices, you are sending it to all sorts of apps (closed or open source). Pretty bad for security.
:sun_behind_small_cloud: All tangles are multiauthor
One important realization I had is that the tangle data structure is by definition multiauthor. If it's single-author (and single-device), then it would end up being a linear sequence, so no need to have complex DAG algorithms. It only becomes a real DAG in the presence of concurrent (and delay-tolerant) authors.
Your "feed" (one kind of tangle) is authored by many devices.
A thread (another kind of tangle) is authored by many persons.
And so forth.
:bulb: Idea: write authorization for tangles
What if we add the restriction that keypairs are never shared? (Similar to #16) Then we solve both problems aforementioned, with an important tweak:
What if each tangle would declare which authors can write to it? Currently, a "feed tangle" can only be authored by the feed's keypair. But we could change this such that any authorized peer could write to your feed tangle! This is just a matter of tweaking the validation code so that it treats msgs from those authorized peers as valid.
In practice, this could be done by publishing a special message on that tangle that declares public keys that now have "write access" to this tangle. This should work for all kinds of tangles!
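Roughly, the validation tweak could look like the sketch below. Everything here is hypothetical: the `AuthorizedMsg` shape, the `authorize` field, and the assumption that msgs arrive already sorted so that an authorization is seen before the msgs that depend on it. Signature checks are left out entirely.

```typescript
type Pubkey = string;

// Hypothetical msg shape: who wrote it, and optionally which pubkey it
// grants write access to.
interface AuthorizedMsg {
  author: Pubkey;
  authorize?: Pubkey;
}

// Walk the tangle in order and decide, per msg, whether its author was
// allowed to write at that point. Only valid msgs can authorize others.
function validateAuthors(rootAuthor: Pubkey, msgs: AuthorizedMsg[]): boolean[] {
  const authorized = new Set<Pubkey>([rootAuthor]);
  return msgs.map((msg) => {
    if (!authorized.has(msg.author)) return false;
    if (msg.authorize !== undefined) authorized.add(msg.authorize);
    return true;
  });
}

const verdicts = validateAuthors("phone", [
  { author: "phone", authorize: "laptop" }, // phone grants laptop write access
  { author: "laptop" }, // valid: laptop was authorized above
  { author: "mallory" }, // invalid: never authorized
  { author: "mallory", authorize: "mallory" }, // invalid: can't self-authorize
]);
```

The last case is the important one: an unauthorized author's msg is rejected before its `authorize` field is even looked at.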
This would mean that if you want to login to your account on another device, you can just do auth like this:
:bomb: New problem: network identity
By allowing other keypairs to publish messages on your feed, we solve the feed problem, but the other area where keypairs are used is in Secret Handshake, and to identify you when establishing connections.
Now we have a problem, because even if I follow you as @abc123, you may be using a different device with keypair @xyz456 which I won't recognize as friendly.
Proposed naive solution
The two contexts where connectivity needs to be treated are:
In rooms, it's possible to give the room a token issued by @abc123 that proves that @xyz456 is the "same person". Then, when the room tells its members that @xyz456 is online, it can also annotate that "oh yeah by the way, this is the same person as @abc123 and here is proof".
In LAN, remember there are UDP packets being broadcast to inform your multiserver address. This is a good place to include a proof that @xyz456 is the same person as @abc123.
Maybe we could generalize these two cases so they aren't treated separately. Maybe this could be done on the SHS layer of the stack. Maybe there's a better way too.
:bomb: New problem: erasing
If the authorization data is in the content, we can't erase that msg. ("Erase" means to delete msg.content but keep msg.metadata and msg.sig).
Proposed naive solution
We would have to change the metadata to include this authorization data.
:bomb: New problem: omitting authorization
PPPPP sliced replication is such that for msgs at depth 100–200, I can send those plus the "certificate pool" between depths 0–100. The certificate pool is hopefully as small as possible, and just gives us the shortest path from 100 to 0.
However, any authorization msg is important and should not be dropped from replication. There may be several authorization msgs between depths 0–100, and we need to make sure they are replicated. These authorization msgs are not (by design) going to be in the shortest path from 100 to 0, and we need to make sure that all authorization msgs have a path to the root. This may suddenly make the "extended" certificate pool much larger, and hurt storage and bandwidth overhead.
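A toy sketch of how the pool grows. It assumes a hypothetical `PoolMsg` shape with an `isAuthorization` flag and uses plain breadth-first search over `prev` links; the real sliced replication logic may pick paths differently.

```typescript
type Hash = string;

// Hypothetical shape: `isAuthorization` marks msgs that grant write access.
interface PoolMsg {
  hash: Hash;
  prev: Hash[];
  isAuthorization: boolean;
}

// Breadth-first search along `prev` links; returns [start, ..., root],
// or [] if the root is unreachable.
function shortestPathToRoot(start: Hash, byHash: Map<Hash, PoolMsg>, root: Hash): Hash[] {
  const cameFrom = new Map<Hash, Hash | null>([[start, null]]);
  const queue: Hash[] = [start];
  while (queue.length > 0) {
    const current = queue.shift() as Hash;
    if (current === root) {
      const path: Hash[] = [];
      let step: Hash | null = current;
      while (step !== null) {
        path.push(step);
        step = cameFrom.get(step) ?? null;
      }
      return path.reverse();
    }
    for (const p of byHash.get(current)?.prev ?? []) {
      if (!cameFrom.has(p)) {
        cameFrom.set(p, current);
        queue.push(p);
      }
    }
  }
  return [];
}

// Shortest path for the slice start, plus a path to the root for every
// authorization msg, so none of them is dropped from replication.
function extendedCertificatePool(sliceStart: Hash, msgs: PoolMsg[], root: Hash): Set<Hash> {
  const byHash = new Map(msgs.map((m): [Hash, PoolMsg] => [m.hash, m]));
  const pool = new Set<Hash>(shortestPathToRoot(sliceStart, byHash, root));
  for (const m of msgs) {
    if (!m.isAuthorization) continue;
    for (const h of shortestPathToRoot(m.hash, byHash, root)) pool.add(h);
  }
  return pool;
}

const msgs: PoolMsg[] = [
  { hash: "R", prev: [], isAuthorization: false },
  { hash: "A", prev: ["R"], isAuthorization: false },
  { hash: "B", prev: ["A"], isAuthorization: true },
  { hash: "C", prev: ["B", "R"], isAuthorization: false }, // skip link to R
];
const byHashDemo = new Map(msgs.map((m): [Hash, PoolMsg] => [m.hash, m]));
const plainPath = shortestPathToRoot("C", byHashDemo, "R");
const extended = extendedCertificatePool("C", msgs, "R");
```

Here the plain shortest path from C is just C and R thanks to the skip link, but keeping authorization msg B (and its own path through A) doubles the pool to four msgs. That is the overhead concern in miniature.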
:bomb: New problem: revoking
Revoking write access is an entirely different and complex problem. We could postpone solving it, and just say that once you give write access to someone, you can't revoke it. Revoking seems doable (inspired by our work on excluding members from private groups) and doesn't need to be solved from day one.
:bomb: New problem: feedID and feed root msg
With this new design, we don't really need the "feed ID" as being the public key of a keypair. It can be any random bytes, to identify you. You just need to include the "first authorized peer" in that feed. This again has #16 vibes.
The design of the feed format could change now, we could treat all tangles literally the same way, with no special treatment for the "feed tangle" versus a "thread tangle" or other kinds of tangles. This is open, I need to sketch what this would look like in code.
:bomb: New problem: main keypairs vs other keypairs
In the tangle auth system described, the main pubkey (from the keypair that defines the feed ID) is different to all other "authorized" pubkeys. To discover a subfeed, you just need the main's pubkey and the "message type" string, then you can deterministically determine the feed root msg.
What this means is that you could authorize one of these "other" keypairs for write access in one feed, but they would never have the access to start new subfeeds, because to start a new subfeed you have to sign the (deterministically predictable) root msg with the main keypair.
:studio_microphone: Feedback
Thoughts about this? @arj03 @ahdinosaur @gpicron