ssbc / ssb-db

A database of unforgeable append-only feeds, optimized for efficient replication for peer to peer protocols
https://scuttlebot.io/
MIT License

Discussion on encrypted messages #82

Closed krl closed 9 years ago

krl commented 9 years ago

Allowing entries in ssb to be fully encrypted would have some interesting properties.

Private messaging could work in a similar way to how Bitmessage handles hiding of metadata. You just append a message to your feed that is encrypted to a certain user, without revealing which user it is.

Everyone syncing will then try to decrypt the message, or maybe only a fixed-size header for performance reasons.

This can give you pretty reliable messaging, without attackers learning very much, except by correlating activity between the actual participants in the discussion. Here, some kind of 'chaff' would make sense, i.e. users have a privacy incentive to periodically post noise in their stream, to stop these kinds of correlation attacks.

Further along, exploring ways of having multiple parties deriving a shared secret, maybe even off the chains, would also allow group messaging.

I think this would make sense to add at the ssb level, and not leave up to applications to deal with. What are your thoughts?

jtremback commented 9 years ago

This is a little off-topic, but maybe related. I'm doing something similar in microstar-internal-chain.

However, it's symmetric encryption (using NaCl). The purpose of this is mainly to store settings and such. For example, the unfinished microstar-replicate uses this to store its list of followed chains (feeds in SSB, I call them chains for better name recognition by cryptocurrency folks).

One issue that I ran into is the fact that, of course, encrypted data is not indexable in the db. I am addressing this by making the index documents before encryption. Since SSB has hard coded indexes, while Microstar uses a query language, this challenge may manifest itself differently or not at all. Pretty obvious, but it caught me off guard, and I felt pretty dumb when I realized my mistake.

jtremback commented 9 years ago

Wrestling with the encrypted indexes issue here: https://github.com/microstar-db/microstar-internal-chain/issues/1

Again, not sure how applicable this is to SSB, but there are definitely some parallels I'm guessing.

dominictarr commented 9 years ago

@krl absolutely! I would love to merge a pull request that implements this! We really want end-to-end but there is just so much else to do that @pfraze and I have not gotten to this yet.

krl commented 9 years ago

Crypto braindump:

Any private messaging targets a shared secret symmetric key, held by at least 2 people. Diffie-Hellman can be done for groups, but a first implementation could limit it to the simple 2-party case.

An encrypted message could look something like this:

{"type": "crypto",
 "body": "<noise>"}

Upon syncing a message like this, the client tries to decrypt the body with its own private key, and with the shared secrets of all groups that both it and the author are members of.

If decryption succeeds, this message will be presented to the higher APIs as a normal message, with additional metadata saying which shared secret/key was used.
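A minimal sketch of the trial-decryption step described above, not taken from ssb itself. It assumes Node's built-in `crypto` module with AES-256-GCM standing in for whatever symmetric cipher would actually be chosen, a base64 `body`, JSON content, and a hypothetical `keyring` map from a label to a 32-byte shared secret.

```javascript
const crypto = require('crypto')

// Encrypt `plaintext` under a 32-byte key; returns nonce || tag || ciphertext.
function seal (plaintext, key) {
  const nonce = crypto.randomBytes(12)
  const cipher = crypto.createCipheriv('aes-256-gcm', key, nonce)
  const body = Buffer.concat([cipher.update(plaintext), cipher.final()])
  return Buffer.concat([nonce, cipher.getAuthTag(), body])
}

// Try a single key; returns the plaintext Buffer, or null if the key does not fit.
function open (sealed, key) {
  try {
    const decipher = crypto.createDecipheriv('aes-256-gcm', key, sealed.subarray(0, 12))
    decipher.setAuthTag(sealed.subarray(12, 28))
    return Buffer.concat([decipher.update(sealed.subarray(28)), decipher.final()])
  } catch (err) {
    return null // wrong key, or a malformed or tampered body
  }
}

// Try every key we hold; report the content together with the key that worked.
function tryDecrypt (msg, keyring) {
  const sealed = Buffer.from(msg.body, 'base64')
  for (const [label, key] of keyring) {
    const plaintext = open(sealed, key)
    if (plaintext !== null) {
      return { content: JSON.parse(plaintext.toString('utf8')), key: label }
    }
  }
  return null // not addressed to us
}
```

The cost per synced message grows with the number of keys held, which is the overhead discussed later in the thread.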

key exchange

To initiate a group of two, and do a key exchange, Alice posts a 'crypto' type message, containing her part in the handshake, encrypted with Bob's public key.

Bob pulls the message, succeeds in decrypting it, sees that it's a key exchange request and payload, and posts his own crypto msg, containing his own part in the handshake, encrypted with Alice's public key.

Alice and Bob now compute their shared secret, and start messaging each other using symmetric crypto for the body of their posts.
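The handshake can be sketched with Node's built-in `crypto.createECDH`. This is an illustration under assumptions, not the proposed wire format: the curve (`prime256v1`), the SHA-256 key derivation, and the variable names are all choices made for this example, and the step of encrypting each handshake part to the other party's public key is left out.

```javascript
const crypto = require('crypto')

// Each side generates a handshake key pair.
const alice = crypto.createECDH('prime256v1')
const bob = crypto.createECDH('prime256v1')
const alicePart = alice.generateKeys() // would travel in Alice's 'crypto' message
const bobPart = bob.generateKeys()     // would travel in Bob's reply

// Each side combines its own private half with the other's public half,
// then hashes the result down to a 32-byte symmetric key.
function deriveKey (self, otherPart) {
  const shared = self.computeSecret(otherPart)
  return crypto.createHash('sha256').update(shared).digest()
}

const aliceKey = deriveKey(alice, bobPart)
const bobKey = deriveKey(bob, alicePart)
```

Both sides end up with the same key without it ever appearing in either feed.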

PIR

This protocol seems to have some really nice properties. For one, you cannot directly tell who a message is directed towards, or who is building shared secrets with whom.

This is of course susceptible to timing attacks, even offline after the fact, and to get around this a certain amount of cover traffic, 'chaff', would be needed.

Each node can configure their chaff as they please, so if you're not particularly worried about your friends or enemies finding out who you talk to, just don't enable it.

Having to store lots of entropy that you cannot read should be uninteresting to honest people, so my suggestion would be that the client discards message bodies it cannot read. Just keeping the hash of the message.
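A rough sketch of that storage rule. The `decrypt` callback and the `bodyHash` field are hypothetical names for this example, SHA-256 is an assumed hash, and how the retained hash would fit into feed verification is left open.

```javascript
const crypto = require('crypto')

// Decide what to keep for a synced message.
// `decrypt` is a hypothetical function returning the content, or null if unreadable.
function toStoredForm (msg, decrypt) {
  if (msg.type !== 'crypto') return msg
  const content = decrypt(msg)
  if (content !== null) {
    return { type: 'crypto', body: msg.body, content }
  }
  // Unreadable: drop the body and keep only its hash.
  const bodyHash = crypto.createHash('sha256').update(msg.body).digest('base64')
  return { type: 'crypto', bodyHash }
}
```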

implementation

Where do you think this should go? It would be nice to have it on a pretty low level, so that applications get notified only that it was an encrypted message, and which key was used to decrypt it.

I would like to give it a shot.

pfrazee commented 9 years ago

i like these suggestions.

Having to store lots of entropy that you cannot read should be uninteresting to honest people, so my suggestion would be that the client discards message bodies it cannot read. Just keeping the hash of the message.

assuming there's no conflict with the cryptography, blobs could be used to do this. the encrypted content would be linked-to by its hash:

{"type": "crypto",
 "ext":"<hash>", // external link
 "rel":"body"}      // ...to the body
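As an illustration of linking the encrypted body by hash, here is a small sketch. The in-memory `blobs` map and both function names are made up for this example, and SHA-256 in base64 is an assumed hash format rather than the one ssb uses.

```javascript
const crypto = require('crypto')

const blobs = new Map() // hypothetical in-memory blob store: hash -> Buffer

function hashOf (buf) {
  return crypto.createHash('sha256').update(buf).digest('base64')
}

// Store the ciphertext as a blob and return the message that links to it.
function publishEncrypted (ciphertext) {
  const hash = hashOf(ciphertext)
  blobs.set(hash, ciphertext)
  return { type: 'crypto', ext: hash, rel: 'body' }
}

// Fetch the linked blob, checking that it really matches the hash in the message.
function fetchBody (msg) {
  const blob = blobs.get(msg.ext)
  if (!blob) return null
  return hashOf(blob) === msg.ext ? blob : null
}
```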

Where do you think this should go? It would be nice to have it on a pretty low level, so that applications get notified only that it was an encrypted message, and which key was used to decrypt it.

i agree

dominictarr commented 9 years ago

I agree that this would be good as a low level thing.

We should be able to do encrypted attachments and encrypted messages. Attachments are not guaranteed to be replicated, and peers may not be bothered to replicate blobs they can't read.

How much chaff do you need to avoid a timing attack?

krl commented 9 years ago

Attachments have much worse security properties than actual messages in the chains, which we might have to think a bit about, since an ISP-level attacker could see which peers request a certain file after receiving the same message.

To avoid timing attacks basically means that you maintain a constant traffic profile, the extreme case being constant bandwidth. One way of doing it might be each of the clients having a probability to 'respond' to chaff they get sent, with a random delay. This would mean that larger networks produce more chaff in each chain though.
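The probabilistic response idea could look roughly like this. The function names, the uniform delay, and the injectable `rng` are assumptions for the sake of a testable example; nothing here has been analysed for how well it actually hides traffic.

```javascript
// Decide whether to answer a piece of chaff, and after what delay.
// `rng` returns a number in [0, 1); it defaults to Math.random and can be
// replaced with a deterministic source for testing.
function planChaffResponse (probability, maxDelayMs, rng = Math.random) {
  if (rng() >= probability) return null // stay silent this time
  return { delayMs: Math.floor(rng() * maxDelayMs) }
}

// Average number of responses one chaff message triggers among `followers` peers.
function expectedResponses (followers, probability) {
  return followers * probability
}
```

`expectedResponses` makes the scaling concern concrete: at a fixed response probability, the chaff generated per message grows linearly with the number of followers.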

This is probably something we might leave out for the first versions, since getting this right is probably gonna need some auditing.

dominictarr commented 9 years ago

@krl you are right. If we had attachments, and wanted to have anonymous attachments, peers would have to download attachments that are probably not for them, so that nobody can tell who they are actually for... this also means that attachments must be attached in the clear, so that peers can request them.

Also, would it be simpler to just use asymmetric encryption for the keys? Otherwise, you could put your half of the DH exchange in the first message, beside the public key, and then combine your key with their key when you want to send a message to someone.

@krl if messages have no "to:" field, and clients must check every message for whether it is for them, what is the overhead of the various techniques that could be used?

krl commented 9 years ago

If attachments are in the clear, then there must also be no way of knowing whether one is for you until you download it; otherwise this becomes a moot point.

I have not found any documentation/info about the combining of keys that seems possible with ECC, do you know what this technique is called? And does it extend to multi-party?

The benefit of DH though is that you throw away the info used to generate the key, so you have forward secrecy.

For small messages I don't think the overhead of trying to decrypt is relevant. Bitmessage does this, and has been criticized for it, but it also has an everyone-gets-everything scheme; in this case you only get what you explicitly follow.

For larger messages, it might make sense to have a crypto-header, a small value that just decrypts to a constant or something. Then you would only have to try to decrypt a small part of the message.
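A sketch of such a crypto-header, again using Node's `crypto` with AES-256-GCM as an assumed cipher. The 8-byte constant `MAGIC`, the header layout, and the function names are invented for this example.

```javascript
const crypto = require('crypto')

const MAGIC = Buffer.from('ssb-priv')     // hypothetical 8-byte constant
const HEADER_LEN = 12 + 16 + MAGIC.length // nonce + GCM tag + encrypted constant

// Build the fixed-size header that is prepended to the (separately encrypted) body.
function makeHeader (key) {
  const nonce = crypto.randomBytes(12)
  const cipher = crypto.createCipheriv('aes-256-gcm', key, nonce)
  const enc = Buffer.concat([cipher.update(MAGIC), cipher.final()])
  return Buffer.concat([nonce, cipher.getAuthTag(), enc])
}

// Check only the first HEADER_LEN bytes; the rest of the message is never touched.
function headerMatches (message, key) {
  if (message.length < HEADER_LEN) return false
  try {
    const decipher = crypto.createDecipheriv('aes-256-gcm', key, message.subarray(0, 12))
    decipher.setAuthTag(message.subarray(12, 28))
    const out = Buffer.concat([
      decipher.update(message.subarray(28, HEADER_LEN)),
      decipher.final()
    ])
    return out.equals(MAGIC)
  } catch (err) {
    return false // wrong key
  }
}
```

With this layout a failed trial costs one 36-byte decryption regardless of how large the body is.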

As for attachments, since they will probably never be very large and frequent, having all followers download them might be viable. Or you could choose to download it only from a certain subset of people you follow, which might be good enough.

dominictarr commented 9 years ago

@krl given that we have persisted chains of data and long term keys, I suspect that we can't have forward secrecy, unless we delete old messages, or at least delete their keys. We could use ssb to bootstrap ephemeral cryptosystems that are forward secret, though.

@krl do you mean El Gamal? https://en.wikipedia.org/wiki/ElGamal_encryption

Maybe decrypt a key, with a checksum so you know whether it was correct? You could have an encrypted file, and the encrypted key could be part of the ext link, but the link could be a clear link to the attached cyphertext. This way you could request the blob, but the provider won't know whether or not you could actually decrypt it.

something like this?

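// getKey, symmetric_encrypt, assymetric_encrypt and hash are placeholders
var key = getKey(idAddressedTo) // symmetric key for this recipient
var cyphertext = symmetric_encrypt(content, key)
// encrypt the symmetric key itself to the recipient
var cypherkey = assymetric_encrypt(key, idAddressedTo)
// an attachment...
var link = {ext: hash(cyphertext), rel: 'encrypted', key: cypherkey}
// a message...
var msg = {
  type: 'encrypted',
  cyphertext: cyphertext,
  key: cypherkey
}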

There are multiple ways you could implement assymetric_encrypt, but each would give you a key that you then use with a separate symmetric cipher.
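One runnable reading of the pseudocode above, using only Node's built-in `crypto`. RSA-OAEP (the default padding of `crypto.publicEncrypt`) and AES-256-GCM are stand-ins picked because they ship with Node; the thread does not settle on either, and `encryptFor` / `decryptWith` are names invented for this sketch.

```javascript
const crypto = require('crypto')

// Encrypt `content` (a Buffer) so that only the holder of the matching private key can read it.
function encryptFor (content, recipientPublicKey) {
  const key = crypto.randomBytes(32) // fresh symmetric key for this message
  const nonce = crypto.randomBytes(12)
  const cipher = crypto.createCipheriv('aes-256-gcm', key, nonce)
  const body = Buffer.concat([cipher.update(content), cipher.final()])
  const cyphertext = Buffer.concat([nonce, cipher.getAuthTag(), body])
  const cypherkey = crypto.publicEncrypt(recipientPublicKey, key)
  return {
    type: 'encrypted',
    cyphertext: cyphertext.toString('base64'),
    key: cypherkey.toString('base64')
  }
}

// Returns the decrypted content, or null if this private key does not fit.
function decryptWith (msg, privateKey) {
  try {
    const key = crypto.privateDecrypt(privateKey, Buffer.from(msg.key, 'base64'))
    const buf = Buffer.from(msg.cyphertext, 'base64')
    const decipher = crypto.createDecipheriv('aes-256-gcm', key, buf.subarray(0, 12))
    decipher.setAuthTag(buf.subarray(12, 28))
    return Buffer.concat([decipher.update(buf.subarray(28)), decipher.final()])
  } catch (err) {
    return null
  }
}
```

A key pair for trying this out can be created with `crypto.generateKeyPairSync('rsa', { modulusLength: 2048 })`.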

krl commented 9 years ago

I've done some more thinking about PM, and I think there might be a better way of doing this.

I'm keeping the discussion on groups of 2, as in private messages, for simplicity. For larger groups some more thinking might be needed.

The model would be: each user has one or more public feeds, but in addition to that, each pair of identities wanting to communicate sets up their own ssb chain, in which all messages are encrypted with the symmetric key derived between them.

This has some nice properties, like we don't have to implement special crypto message types, but just use normal feeds that have the encryption only in the transport layer.

The pubserver's role would be to relay all encrypted messages a user publishes to the followers of this user. The client will then either find a chain to append the decrypted message to, or discard it completely.

This should be more elegant to implement as well.

Thoughts?

krl commented 9 years ago

So, to clarify, in a pairing, each participant keeps their own chain, and also a copy of the chain of their peer.

jtremback commented 9 years ago

I like this, except for the encryption only in the transport layer part. Having different chains that need to be transported differently sounds unnecessarily complicated. What are the benefits?

dominictarr commented 9 years ago

@krl I'm a little confused. so encrypted messages are just a total broadcast? I'm not sure this is simpler than just having encrypted message bodies.

krl commented 9 years ago

Private messaging rehash

I felt I was not very clear in my last post. I'll try to distill it down to what I think are the important parts.

I'm not sure if putting private messages in your public feed makes much sense at all. One way of doing it differently would be to have one feed for each person you are in contact with.

So, Alice would have one or more public unencrypted/exposed feeds, as well as a feed specially for Bob. Bob in turn would have a feed for Alice. They would sync these feeds with each other over an encrypted transport.

This would have the benefit of not cluttering your main feed with noise, and would not even require any new crypto message types.

If A and B want to sync their states over a pubserver, one way of doing it, without revealing too much where each message goes, is to have a scheme where A sends an encrypted packet with no metadata to the pubserver, who forwards it on to B.

One benefit of this, is that if you want to send noise as cover traffic, it does not have to get appended to any chain at all.

The drawback is that the server will have to store the noise for some time, in order to facilitate syncing Bob up if he's been offline for a longer period of time. If messages in a chain get lost, however, Bob can simply ask Alice to re-transmit, for example by appending a special message to the A->B feed.

For groups of limited size, you would have the chains A->group , B->group, C->group. Anyone you're not following who is posting to the group will be implicitly ignored.
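To make the feed-per-audience layout concrete, here is a toy in-memory model. The `'author->audience'` key format, the message shape, and the function names are invented for this sketch; signing, encryption, and replication are all left out.

```javascript
// feeds: Map keyed by 'author->audience', each value an append-only array of messages.
function feedId (author, audience) {
  return author + '->' + audience
}

function append (feeds, author, audience, content) {
  const id = feedId(author, audience)
  if (!feeds.has(id)) feeds.set(id, [])
  const feed = feeds.get(id)
  feed.push({ author, sequence: feed.length + 1, content })
  return id
}

// The group as seen by someone who follows only the authors in `following` (a Set).
// Feeds from anyone else are skipped, so those authors are implicitly ignored.
function groupView (feeds, group, following) {
  const messages = []
  for (const [id, feed] of feeds) {
    const [author, audience] = id.split('->')
    if (audience === group && following.has(author)) messages.push(...feed)
  }
  return messages
}
```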

pfrazee commented 9 years ago

AFAIK making a new feed for each recipient is more of an efficiency decision than a security one. The main benefit is keeping encrypted messages out of the public feed, yeah?

krl commented 9 years ago

Yes, but it does increase security in that not everything has to go over the same pubserver, which makes traffic analysis more difficult.

dominictarr commented 9 years ago

Another idea I'd like to throw into the mix here is we could have an API to proxy a connection to another server... Tor would fall out of this, and it would make implementing schemes like the one @krl is describing much easier.

jtremback commented 9 years ago

Multiple feeds do allow us to move away from the pub server thing. I could replicate @pfraze's private feed to @krl and @krl could replicate it off of me, without me having to replicate @pfraze's whole entire feed. This way I could just be replicating stuff for @krl even if I don't even care about anything else that @pfraze is publishing.

pfrazee commented 9 years ago

@jtremback how so? pub servers bridge networks that can't form direct connections

dominictarr commented 9 years ago

this is implemented now, mostly over here: https://github.com/auditdrivencrypto/private-box