carmelatroncoso commented 6 years ago

Hi,

here is my first pass, up to the Setup Verified Contact protocol.

Comments and questions:

Messaging Layer: this concept is not known, and I believe it is ill-defined. I tried to make it more explicit in the adversary model of the Summary. Please check that I have reflected your idea correctly, and let's try to make it even more explicit.
We say that we assume that peers are honest. Yet, in ClaimChains we speak a lot about equivocation and spend a long time trying to avoid it (including Azul's fixes across blocks). What exactly do we want to say? Are they honest for some cases and ClaimChains enable us to consider them malicious?
There is this sentence in the intro of Section 2 "The described protocols are decentralized in that they describe ways of how peers (or their devices) can interact with each other". I do not understand what you mean so I could not edit it to make it clearer.
INVITENUMBER and AUTH need to be better defined. Are they random? How are they generated? What is their length? This is very important for the security discussion.
For the implementation it is very important that the encryption scheme is NOT MALLEABLE (i.e., the adversary cannot flip bits to change the plaintext message). Otherwise she may be able to tamper with AUTH or BobFP, breaking the system. Authentication of the messages has to be in place to ensure that they have not been tampered with.

Regarding your open questions:

INVITED and AUTH are associated (if I understand correctly, Alice uses INVITE to index AUTH such that she can verify). Thus, they cannot be independently generated. If they can be repeated and publicly available, you have no guarantee against impersonation. In the Bob Impersonation analysis, the adversary would be able to forge a message with the correct AUTH value, and thus present Alice with a Bob_FP that passes her check in step 5.

contact requests indistinguishable. Need to read more, but maybe not possible: it is the first contact between these users... so I believe that this piece of metadata leaks the content, and we cannot do anything to avoid it.
I do not have knowledge about your third question :)

azul commented 6 years ago

We say that we assume that peers are honest. Yet, in ClaimChains we speak a lot about equivocation and spend a long time trying to avoid it (including Azul's fixes across blocks). What exactly do we want to say? Are they honest for some cases and ClaimChains enable us to consider them malicious?

One way I have been thinking about that is as a mitigation for successful mitm isolation attacks. If an attacker manages to completely isolate me by performing mitm attacks on all my connections, they could send gossip on my behalf and make use of it to start or hide other attacks.

With Claimchains in place the provider has two options:

behave consistently: this means it cannot equivocate about Alice's and Bob's keys when I cc them.
accept warnings: for example, the provider could still gossip wrong keys to Alice and Bob when cc'ing them to initiate a mitm attack on them. However, this would lead to higher 'scores' for verification between the two of them, and between each of them and me, because of the inconsistencies.

We can also note that Claimchain helps defend against even stronger adversaries and allows warning in cases of equivocation performed by malicious peers.

azul commented 6 years ago

INVITENUMBER and AUTH need to be better defined. Are they random? How are they generated? What is their length? This is very important for the security discussion.

My understanding so far is that they are random and a few bytes long. If we assume 5 bytes for AUTH, that's 40 bits and thus roughly a 1 in a trillion (1024^4) chance for the attacker. There should be no random error cases here. We can error out of the process on the first failed attempt.
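The estimate can be checked with a few lines of Python. This is only a sketch of the arithmetic: the 5-byte size is the assumption made in this comment, and `guess_probability` is a hypothetical helper name, not something from the implementation.

```python
from fractions import Fraction


def guess_probability(num_bytes: int, attempts: int = 1) -> Fraction:
    """Chance that `attempts` distinct blind guesses hit a uniformly random secret."""
    space = 2 ** (8 * num_bytes)  # number of possible secret values
    return Fraction(min(attempts, space), space)


AUTH_BYTES = 5  # assumption from the estimate above, not the implemented size

print(8 * AUTH_BYTES)                       # bits of entropy in AUTH
print(2 ** (8 * AUTH_BYTES) == 1024 ** 4)   # same search space as quoted above
print(guess_probability(AUTH_BYTES))        # single-guess success chance
```

If the process aborts on the first failed attempt, `attempts` stays at 1; allowing retries scales the attacker's chance linearly.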

The underlying tradeoff here is that QR codes have a limited amount of data we can include and become harder to scan the more data we include. This will probably become a non-issue once we have ECC keys and need a lot less data for the fingerprint.

For the implementation it is very important that the encryption scheme is NOT MALLEABLE (i.e., the adversary cannot flip bits to change the plaintext message). Otherwise she may be able to tamper with AUTH or BobFP, breaking the system. Authentication of the messages has to be in place to ensure that they have not been tampered with.

We need to clarify the properties of the underlying crypto such as OpenPGP. My understanding is that RSA+AES is NOT MALLEABLE. So in the concrete case we are fine here. Maybe we can have a section summarizing our assumptions about the protocol.

We want to use the AUTH for authentication, so we have no way to protect against tampering because we do not know Bob's FP at that point. The idea is more that any tampering would change AUTH or render the message broken.

update: ouch... looks like pgp encryption on its own does not guarantee non-malleability. There's quite a bit of complexity involved with the session key being RSA encrypted and then AES applied to the plaintext. But it looks like neither of them provides non-malleability out of the box.

update: My understanding is that OpenPGP will use AES in Cipher Feedback Mode. I have not found any claims about malleability of AES in that mode. So I assume it's malleable - or at least it has not been proven not to be. Looks like OpenPGP seems to work around that with MDC (Modification Detection Code) (section 5.13 and 5.14 of RFC 4880).

That seems to be a SHA-1 hash of the plaintext plus some other things. Does that provide non-malleability? Sorry if the answer is obvious... finding it hard to wrap my head around this.

The rfc states...

Despite the fact that it is a relatively modest system, it has proved itself in the real world. It is an effective defense to several attacks that have surfaced since it has been created. It has met its modest goals admirably.

Neal says:

The MDC system is intended to prevent malleability without requiring a signature. Modern openpgp implementations reject SED packets (encryption packets that don't use an MDC). Unfortunately, gpg still outputs them, so it is up to the caller to check for an error code. In sequoia, we don't even process non-MDC protected messages (SED packets).

This sounds like we should add a note for implementers that they need to ensure non-malleability. But they actually do have a way to do so that is supported by the existing OpenPGP implementations.
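To make the concern concrete, here is a toy sketch of what malleability means and how an MDC-style hash catches a bit flip. It is not OpenPGP and not AES-CFB: the keystream built from SHA-256 and all function names are assumptions chosen so the example runs with the standard library only.

```python
import hashlib
import hmac


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 over key plus counter). NOT a real cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(data: bytes, key: bytes) -> bytes:
    """Encrypt or decrypt by XOR with the toy keystream (malleable)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def encrypt_with_mdc(plaintext: bytes, key: bytes) -> bytes:
    """Append SHA-1(plaintext) before encrypting, loosely mirroring the MDC idea."""
    return xor(plaintext + hashlib.sha1(plaintext).digest(), key)


def decrypt_with_mdc(ciphertext: bytes, key: bytes) -> bytes:
    """Decrypt and reject the message if the trailing hash does not match."""
    data = xor(ciphertext, key)
    plaintext, digest = data[:-20], data[-20:]
    if not hmac.compare_digest(hashlib.sha1(plaintext).digest(), digest):
        raise ValueError("modification detected")
    return plaintext


key = b"0123456789abcdef"
msg = b"AUTH=abc"

# Without a check: flipping one ciphertext bit flips the same plaintext bit.
ct = xor(msg, key)
tampered = bytes([ct[0] ^ 0x01]) + ct[1:]
print(xor(tampered, key))

# With the hash inside the encryption: the same flip is detected.
protected = encrypt_with_mdc(msg, key)
tampered = bytes([protected[0] ^ 0x01]) + protected[1:]
try:
    decrypt_with_mdc(tampered, key)
except ValueError as err:
    print(err)
```

In this toy scheme an attacker who already knows the entire plaintext could still patch the hash, so it illustrates only the blind bit-flipping case; the practical point for implementers is to treat a failed or missing integrity check as a hard error.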

hpk42 commented 6 years ago

On Tue, May 08, 2018 at 16:15 -0700, Carmela Troncoso wrote:

Comments and questions:

Messaging Layer: this concept is not known, and I believe it is ill-defined. I tried to make it more explicit in the adversary model of the Summary. Please check that I have reflected your idea correctly, and let's try to make it even more explicit.

OK but also wrt to mails with harry we can now use a "network" adversary and then "message layer" is just one part of that.

Out-of-band and Trusted channels would be synonyms (the one reflecting UI, the other a more cryptographic standpoint) and be defined as: cannot be observed or modified by the network layer.

Our key verification protocols work via untrusted channels but depend on an initial bootstrap/data transferal over a trusted channel. They result in "authenticated encryption" (which we currently don't mention but maybe should?) regarding messages in "verified" channels. IOW, the key verification protocols result in authenticated encryption. The UI language uses "verified" -- which is common: https://ssd.eff.org/en/module/key-verification

As the paper is to be read by cryptographers, implementors, and experienced users, it's good to be clear and consistent about the usage of these terms. (We currently are not fully, I am afraid -- this is all just getting clearer to me as we type.)

So maybe add definitions for the term in the "attack model" first section of the summary?

INVITED and AUTH are associated (if I understand correctly, Alice uses INVITE to index AUTH such that she can verify). Thus, they cannot be independently generated. If they can be repeated and publicly available, you have no guarantee against impersonation. In the Bob Impersonation analysis, the adversary would be able to forge a message with the correct AUTH value, and thus present Alice with a Bob_FP that passes her check in step 5.

INVITENUMBER and AUTH are unrelated and used for different purposes.

INVITENUMBER: allows an "inviter" to accept and reply to internal messages (mainly step 2: Bob sending a first internal message which Alice replies to). Usually a new contact's e-mail would land in "Contact requests" and needs to be manually confirmed. If we allow Step2 messages to cause an automated reply it could be abused for detecting if somebody is online ... that's at least the current thinking.
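A minimal sketch of that routing decision, as I read the description: an incoming step-2 message only triggers an automated reply if it carries an INVITENUMBER the inviter actually handed out. The class, method names, and return values are hypothetical and do not come from the Delta implementation.

```python
import hmac
from typing import Optional, Set


class Inviter:
    """Hypothetical inviter-side state; all names are illustrative only."""

    def __init__(self) -> None:
        # INVITENUMBERs that were shown out-of-band (e.g. in a QR code).
        self.open_invites: Set[bytes] = set()

    def add_invite(self, invitenumber: str) -> None:
        self.open_invites.add(invitenumber.encode("utf-8"))

    def handle_step2(self, invitenumber: Optional[str]) -> str:
        """Decide whether a step-2 message may trigger an automated reply."""
        if invitenumber is not None:
            candidate = invitenumber.encode("utf-8")
            for known in self.open_invites:
                # Constant-time comparison of the candidate with a known code.
                if hmac.compare_digest(known, candidate):
                    return "auto-reply"
        # Unknown or missing code: no automated reply, so the message cannot
        # be used to probe whether the inviter is online.
        return "manual-confirmation"


alice = Inviter()
alice.add_invite("abcDEF12345")
print(alice.handle_step2("abcDEF12345"))
print(alice.handle_step2("guess"))
```

Keeping this check separate from the AUTH check reflects that the two codes serve different purposes: one gates the automated reply, the other authenticates the peer later in the protocol.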

contact requests indistinguishable. Need to read more, but maybe not possible: it is the first contact between these users... so I believe that this piece of metadata leaks the content, and we cannot do anything to avoid it.

we think we can eventually get an initial contact message between Delta and Delta to look the same for both cases: opportunistic or verified contact setup. The invitenumber etc. are encoded into the Message ID which is anyway random. It requires careful thinking and also modification of current Delta logic but it's a worthwhile goal to get this property, isn't it? This way it's much harder to do blanket/larger scale MITM attacks.

hpk42 commented 6 years ago

On Wed, May 09, 2018 at 02:36 -0700, azul wrote:

INVITENUMBER and AUTH need to be better defined. Are they random? How are they generated? What is their length? This is very important for the security discussion.

My understanding so far is that they are random and a few bytes long. If we assume 5 bytes for AUTH, that's 40 bits and thus roughly a 1 in a trillion (1024^4) chance for the attacker. There should be no random error cases here. We can error out of the process on the first failed attempt.

a mail with a non-matching invitecode is dropped but we still want to continue listening. Bjoern told me that he currently uses 66 random bits for each of INVITENUMBER and AUTH. That gives 11 base64-encoding characters for each of the codes. If we can get away with less ... that'd be good!
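For illustration, 66 bits split into 6-bit groups gives exactly 11 base64 characters. The sketch below generates such a code; the URL-safe alphabet and the function name `new_code` are my assumptions, not necessarily what Bjoern's implementation does.

```python
import secrets
import string

# URL-safe base64 alphabet (assumed here; 64 symbols, 6 bits per character).
B64URL = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"


def new_code(bits: int = 66) -> str:
    """Return a random code carrying `bits` bits, 6 bits per character."""
    if bits <= 0 or bits % 6 != 0:
        raise ValueError("bits must be a positive multiple of 6")
    value = secrets.randbits(bits)  # cryptographically strong randomness
    chars = []
    for _ in range(bits // 6):
        chars.append(B64URL[value & 0x3F])  # lowest 6 bits pick one character
        value >>= 6
    return "".join(chars)


invitenumber = new_code()
auth = new_code()  # generated independently of the invitenumber
print(len(invitenumber), len(auth))
```

Each character dropped from a code saves QR payload but costs 6 bits of entropy, so the lower bound should be fixed in the security analysis rather than in the UI layer.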

The underlying tradeoff here is that QR codes have a limited amount of data we can include and become harder to scan the more data we include. This will probably become a non-issue once we have ECC keys and need a lot less data for the fingerprint.

we could directly include the ECC key instead of a fingerprint, maybe. But that's out of scope for now: RSA3072 is the baseline and cannot be contained in a QR code.

carmelatroncoso commented 6 years ago

Thanks for the answers!

One way I have been thinking about that is as a mitigation for successful mitm isolation attacks. If an attacker manages to completely isolate me by performing mitm attacks on all my connections, they could send gossip on my behalf and make use of it to start or hide other attacks.

With Claimchains in place the provider has two options:

behave consistently: this means it cannot equivocate about Alice's and Bob's keys when I cc them.
accept warnings: for example, the provider could still gossip wrong keys to Alice and Bob when cc'ing them to initiate a mitm attack on them. However, this would lead to higher 'scores' for verification between the two of them, and between each of them and me, because of the inconsistencies.

We can also note that Claimchain helps defend against even stronger adversaries and allows warning in cases of equivocation performed by malicious peers.

I understand this, but it does not really address the inconsistency (or hint at how you want to address it). Right now in ClaimChain we speak about equivocation by users. If we want to stick to your story then this needs to be changed, and only added as a clarification at the end.

An alternative is to actually consider non-honest users in the attacker model. Does this actually break any of the other defenses?

My understanding so far is that they are random and a few bytes long. If we assume 5 bytes for AUTH, that's 40 bits and thus roughly a 1 in a trillion (1024^4) chance for the attacker. There should be no random error cases here. We can error out of the process on the first failed attempt.

The underlying tradeoff here is that QR codes have a limited amount of data we can include and become harder to scan the more data we include. This will probably become a non-issue once we have ECC keys and need a lot less data for the fingerprint.

Those numbers sound good to me. The comment was mostly that this should be specified and reasoned about in the security analysis.

This sounds like we should add a note for implementers that they need to ensure non-malleability. But they actually do have a way to do so that is supported by the existing OpenPGP implementations.

Yes, this was my point. The text is too loose in this sense (in any case, hopefully the crypto people from INRIA can help a lot in this respect).

OK but also wrt to mails with harry we can now use a "network" adversary and then "message layer" is just one part of that.

The question here is that I am still not sure what "message layer" means. I only know about application, transport, and network layers, and I am not sure which one you refer to. I would say network and application, but you tell me.

Our key verification protocols work via untrusted channels but depend on an initial bootstrap/data transferal over a trusted channel. They result in "authenticated encryption" (which we currently don't mention but maybe should?) regarding messages in "verified" channels. IOW, the key verification protocols result in authenticated encryption. The UI language uses "verified" -- which is common: https://ssd.eff.org/en/module/key-verification

I do not think you get "authenticated encryption" in the cryptographic sense (https://en.wikipedia.org/wiki/Authenticated_encryption). Using that term is dangerous.

Regarding the rest of the vocabulary as far as I read I think it is pretty consistent.

INVITENUMBER and AUTH are unrelated and used for different purposes.

INVITENUMBER: allows an "inviter" to accept and reply to internal messages (mainly step 2: Bob sending a first internal message which Alice replies to). Usually a new contact's e-mail would land in "Contact requests" and needs to be manually confirmed. If we allow Step2 messages to cause an automated reply it could be abused for detecting if somebody is online ... that's at least the current thinking.

Yes, I know that they are different and have different purposes. The question is how they are used in the implementation.

Even if they are separated, AUTH cannot be repeated, so I am not sure it makes sense to have INVITENUMBER printed out.

we think we can eventually get an initial contact message between Delta and Delta to look the same for both cases: opportunistic or verified contact setup. The invitenumber etc. are encoded into the Message ID which is anyway random. It requires careful thinking and also modification of current Delta logic but it's a worthwhile goal to get this property, isn't it? This way it's much harder to do blanket/larger scale MITM attacks.

I don't get this, but I think it is because I am still not familiar with the other protocols.

Yes, if we can get it, it is indeed worthwhile.

azul commented 6 years ago

Hi Carmela,

Thanks for your comments. I was thinking authenticated encryption was equivalent to confidentiality, integrity, and authenticity, but reading the Wikipedia article fully I understand it also refers to a specific API. So I guess we could talk about 'Verified Contacts' and 'Verified Groups' that, once set up, provide confidentiality, integrity and authenticity. Is that correct?

azul commented 6 years ago

@carmelatroncoso I checked with @r10s and it looks like we are using 66 bits for the challenge sizes right now. I created a PR to alter your branch to reflect this here: #41. Writing 'at least 8 bytes' there as it matches the actual implementation and I think specifying a lower boundary should be enough.

I did not push to this branch directly to avoid conflicts. If you accept that pull request the change will be reflected here.

azul commented 6 years ago

#42 adds the non-malleability requirement.

carmelatroncoso commented 6 years ago

I went through another protocol.

New questions:

Is it a problem that a rejected user can still send messages to the group? Even if she cannot read because her key is removed by the peers, she does know the keys of the others. Therefore she can still write. This can create a lot of confusion. Should this be solved at the user level, or should Autocrypt make sure that it does not happen?

Leaving attackers in the dark about verified groups.

I do not understand any of this part. What is the threat?

Non-messenger e-mail apps:

I do not understand. What is the vision of other apps? When would this be used?

carmelatroncoso commented 6 years ago

So I guess we could talk about 'Verified Contacts' and 'Verified Groups' that once setup provide confidentiality, integrity and authenticity. Is that correct?

Yes, I think this is correct

carmelatroncoso commented 6 years ago

Finished the protocols.

Usability question of "sticky" encryption and key loss

Do we want to prevent dropping back to not encrypting or encrypting with a different key if a peer's autocrypt key state changes? Key change or drop back to cleartext is opportunistically accepted by the Autocrypt Level 1 key processing logic and eases communication in cases of device or key loss. The "setup-contact" also conveniently allows two peers who have no address of each other to establish contact. Ultimately, it depends on the guarantees a mail app wants to provide and how it represents cryptographic properties to the user.

In my opinion, opening up the choices this much without an analysis is very dangerous. A way to study this may be to have a table representing the different cases in which there could be a fallback and highlighting the dangers.

A related question is: can an administrative message sent out of order cause a fallback?

hpk42 commented 6 years ago

I like the protocol changes ... for me it's fine to merge the branch.

Instead of trying to answer here on the long-winding PR comment thread, I'd like to make a pass over the "protocol" and summary sections to refine them and answer some issues (also striking some paras which I think are superfluous). Going to do this in a PR to this branch here again.

carmelatroncoso commented 6 years ago

Cool, I will only work on claimchain.rst (when I get to it at some point this afternoon). So merging is good. If it has been merged by then I will start a new PR with those changes.

nextleap-project / countermitm

edited summary and secure verified communication protocol #40
