Addressing the various security improvements

synctext commented 9 years ago

Several issues where listed by "Yawning Angel" on Tor mailing list.

A casual review brought up some issues and we here provide a structured reaction to them. All issues are addressed already or will be fixed soon. In general, the reviewer is correct, our code needed to be written more clearly and dead code was insufficiently cleaned. Issues:

[x] issue: How not to do Diffie-Hellman: key = pow(dh_received, dh_secret, DIFFIE_HELLMAN_MODULUS) which relied on gmpy.rand which was improperly seeded
Reaction: We're actually not sure if this was the case, but we change the seeding of the gmpy.rand method to make sure were always using a seeded rand now. Moreover, we aim to replace this custom DH with the one implemented in M2Crypto asap.
Reaction2: The gmpy.rand is cannot be saved, removed all references to it.
[x] issue: one-time AES keys are generated with python's random.randint()
Reaction: Fortunately not used for long-term keys, cause by the now famous ImportError which prevented PyCrypto from being imported.
[x] issue: How not to do RSA: def rsa_encrypt(key, element):
Reaction: RSA was never used in the tunnels, implemented while researching homomorphic encryption. This was for a paper published in WIFS 2013 http://dx.doi.org/10.1109/WIFS.2013.6707798. In this paper we implemented/evaluated three different approaches to Private Set Intersection-problem and tested their applicability in a P2P system. One of these approaches used RSA, which if used in unpadded mode (http://en.wikipedia.org/wiki/Homomorphic_encryption#Unpadded_RSA) has homomorphic properties. However, neither M2Crypto/PyCrypto allowed us to generate such an compatible key. Therefore, we wrote a small piece of python which allowed us to do so, hence the "compatible_key" method https://github.com/Tribler/tribler/blob/devel/Tribler/community/privatesemantic/crypto/rsa.py#L23. This shouldn't be used in the wild, and wasn't/isn't used in the tunnels. A pull request is submitted which fixes the dodgy optional_crypto file, by removing the optional part.
[x] issue: ECB-AES128 usage in the code. Yes, we are aware of this issue. ECB-AES128 needs to be replaced asap with our new code that adds packet loss resilience. We aim to use CTR mode instead, and compensate for UDP packet loss with optimistic decryption that uses sequence number estimation, please read our detailed report: http://repository.tudelft.nl/view/ir/uuid%3Ace3bd867-6540-426d-87d0-348bdf78279d/

Yawning commented 9 years ago

For the record, I have nothing against the concept of what you all are trying to do, and think it's an admirable goal, and do wish you all the best of luck. That said, there is a lot of work that needs to go into certain kinds of systems before real users should see them.

With regards to the improve_crypto branch, here, have some followup:

Yay, a lot of the old code is removed so this is easier to follow.
Ok, using pycrypto's StrongRandom() is a reasonable thing to do.
Diffie-Hellman key generation is still catastrophically broken, and it is possible that it got worse. On a Linux system, with glibc, the first DH private key generated (assuming the code is run in isolation, eg in a unit test) will always be: 0xb43d2488d8be2449e737dd6308168ab2c22668e57cb1a10a140ebabcc01ab988b331289d248731af85720ddbee86609da2877b56962c84c8f481e85e942a5b6e871add8296accc54880a785ef1f5d466e0632dd76b65fccff6b69d1af7a99d9e0490f300829c0c2960bdd3fe3d47d17f135e32f79e32c6bbe65f965d63185694. There's nothing particularly magical about "first" (The gist linked has 10), and with the changes in the branch, key generation is weak to the point where link layer security is non-existent.

See: https://gist.github.com/Yawning/4cca041c456ccf113700 (Note: Example depends on your libc's rand() matching mine, version included in the gist.)
The ECElgamal code uses "rand('next', 10000)" to calculate k (Wow, I can't believe I missed this in my original writeup). Given that "rand('next', 10000)" is not cryptographically secure, there is no security here as recovering the plaintext is trivial.

Really, gmpy's "rand()" does not belong anywhere near cryptography, except in dire warnings such as this sentence.
Re: the link crypto in the thesis. I don't have time at the moment to give this thorough review, but it scares me, because Not-Invented-Here approaches to cryptography SHOULD scare people, especially when doing things that do have time tested solutions (IPsec, DTLS, i2p, CurveCP...), and at a quick glance it seems over-complicated (and probably broken in interesting ways). The fact that opportunistic decryption enters the picture is particularly troublesome because none of other protocols require such steps.

Obligatory disclaimer: Even fixing the further issues is probably still not enough to protect the system from adversaries, including those in your threat model. Without fixing said issues, both master and the improved branch (which does have some nice cleanups), will not protect vs a bored undergrad.

NielsZeilemaker commented 9 years ago

@Yawning many thanks again for these comments. I've removed the custom DH and replaced it by the one included in M2Crypto/OpenSSL. Again, why we choose to implement DH ourselves instead of using this version is beyond me.

Regarding the gmpy's rand method, after reading the docs I was really under the impression that letting gmpy set the seed would suffice. But its clear now that it does not. I'm going to refactor/remove all references to rand.

NielsZeilemaker commented 9 years ago

@Yawning all references to rand are now gone, additionally the ECElgamal code now uses a k which is relatively prime to the modulus of the curve.

Yawning commented 9 years ago

@NielsZeilemaker Glad to hear it. What's most important to me is that all these things get fixed, and I'm glad to see that you're working hard (though you should also remember to enjoy the holidays).

If you want someone to talk to about how to do the symmetric cryptography in a manner suited for the UDP transport, feel free to drop me an e-mail (at either the one I used to post to tor-dev or at my torproject.org one), since it's a problem space I'm relatively familiar with.

NielsZeilemaker commented 9 years ago

@Yawning I guess the first thing to do now, is replace the custom ecelgamal with one implemented by a library. Do you have a pointer towards a reliable library which implements it?

whirm commented 9 years ago

I'm moving this to v6.4.2 milestone as we are releasing v6.4.1 today.

whirm commented 9 years ago

Note that 6.4.1 has all the fixes except for the ECElgamal replacement.

NielsZeilemaker commented 9 years ago

@Yawning I was thinking to replace the custom ecelgamal stuff with an approach describe here http://stackoverflow.com/a/1175343 E.g. using ECDH combined with a one-time EC key to generate a AES key to be used to encrypt a packet. What do you think of this approach? We already have the public key of the receipient, so I guess this should work. And ECDH is included in OpenSSL/exposed by M2Crypto

Yawning commented 9 years ago

Hmm, if you're ok with breaking wire protocol compatibility, I included the suggestion to use the newer ntor handshake in my initial draft. I'll expand a bit more on that, since while what you suggest will work, there's no need to reinvent the wheel.

"ntor" is the current Tor link layer crypto handshake, that was introduced to address certain issues with the older TAP hybrid Diffie-Hellman/RSA handshake (which is what your current DH/ECElgamal handshake is based on). It is Curve25519/SHA-256 based (the primitives don't matter too much, though for performance reasons the key exchange algorithm should be an ECDH variant, and I tend to favor Curve25519 over ECDH with the NIST curves lately). The older TAP handshake is extremely close to being deprecated (0.2.3.x needs to go away first).

The original paper (which includes the security proof): https://cypherpunks.ca/~iang/pubs/ntor.pdf The Tor spec portion that details how we implement it: https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt#n868

You'll need a Curve25519 implementation:

agl's curve25519-donna has a convenient python module, though it's a bit old.
python-tweetnacl, which has a bunch of other useful primitives.
Probably others, I don't write much Python lately.

How I think you all would implement it if you decide to go this route (Using variables from our spec, not the paper):

Setup:

Every node generates a long term Curve25519 key (b,B), and a 20 byte unique identifier (NODEID) on startup. You could make these persist between sessions, all our relays do so.
They publish B and NODEID somehow with the magic Dispersy component that I haven't looked at much, such that people who wish to open connections to them can get at the values.

The handshake (Alice is the initiator, Bob is the recipient):

Alice obtains Bob's B and NODEID.
Alice generates an ephemeral Curve25519 keypair (x,X).
Alice sends an EXTEND cell to Bob, with the following payload:
- Bob's NODEID
- B
- X
Bob decodes the EXTEND cell and validates that Alice has the correct NODEID and B.
Bob generates an ephemeral Curve25519 keypair (y,Y).
Bob completes their side of the ntor handshake, obtaining the shared secret and verification tag.
Bob sends an EXTENDED cell, with the following payload:
- Y
- AUTH
- (Additional data if you need to send anything, Bob's done with the handshake so they can append encrypted payload...)
Alice decodes the EXTENDED cell, and does their side of the handshake, obtaining the shared secret and their derived verification tag.
Alice verifies that the AUTH value they derived matches the one contained in the EXTENDED cell.

After step 6 (Bob) and step 9 (Alice), both sides have KEY_SEED that is identical (256 bits of material), and Alice is confident that Bob is actually Bob. I suggest using HKDF-SHA256 to expand that to actual session keys (The post you linked uses SHA-1 ewww), and using separate keys for Alice->Bob and Bob->Alice traffic at a minimum.

It's relatively simple to implement and there aren't that many implementation pitfalls unique to this, though I will note some of them here:

Key generation should use a CSPRNG (I'm fairly sure this is a given at this point).
"Both parties check that none of the EXP() operations produced the point at infinity."
Validating the AUTH tag in Step 9 should be constant time compare. This is kind of annoying to do in Python, so you can use a technique know as [https://www.isecpartners.com/blog/2011/february/double-hmac-verification.aspx]("Double HMAC Verification") to defend against the timing attack.

Possible improvements over what Tor currently does:

There's another 1W-AKE handshake protocol known as https://www.infsec.cs.uni-saarland.de/~mohammadi/paper/owake.pdf that sends slightly more client to server payload, but has less computational cost. If you're feeling brave, you could implement it from the paper, although in practice, ntor is quite fast especially coming from using Diffie-Hellman previously.

synctext commented 9 years ago

@Yawning Thank you for the review! We got a bit more exposure lately then we expected. Hopefully in a few months we can do another review round and share the progress with the tor-dev people. Hopefully things can mature in a few releases.. We now have a solution for the exit node problem, but that will take some time. Our focus will be on the incentive system to relay, see our Bartercast work on Scholar.google. In 2015 that should hopefully be ready to go into production.

NielsZeilemaker commented 9 years ago

@Yawning Many thanks from me as well. However, I have one question. After obtaining the shared secret, we can start using AES to encrypt the packets. Currently, we're using ECB which has some obvious flaws, but in order to switch to CBC we need to implement packet retransmissions/reordering. Something we're hoping to avoid. Do you have a suggestion which might allow us to get similar security as CBC, but without implementing packet retransmission/reordering? Or is there simply no alternative to CBC.

Yawning commented 9 years ago

Retransmission/reordering shouldn't come into the picture assuming you're willing to incur extra overhead on a per hop per packet basis. The current design does allow arbitrary hops, but that's not a great idea due to attacks highlighted in my initial analysis (Tor explicity does not allow users to configure the path length, see https://www.torproject.org/docs/faq.html.en#ChoosePathLength and http://freehaven.net/anonbib/cache/ccs07-doa.pdf).

If we are ok with this (it's kind of unavoidable, unless you want to get into opportunistic decryption, which is performance intensive and kind of scary in the pathological cases), we might as well also fix the authentication issue while we're at it.

That said here is an example of what I would do if I had to use AES (In this case GCM-AES), with the usual caveat that it's off the top of my head and will probably need tweaks.

First, let's reuse the nonce definition from RFC 5288.

struct {
  uint8_t salt[4];
  uint8_t nonce_explicit[8];
} nonce;

The salt portion is fixed per circuit (Derive separate ones for Client to Server/Server to Client as part of the KDF output during the handshake), the nonce_explicit is a counter in network byte order that is initialized to 1, and incremented per packet. Extremely bad things happen when nonces are reused for GCM mode, so either kill the circuit when it wraps or implement rekeying (hard).

Change the cell format to be something like:

struct {
  uint8_t nonce_explicit[8]; /* The explicit portion of the GCM mode nonce */
  uint8_t auth_tag[16]; /* GCM authenticator tag */
  struct {
      uint32_t circuit_id; /* Same as current */
  } adata; /* GCM ADATA (sent in the clear) */
  uint8_t pdata[]; /* GCM PDATA aka payload (encrypted) */ 
} cell;

The decryption flow looks like this:

Peek at circuit_id to figure out which key and nonce.salt to use.
Derive the nonce used to encrypt the packet nonce.salt (know), nonce_explicit from the packet.
ok, adata, pdata = GCM-AES-Decrypt(k, n, adata | pdata)
If the decryption fails, drop the packet.
Ta da, you have a valid circuit_id and plaintext payload.

Retransmission by the utp layer doesn't matter, because the tunnel code treats a retransmitted packet just like any other data (increment nonce_explicit, if it didn't wrap, send). Reordering doesn't affect successful decryption because nonce_explicit is included in each packet, so the receiving end knows everything it needs to authenticate and decrypt the payload.

Pitfalls, downsides, recomendations:

Don't implement GCM-AES on your own. GHASH is hard to do correctly.
nonce reuse across a given key massively weakens the security, always increment the counter, and check for wrapping to avoid this.
This consumes 24 more bytes per hop (nonce_explicit and the GCM tag). Check the size of your tunneled utp traffic and make sure this is ok (with 3 hop circuits it's 72 bytes which isn't terrible).
You should look into implementing replay detection. This is easy to do due to nonce_explicit providing a handly always incrementing sequence number that never wraps (you can do what IP ESP and DTLS does, with a 64 bit window).

Season's greetings.

NielsZeilemaker commented 9 years ago

@Yawning I've got two question for you

I've implemented support for Curve25519 in Dispersy, which will allow nodes to discover eachothers public keys. However, in Dispersy we don't really have a NodeIDs. We use the sha1 of the public key as the nodeid of a peer. I guess this doesn't make a lot of sense to include in step 3 of the ntor handshake. Do you think leaving the nodeid out would be a problem?
For the GCM-AES I thinking to use the cryptography pyhton package, as that exposes the openssl GCM methods, see https://cryptography.io/en/latest/hazmat/primitives/symmetric-encryption/#cryptography.hazmat.primitives.ciphers.modes.GCM. And I correct in thinking that the nonce you're referring to is the initialization_vector in this case? And if so, how should I derive this nonce/iv, I was thinking to use https://cryptography.io/en/latest/hazmat/primitives/key-derivation-functions/#cryptography.hazmat.primitives.kdf.hkdf.HKDF setting salt to be nonce.salt and using nonce.explicit as the input for the derive method.

Thanks again

Yawning commented 9 years ago

@NielsZeilemaker Hihi.

Hmm, since you have the sha1 digest of the key, you can use that for NodeID. The tor usage is the SHA-1 digest of the node's RSA onion key, but this being the curve25519 key is ok. In general having an explicit component of the key exchange output that never goes across the wire is kind of nice.
Indeed, initialization_vector is the nonce in my writeup. So, after doing the ntor handshake, you have KEY_SEED which contains 256 bits of keying material shared by the client and server. HKDF-SHA256 is a fine choice (and is what we use).

To convert that into session keys:

def kdf_ntor(self, key_seed):
  # key_seed contains the output from the ntor handshake.

  # For the sake of the example I will assume GCM-AES128, with 4 bytes of nonce_salt.

  # Certain parameters are hardcoded. (See tor-spec.txt 5.1.4)
  PROTOID = b"ntor-curve25519-sha256-1"
  info = PROTOID + b":key_expand"
  t_key = PROTOID + b":key_extract"

  # Extract/expand key_seed into the key/iv material. (See tor-spec.txt 5.2.2)
  hkdf = HKDF(
    algorithm=hashes.SHA256(),
    length= (16 + 4)*2,
    salt=t_key,
    info=info,
    default_backend=default_backend()
  )
  key_material = hkdf.derive(key_seed) # key_material contains 2x(key + nonce_salt).

  # Stash the key/nonce_salt pairs.  The HKDF output looks like:
  #  uint8_t key1[16]
  #  uint8_t nonce_salt1[4]
  #  uint8_t key2[16]
  #  uint8_t nonce_salt2[4]
  #
  # For extra paranoia, use key1/nonce_salt1 for client->server traffic, and
  # key2/nonce_salt2 for server->client traffic.
  if isClient:
    self.tx_key = key_material[0:16]
    self.tx_nonce_salt = key_material[16:20]
    self.rx_key = key_material[20:36]
    self.rx_nonce_salt = key_material[36:40]
  else:
    self.rx_key = key_material[0:16]
    self.rx_nonce_salt = key_material[16:20]
    self.tx_key = key_material[20:36]
    self.tx_nonce_salt = key_material[36:40]

  # Initialize the outgoing nonce_explicit counter.  The incoming counter is included
  # in each packet, though you should examine it to make sure that someone isn't
  # replaying packets.
  self.tx_nonce_explicit = 0

  # If this was C code, I would clear key_seed/key_material here, but I'm not sure
  # if there's a point doing so when writing python code...

Building the iv looks like:

from struct import pack, unpack

def build_tx_iv(self):
  # Increment the nonce_explicit counter.
  self.tx_nonce_explicit = (self.tx_nonce_explicit + 1) & 0xFFFFFFFFFFFFFFFF
  if self.tx_nonce_explicit == 0:
    raise ValueError('TX nonce_explicit wrapped')
  e_packed = pack('>Q', self.tx_nonce_explicit)

  # Return a tuple containing:
  #  * The full IV (to pass into the GCM-AES implementation).
  #  * The packed representation of nonce_explicit (to insert into the packet).
  return (self.tx_nonce_salt + e_packed, e_packed)

def build_rx_iv(self, e_packed):
  # e_packed comes out of the packet.
  e_unpacked = struct.unpack('>Q', e_packed)[0]
  if e_unpacked == 0:
    raise ValueError('RX nonce_explicit wrapped???') # 0 should never go on the wire.

  # Return a tuple containing:
  # * The full IV (to pass into the GCM-AES implementation).
  # * The unpacked (integer) representation of nonce_explicit (to use in replay detection, etc).
  #   Do the replay detection before decrypting (the sliding window scheme that IPsec/DTLS
  #   uses is quite fast.), and drop duplicated frames as appropriate.
  return (self.rx_nonce_salt + e_packed, e_unpacked)

NB: As usual this is off the top of my head, and I haven't written python lately. It should get the point across.

whirm commented 9 years ago

@synctext I guess this is not going to be finished for 6.4.2, right?

NielsZeilemaker commented 9 years ago

@whirm If you add the new dispersy pointer to next, I'll have a go at changing the tunnel community this weekend to see if I can get it too work with the new curve25519.

whirm commented 9 years ago

@NielsZeilemaker I guess you can branch off that branch and send one back to devel when it works :) http://jenkins.tribler.org/job/GH_Tribler_pull-request-tester_devel/3143/

synctext commented 9 years ago

@Yawning Creation of fake accounts remains a key vulnerability for self-organising systems in general and thus Tribler. I have now a draft 10-page write-up of how we aim to deal with Sybil attacks (3 defense layers). The recent Lizard Squad events make this quite relevant it seems. Have you worked on this research topic?

@NielsZeilemaker impressive progress (https://github.com/Tribler/dispersy/pull/385). Thnx!

@whirm for 6.4.2 I guess we just ship it when we have sufficient bug fixes and improvements.

NielsZeilemaker commented 9 years ago

@Yawning I made most of the changes in a new pull request #1117. Some remarks

I'm not ckecking if the EXP() operations produced the point at infinity. I was hoping nacl is doing that for me crypto_box_beforenm method, is that right?
I'm checking if the explicit nonce is overflowing, but not removing the circuit just yet.

Could you have a look if I made any blatant mistakes?

synctext commented 9 years ago

@Yawning Hopefully the identified pressing issues in the code are fixed with Pull request #1188 by Niels.

We're testing this now and are thinking of disabling the "exit" node functionality. Only offer downloading from hidden seeds in upcoming versions.

whirm commented 9 years ago

@synctext: @NielsZeilemaker said that this was basically done. Can you close this if it's OK?

Baigle-zz commented 9 years ago

Any thoughts? http://arxiv.org/abs/1507.05724v1?utm_source=webopsweekly&utm_medium=email

whirm commented 9 years ago

I'm closing this, one, @synctext: feel free to reopen it if it's not really finished.

synctext commented 6 years ago

An update on the 2018 status of these security matters for people which followed a link from 2014 or beyond.

Dispersy was correctly identified by Yawning Angel as a vulnerability. We have build a replacement. We are now in the process of removing Dispersy for our new overlay.

dreamflasher commented 3 years ago

@synctext Was the last point ECB-AES128 usage in the code ever fixed? In 2014 you wrote this needs to be fixed asap – but the ticket was closed without ticking the checkbox/a later reference.

egbertbouman commented 3 years ago

@dreamflasher We moved to AES-GCM back in 2015, and more recently we started using ChaCha20/Poly1305. We just forgot to tick the checkbox.

dreamflasher commented 3 years ago

@egbertbouman Thank you very much for the swift update, I am very happy to hear that!

Tribler / tribler

Addressing the various security improvements #1066