Add protection against replay attacks

emanuele-f commented 5 years ago

As explained in #70, the current implementation does not provide a protection against replay attacks.

Logan007 commented 5 years ago

I would add this feature by coding something special in the initialization vector (IV) for cipher block chaining (CBC). That way, there is no need for an extra data field. Basically, there are three options:

1) A counter. Just counting it up starting at 0 or some other value maybe derived from the key. The receiver must be tolerant to packet loss or even a longer supernode turndown. Would that require a regular re-synchronisation, maybe with every supernode register? The current counter value could be internally stored in the sa structure.

2) A more sophisticated way of counting using a predictable pseudo random number generator (PRNG), initialized by some common seed, probably derived from the key. Also, the PRNG's paramters could depend on the key. This also would require the receiver to be somewhat tolerant, the checking needs some window in which some PRNGed numbers are precalculated in advance. The sa structure could hold the current status of the PRNG.

3) A time stamp, e.g. the 32 bit unix time stamp with precision of seconds - windows should have something comparable. Given, that most systems worldwide that are connected to a network usually access a ntp server - just think of the huge ntp.org time server pools, clocks should be somewhat in sync. The checking just needs to add a +/- time frame to compensate for clocks keeping bad time and some transmission delay, something between 4 to 8 seconds probably. If a time stamp of higher precision is available, it could help to identify the correct order of the packets to prevent fast replay attacks - assuming that packets do not overtake each other on the line.

Alternatives 1. and 2. are sequential, meaning that they repeat in the same order starting from some point, i.e. the first packet after (re-)initialization gets the same counter or PRNGed number.

Alternatives 1. and 3. clearly run counter to the randomness of the IV - it shall be as unpredictable as possible, especially in our case that has a highly predictable first cipher block. This could be solved by mixing the bits of the counter or the time stamp somewhat randomly but reversibly in the IV. How could that be achieved any better than using the encryption algorithm already at hand? Thus, a possible scheme would be:

Generate Random IV of 128 bit (minus time stamp size) block size
Fill it up with anti-replay-part, maybe the time stamp, at the end to fill up full 128 bit block size
Encrypt this block using K1
Use the result as IV for cbc mode encryption of the data
Encrypt this block using K2 (as already suggested in IV implementation of #76)
Store this doubly encrypted block in front of the encrypted package (as already suggested in IV implementation of #76)

The decryption scheme could remain the same as already suggested in #76 because it only requires the decryption of IV using K2 and using it for cbc mode decryption.

The checking would require the additional decryption using K1 and extraction from the other random part of the IV.

It is important to use K1 in the first step, K2 must not be used as a double encryption using K2 might be close to en- and de-cryption.

Avoiding the sequentiality of alternatives 1. and 2., alternative 3. using the outlined scheme above could be the right thing to implement and upgrade the AES version to 3. It would remain compatible to the suggested version 2 by just not performing replay checks for version 2 packets; version 2 would still be able to digest version 3 packets properly.

Logan007 commented 5 years ago

Taken care of in current suggestion for #76.

emanuele-f commented 5 years ago

Having an encrypted 128 bit preamble to the packet is mandatory here right? We need to protect the integrity of the counter/timestamp and we can do it via encryption right now with AES ECB which requires 128 bits. Right now we have a preamble of 8 bits SA + 32 bits nonce (unused) + 64 bits iv seed = 104 bits, however that could be reduced to 64 bits by removing the SA and nonce, so we should think carefully about expanding to 128 bits. Using a timestamp with a tolerance window has the advantage of working fine if an edge node is restarted. However, the requirement to sync the clocks is something I don't like very much for this simpler implementation. Some links to the replay topic:

Brief intro to OpenVPN replay protection - https://openvpn.net/archive/openvpn-users/2005-11/msg00073.html
No replay protection, 0-rtt and the browsers - https://blog.cloudflare.com/introducing-0-rtt
Wireguard implementation (section 5.1) - https://www.wireguard.com/papers/wireguard.pdf

Logan007 commented 5 years ago

OpenVPN and Wiregueard implementation seem to use a utc time stamp, some add a counter for refinement. I think, ntp is around everywhere. Even windows syncs with internet time. So I would not be against time stamps and every router acts as ntp server. And even in case of interrupted internet connection, ntpd keeps good track of wrong clocks using a driftfile.

The problem with "counters only" (not using time stamps) is that link interruptions might cause huge jumps in numbers - just think of a "ping -f". What would be the limit of acceptable jumps? In that respect, time frames are much easier to evaluate, especially if we only need to deal with time differences.

And also, I can't image to have connected computers' clocks being apart more than the suggested 32 seconds. I know, RasPi usually does not come with a RTC.

I am not sure if you want to add a replay protection feature in aes version 1 or in a more secure version 2. If it were for version 2, the user could decide weather she is capable of syncing clocks (if even required) before starting encrypted connections.

Concerning block size: Maybe we find a FF1 implementation for smaller block sizes? But honestly, the additional cost of 8 more bytes for the combination of IV and replay protection does not seem to hurt to much, does it? We could save maybe 4 bytes using FF1, but we would need to implement it. Is it worth it? On the other hand, these 128 bit could be used to also transport some additional information, e.g. for Diffie-Hellmann key generation.

I am still extremely convinced by the IV solution and replay protection provided in #76 :bowtie:

emanuele-f commented 5 years ago

Replay protection would be good to have in v1 version too in order to cover the use case of n2n in a home automation use case. A simplistic application may not implement replay protection properly so I think n2n here should do a good job to prevent replay protection and avoid, for example, to use a replay attack to switch iot things on and off. In my opinion n2n should use replay protection by default but permit to disable it via a cli option.

I would rethink v2 version of aes in the use case of enhancing security in very hostile environments (maybe even by encrypting the n2n header sent to the supernode to obfuscate n2n and prevent blocking).

Would you like to implement replay prevention in a separate pull request? Please consider:

The 2038 problem as discussed
Avoid RAND_bytes when possible to reduce the performance impact

Logan007 commented 5 years ago

I would love to!

I agree that rand should suffice for AES encryption scheme version 1.

Other than that, the 2038 problem could be addressed by

ignoring it as a simple restart of the edes will do,
casting signed int to unsigned int and this doubling the epoch time, or
something else - unfortunately native 64 bit time stamps do not seem to be supported on every 32 bit linux, so making it bullet proof today for all platforms by the means available right now would probably include the implementation of manual calculation of a 64 (or less) bit value from available distinct date and time fields (yy,mm,dd,hh,mm,ss, ms, us, ...) for every packet; with a view to performance, I would not recommend that.

Shall we first go for a FF1 implementation to keep the combined IV and replay protective fields as short as possible? On the other hand, that would enormously increase code complexity. I wish, openssh already had it implemented...

emanuele-f commented 5 years ago

I would avoid the use of FF1 because of the reasons you have highlighted. The choice to add another 8 bytes overhead is of course not that easy to accept but I think this feature is important. Moreover, I think the n2n common packet headers (used by both AES and twofish) can be further optimized so we can save some space there (at the cost of compatibility break of course).

For the 64 bytes timestamp support, if the OS does not support it on the platform there isn't much to do on that systems, but on supported systems I think we should use them. I think the timestamp should become part of the IV together with the random value to provide protection against malleability.

Logan007 commented 5 years ago

So I would basically plan to stick to replay protection scheme from #76:

generate some random parts using rand,
taking a time stamp, and
maybe leaving some reserved space (16 bit or so) for future use, e.g. DH key generation - as long as not used yet, 15 of those bits could stay random.

This all will be fitted somehow in a 128 bit block and encrypted in ECB mode using the regular payload key. Why the regular key? Because the next step uses the secondary key, we do not want to apply multiple encryption steps directly after another using the same key.

The steps up to here could be iffed out if cli option for "no replay protection" is given. In that case, IV would need to be prepared just as 128 bit random value. I would stick to 128 bit then anyway skipping the random padding scheme which takes the last quarter of the SHA512ed key. We might be able to go down to SHA384 then, but thoughts on that in the context of entropy to follow.

Why should we stick to 128 bit IV size in that case? For compatibility reasons. If we want to be able to disable replay prevention by cli parameter for some edges in the network maybe due to some foreseeable clock issues, we still could allow for mixed environments if the packets still follow the same format. We just need to keep track in the host keyed sa-structure if the feature is supported by the other nodes - I guess it should be detectable. If both nodes of the current communication support it (AND), it could be used. For example, if we have A and B supporting replay protection and C and D not, communication between A and B would still allow for replay protection. The others (C,D) just do not need to perform the other decryption step for to get the time stamp.

That IV gets encrypted again before use following the already proposed and implemented IV encryption scheme relying on AES128 under the IV encryption key. Or, we decide to encrypt it with AES128 before transmission; it would not make any difference now that we are using a full block size IV.

Any thoughts?

emanuele-f commented 5 years ago

Please check out some ideas in https://github.com/ntop/n2n/issues/96 to store the peers information such us the timestamp support of a specific edge. I agree that edge with and without replay protection should be able to talk each other. Regarding the encryption, I would avoid using the primary key for the ecb encryption, here is what the sender should do in my opinion:

Get the current timestamp and use it along some random data to fill a 128bit block
Use the block as IV in cbc
Encrypt the block with the secondary key and send it

The receiver:

Decrypt the block with the secondary key
Perform the anti-replay check on the timestamp
Use the decrypted block as the IV for cbc

Logan007 commented 5 years ago

Unfortunately, the time stamp is quite predictable and as such should be avoided to be directly used as IV. That is why I introduced an encryption step (between 1 and 2), it is more for scrambling purpose. We can not use the secondary key. Either we use the payload key (which indeed is a bad practice, I agree) or some third key, maybe the last quarter of SHA512ed key; or some other scrambling - it just has to be reversable to get the time stamp.

I will check #96 later today / tomorrow.

emanuele-f commented 5 years ago

So maybe we can scramble some part of it via xor (better to do this without performing a loop), we still have many bits available from SHA512, I would avoid another encryption step which will have a performance impact. We have to find a very light way to do this.

Logan007 commented 5 years ago

I would not recommend to XOR the quite predictable time stamp with a constant at all, as the resulting IV also gets then XORed directly to quite predictable (and often constant) data of the first block which then makes the input of the first payload encryption step quite predictable and thus easing known plaintext attacks...

The scrambling needs to be

key-dependent because it would be useless otherwise ,
reversible,
"scrambly" enough to not allow conclusions on the key or the plain text (output about randomly and "far away" from input).

Maybe we find a way of applying some "reduced rounds"-AES, I will need to dig a bit.

emanuele-f commented 5 years ago

Let's consider another option. If we implement authentication we can avoid using the timestamp as part of the IV and leave the IV as it is now. Packet size would be increased as follows:

4 bytes timestamp (seconds, we have to handle 2038 some way)
4 bytes MAC
2 bytes length

So with 2 extra bytes (comparing this to the current proposal of extending IV to full block size) we would also get authentication, and the additional computation would be used to generate the MAC.

Logan007 commented 5 years ago

I will think about it... I am not sure why we should not merge time stamp and IV? That would give more safety to the cbc payload encryption. Is it to avoid the supplementary single block encryption step? Also, is a second-precision timestamp good enough for replay protection?

emanuele-f commented 5 years ago

Yes, to avoid the extra encryption step, but if we think how to feed it into the IV it would be better. If we plan to use a satety margin of some seconds to prevent replay than I see no advantage in using a more precise timestamp.

Logan007 commented 5 years ago

I would be in fond of moving the time stamp into the IV; as we need to transmit it anyway, we could use it to increase IV's entropy.

Concerning time stamp precision: There are two steps in replay prevention. The first step compares the sent time stamp with the receiver's clock. This step does not need much accuracy, +/- 16 seconds or so. The second step however needs more precision: It compares the time of the currently received packet (from the time stamp) to the previous received packet's timestamp (stored in sa-structure). This should be as precise as possible but still could allow for overtaking packets (in my setup +/- 160 ms delivers good results, below that value too many packets get discarded). For the last step, every packet needs to carry a time stamp that is greater or equal than the stamp of the previous packet - time can be considered a kind of (universal) counter...

After processing a received packet, its time stamp needs to be stored as "last-packets-timestamp" in host-keyed structure (initialize to 0).

With a view to 2038, we should use 32 bit unix time stamp for now and reserve one or two bits more for later use - implementation for those reserved bits must wait until 64 bit system time is supported on all systems (still seems to be work in progress especially on 32 bit linux).

emanuele-f commented 5 years ago

Why do we need both steps?

Logan007 commented 5 years ago

Te first step ensures that the packet is roughly in the same time frame. Otherwise, an attacker could just replay a completely captured packet sequence from the past when a node gets restarted. As this compares the sender's clock put into the timestamp to the receiver's clock, there is a lot of tolerance (+/- 16 seconds = 32 seconds as a first suggestion which already is a lot in the world of regular ntp updates and driftfiles - my computers in different part of the worlds have regular ntp running and their clocks are not notably apart, we could think of much less tolerance reduced to the maximum allowable packet transportation time, e.g. 4 to 8 seconds).

The second step prevents "quick replay attacks" by using the time stamp as a counter. If we skipped this second step, one could just resend packets with the same low precision time stamp again - a low precision time stamp would require to compare "geater or equal" to the previous one. We should not use a real counter only as tolerance for faulty lines etc. would be harder to implement; it is easier to use time as a counter and additionally make sure times are roughly the same (see first step above). Although, we could save some bits if we decided for a counter that counts the packets under the same low precision timestamp and gets reset on the sender side with every changing low precision timestamp, i.e. every new second. But how many bits would that require, i.e. how many packets per second should be the limit?

However, the replay prevention information to be effective may not be altered, so we either need to encrypt or authenticate it. When used as part of IV they should be "reversibly" scrambled by using the key, otherwise the IV contains parts that are too predictable, see the posts above.

Altogether, the suggested approach combines the advantages of time-based and counter-style IV at the comfort of not needing to deal with additional counter logic and tolerance, but just time.

By the way, to make this approach bullet-proof, we would need to make sure that the re-setup of a connection takes at least the "tolerance time" (the 32 second time span mentioned above). Why? To prevent the capture of the first packets, interrupting the connection and then resending.

I would also like to refer to a previous comment and this review discussion.

Logan007 commented 5 years ago

May I suggest to implement it in a "parametrizable" way, i.e. let some parameter or constant define the level of accuracy and thus length of time stamp / IV information? That way, every compiler-savy user could re-define the security level according to her needs.

OpenSSL does not seem to support a fewer-rounds-Rijndael, so we might need to stick to AES-128 for scrambling. We probably do not want to add another crypto lib.

Logan007 commented 5 years ago

Good news: It seems that we will see 64 bit times very soon. Just give it a little bit of time and then make kernel 5.1 a minimum requirement. Until then (way before 2038), we could use 32 bit time functions but already reserve the necessary bits in the data to be transmitted.

ntop / n2n

Add protection against replay attacks #73