real-logic / aeron

Efficient reliable UDP unicast, UDP multicast, and IPC message transport
Apache License 2.0

Encryption/Security Discussion #203

Closed tmontgomery closed 4 years ago

tmontgomery commented 8 years ago

A place to discuss initial encryption/security concerns.

sergiotudela commented 8 years ago

Hi,

Are you considering ECC based on CurveCP and NaCl?

phaynes commented 8 years ago

Our specific use case is to transmit > 10^9 4K events per day from a real-time security decisioning system via multicast within a data centre. Approx 12 attributes within each event contain PII data (e.g. email and IP addresses) that security folk require to be encrypted over the network. Noting that I am not a security person, I don't think we particularly need authentication / reputation and we have an existing key server for when the system recovers - but maybe I am wrong. It is important that it is a recognised, high grade industry security standard so the approach can be explained to others. My current SLA is 5ms for initial message processing and I can probably negotiate another 1ms for security. In practice this means I have about 50-100 microseconds on average to encrypt / decrypt a message - a stable and small latency tail would help here.

We use both the C++ and Java versions of Aeron so one solution that works with both would be ideal but not essential. A C++ / JNI version to start?

AES-256 has x86 instruction support so my un-measured guess is that this could be useful. The OpenSSL AES implementation is mainly ASM, but it was written in 2000 and at first glance it doesn't look like it particularly takes advantage of hardware aware optimisations.

Performance concerns include the startup time of the encryption, block size, feedback loops on the data that will impact latency à la CBC, and the ability to use multiple cores to encrypt / decrypt.

I was thinking of starting by doing some basic performance characterisation and engaging with those more knowledgeable on these matters.
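A rough sizing of the numbers above can be sketched as follows. This is only back-of-envelope arithmetic: it assumes "4K events" means 4 KiB (4096-byte) payloads and that load is spread evenly over the day, neither of which is stated in the thread. The class and method names are made up for illustration.

```java
// Back-of-envelope sizing for the workload described above.
// Assumption: "4K events" means 4 KiB (4096-byte) payloads, evenly spread over 24 hours.
final class EncryptionBudget {
    static final double EVENTS_PER_DAY = 1e9;
    static final double SECONDS_PER_DAY = 86_400;
    static final int EVENT_BYTES = 4096; // assumed payload size

    /** Average events per second if load were spread evenly over the day. */
    static double eventsPerSecond() {
        return EVENTS_PER_DAY / SECONDS_PER_DAY;
    }

    /** Average payload bytes per second that would need encrypting. */
    static double bytesPerSecond() {
        return eventsPerSecond() * EVENT_BYTES;
    }

    /** Messages per second one thread sustains at a given per-message cost. */
    static double perThreadCapacity(double microsPerMessage) {
        return 1_000_000.0 / microsPerMessage;
    }

    public static void main(String[] args) {
        System.out.printf("average events/s: %.0f%n", eventsPerSecond());
        System.out.printf("average MB/s:     %.1f%n", bytesPerSecond() / 1e6);
        System.out.printf("capacity @ 50us:  %.0f msg/s%n", perThreadCapacity(50));
        System.out.printf("capacity @ 100us: %.0f msg/s%n", perThreadCapacity(100));
    }
}
```

Under these assumptions the average rate is roughly 11,574 events per second. A single thread spending 100 microseconds per message tops out at 10,000 messages per second, so the upper end of the budget would not keep up with even the average rate on one thread, and real traffic is likely burstier than the average.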

phaynes commented 8 years ago

sergiotudela - I have no specific preconceptions. Just something that works and I can sell to the security folk without too much effort.

phaynes commented 8 years ago

Some additional non-technical requirements:

  1. After dutifully impressing the technical executive of a major bank with Aeron's performance, the smiles continue when the CSO asks about the encryption approach used.
  2. A technical demonstration of Aeron is given at DEFCON and members of the conference are unable to decrypt network traffic (even if they can cause some disruption to operation).
  3. Reviewing the Aeron approach to encryption with cryptographers within the "Five Eyes" community, it is clear that they are irritated that they can't trivially gain access to PII data being transmitted.
phaynes commented 8 years ago

Todd suggests DTLS. As an option this has the advantage of being added to JDK 9. http://openjdk.java.net/jeps/219

tmontgomery commented 8 years ago

@phaynes does having the cleartext in shared memory pose an issue?

phaynes commented 8 years ago

This is going to depend on your security use case. We have multiple secure data centres in different jurisdictions, and our reading is that, for PII purposes, clear text in SHM is OK a) as long as the data is transitory and b) as long as it doesn't get persisted to the new persistent SHM (usual legal caveats apply). In gaming or high end intelligence use cases they would insist that the SHM be encrypted also.

tmontgomery commented 8 years ago

@phaynes what is the definition of "transitory"? Would lying in a logbuffer for an arbitrary time period violate that?

phaynes commented 8 years ago

A definition of transitory - this has been quite the subject of debate. When do you consider RAM the new disk and in what context do you define "arbitrary time period"? For example, our system includes a facility for guaranteed message delivery (GMD). In the normal case, messages containing clear text PII come in, they are stored in replica across multiple machines in SHM and then deleted - usually within a couple of milliseconds. Rightly or wrongly, this is being deemed transitory.

However, when services around the GMD facility become unavailable, PII data remains available in RAM, where it could theoretically be viewed by a rogue sys admin until external services recover. This is being viewed as OK, since it is a relatively rare event and the consequences of losing messages are worse than having clear text PII sitting in SHM. It just means additional operational controls are required should this transpire.

Now some of the PII data may be selected for further processing, and the TTL of this is up to 7 days. This is being considered as no longer transitory, since it falls outside of the normal expectation of how the data will be used and, by design, large quantities of PII data could be available for extended periods. So here we encrypt, hash or delete PII.

Thus our definition of "transitory" is a couple of ms in the normal case, longer in case of failure, but by design in normal operation up to 7 days - no.

Anyway, this is our interpretation; others, please feel free to agree or disagree.

tmontgomery commented 8 years ago

Data in a logbuffer isn't "cleared"/zeroed until more data pushes it out. So, if a system stops, the data is still accessible. This would seem to be marginal for your use case.

phaynes commented 8 years ago

Yes. Trivial amounts of data, in the situation where a system designed not to stop, stops.

tmontgomery commented 8 years ago

If that is OK, then, it would seem that an attack vector of the logbuffer contents is not an issue. So, something like DTLS might be an option for unicast. For multicast, DTLS has some extensions, but support is not apparent.

phaynes commented 8 years ago

Reviewing Australian security classifications, it is system availability that is key, rather than small amounts of data that would not even get a protected security rating. Multicast is a key feature and if DTLS is to be used, we would have to make the support apparent.

phaynes commented 8 years ago

Poking through the DTLS and TLS standards + a bunch of others such as the expired https://datatracker.ietf.org/doc/draft-keoh-dice-multicast-security/. It seems DTLS puts requirements onto the protocol, such as records having to fit within a single datagram, and assumes records may be re-ordered. With Aeron these assumptions don't have to hold, so perhaps something like a SecureFragmentAssembler could be implemented, with other messages for key exchange. Obviously something like this would not deliver optimal performance but would potentially be far simpler to implement.
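To make the app-level idea concrete, here is a minimal sketch of a codec that seals a whole message before it is offered and opens it after reassembly. It is not Aeron code: `AppLevelCodec` is a hypothetical name, `Consumer<byte[]>` merely stands in for a fragment handler, AES-GCM is an assumed cipher choice, and key exchange is out of scope (the key is simply passed in).

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.function.Consumer;

// Hypothetical app-level codec: seal before sending, open after reassembly.
final class AppLevelCodec {
    private static final int IV_BYTES = 12;  // 96-bit GCM nonce
    private static final int TAG_BITS = 128; // authentication tag length
    private final SecretKeySpec key;
    private final SecureRandom random = new SecureRandom();

    AppLevelCodec(byte[] rawKey) {
        this.key = new SecretKeySpec(rawKey, "AES");
    }

    /** Returns nonce || ciphertext || tag, ready to hand to the transport. */
    byte[] seal(byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            random.nextBytes(iv); // random nonce; see the note below on volume
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] sealed = cipher.doFinal(plaintext);
            return ByteBuffer.allocate(IV_BYTES + sealed.length).put(iv).put(sealed).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses seal(); throws if the message was modified or the key is wrong. */
    byte[] open(byte[] sealed) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, sealed, 0, IV_BYTES));
            return cipher.doFinal(sealed, IV_BYTES, sealed.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Wraps a plaintext handler so it can be fed sealed, fully reassembled messages. */
    Consumer<byte[]> decrypting(Consumer<byte[]> delegate) {
        return sealed -> delegate.accept(open(sealed));
    }
}
```

Each sealed message grows by 28 bytes (12-byte nonce plus 16-byte tag). Random 96-bit nonces are fine for a sketch, but at the message volumes discussed in this thread a counter-based nonce or frequent rekeying would likely be needed to stay within the usual GCM limits per key.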

tmontgomery commented 8 years ago

Doing the codec at the app level instead is actually easier. And doing it as a pipeline is probably much faster as well. I do have some concerns about replay attacks, though. And I was doing some research yesterday to see if it opened any obvious side channel attacks.

phaynes commented 8 years ago

I think it is going to depend on the scope of the work. At the protocol level you are going to be able to harden the system against Byzantine scenarios and the like - but if all you want to do is keep a secret, then is the app tier sufficient? I am pinging some others about consequences of encrypting at the protocol versus app level.

tmontgomery commented 8 years ago

At the app tier, there is no protection on the Aeron headers. Which contain offset, term Id, etc. So, some information is available. But the data is mostly secure. My biggest concern is an attack utilizing NAKs for replay and using that information. I haven't been able to definitively determine there is an attack that way, though. It's somewhat like having an encrypted file, you can look at any part of it over and over, but it doesn't mean you can determine anything about how it was encrypted by doing so (in a general sense).
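One common way to reason about replay and visible metadata at the app tier is to carry an application-level sequence number in the clear, authenticate it as AEAD associated data, and have the receiver reject anything at or below the last accepted sequence. The sketch below illustrates that general technique only; it does not touch Aeron's own headers or NAK handling, `SequencedCodec` and its frame layout are hypothetical, and it assumes a single sender per key so that sequence-derived nonces never repeat.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;

// Hypothetical frame layout: 8-byte cleartext sequence || ciphertext || 16-byte tag.
final class SequencedCodec {
    private static final int SEQ_BYTES = 8;
    private static final int TAG_BITS = 128;
    private final SecretKeySpec key;
    private long nextSeq = 0;       // sender-side state
    private long lastAccepted = -1; // receiver-side state

    SequencedCodec(byte[] rawKey) {
        this.key = new SecretKeySpec(rawKey, "AES");
    }

    /** 96-bit nonce: 4 zero bytes followed by the big-endian sequence number. */
    private static byte[] nonce(long seq) {
        return ByteBuffer.allocate(12).putInt(0).putLong(seq).array();
    }

    byte[] seal(byte[] plaintext) {
        try {
            long seq = nextSeq++;
            byte[] header = ByteBuffer.allocate(SEQ_BYTES).putLong(seq).array();
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, nonce(seq)));
            cipher.updateAAD(header); // header is readable but tamper-evident
            byte[] sealed = cipher.doFinal(plaintext);
            return ByteBuffer.allocate(SEQ_BYTES + sealed.length).put(header).put(sealed).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the plaintext, or throws if the frame is stale, replayed or modified. */
    byte[] open(byte[] frame) {
        if (frame.length < SEQ_BYTES + TAG_BITS / 8) {
            throw new IllegalStateException("frame too short");
        }
        long seq = ByteBuffer.wrap(frame).getLong();
        if (seq <= lastAccepted) {
            throw new IllegalStateException("replayed or stale sequence " + seq);
        }
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, nonce(seq)));
            cipher.updateAAD(frame, 0, SEQ_BYTES);
            byte[] plaintext = cipher.doFinal(frame, SEQ_BYTES, frame.length - SEQ_BYTES);
            lastAccepted = seq; // advance only after authentication succeeds
            return plaintext;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

An observer still sees the sequence number and the message length, which matches the point above that some information remains available. A retransmitted frame is bit-for-bit identical to the original, so seeing it again gives an observer no new ciphertext.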

phaynes commented 8 years ago

Obviously, if the header information is public it is easier to disrupt network operation, execute MITM attacks, and use message size and meta-information to determine the cypher (although this seems a long bow). That stated, having scanned the secure multicast comms literature, my concern is that going down the protocol level route opens up a bunch of new research problems. Further, my guess is that the primary use of Aeron will be within a secure datacenter and many of these attacks will probably not apply. Here application level encryption will be more than adequate for use cases like ensuring different trading desks don't spy on each other and the like. But maybe going down the protocol route is simpler than I guess.

phaynes commented 8 years ago

Also I did some numpty performance tests to ascertain approximate performance. For AES 128 on the JDK (which uses x86 instructions) I am getting ~0.25M encrypts per sec, and similar for decrypts, with 64 byte messages. For AES 256, depending on the parameters set, I saw approx 50K per second (or much worse). Although these numbers could certainly be optimised, it is already making me keen to separate secure and clear text traffic down different channels. If AES is to be used - I am already thinking of getting waivers just for AES 128.
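For anyone wanting to reproduce this kind of indicative number, a crude timing loop along the following lines would do. This is not the test that produced the figures above: the cipher mode (AES-GCM), the all-zero test key, and the class name `AesThroughputSketch` are all assumptions, and a proper harness such as JMH would give more trustworthy results.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;

// Crude, single-threaded throughput estimate; indicative only.
final class AesThroughputSketch {
    /** Encrypts `iterations` messages of `messageBytes` bytes and returns messages per second. */
    static double measure(int keyBits, int messageBytes, int iterations) {
        try {
            SecretKeySpec key = new SecretKeySpec(new byte[keyBits / 8], "AES"); // all-zero test key
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding"); // assumed mode
            byte[] message = new byte[messageBytes];
            byte[] out = new byte[messageBytes + 16]; // room for the 16-byte tag
            ByteBuffer nonce = ByteBuffer.allocate(12);
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                nonce.putLong(4, i); // distinct nonce for every message
                cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, nonce.array()));
                cipher.doFinal(message, 0, message.length, out, 0);
            }
            long elapsedNanos = System.nanoTime() - start;
            return iterations / (elapsedNanos / 1e9);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        measure(128, 64, 200_000); // warm-up so the JIT and any AES intrinsics kick in
        measure(256, 64, 200_000);
        System.out.printf("AES-128, 64 B: %.0f msg/s%n", measure(128, 64, 1_000_000));
        System.out.printf("AES-256, 64 B: %.0f msg/s%n", measure(256, 64, 1_000_000));
    }
}
```

Results will vary a great deal with JDK version, warm-up, and whether the JVM's AES intrinsics are enabled, so the output is best treated as a rough comparison between configurations on one machine. On older JDKs without the unlimited-strength policy, the 256-bit case may fail with an invalid key error.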

tmontgomery commented 8 years ago

@phaynes you might check out the Intel AES instructions https://software.intel.com/en-us/articles/intel-advanced-encryption-standard-aes-instructions-set directly instead of going through the JDK. An (en|de)cryption pipeline using logbuffers should be able to leverage them directly.

phaynes commented 8 years ago

So you think a JNI approach should be used for Aeron java? I will look at the instruction set directly - at this stage I am just getting indicative numbers to inform the design / approach.

tmontgomery commented 8 years ago

JNI, not really. I am thinking a pipeline Java API -> logbuffer -> encrypter (in C/C++) -> logbuffer -> driver. A single thread used for encrypting all buffers.
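The shape of that pipeline can be sketched in a few lines. The sketch below is in Java purely for illustration, whereas the suggestion above is a native encrypter; unbounded queues stand in for logbuffers, and `EncrypterStage`, the frame layout, and the AES-GCM choice are all hypothetical.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// One thread that drains a clear-text queue, encrypts, and feeds a sealed queue.
final class EncrypterStage implements Runnable {
    static final byte[] STOP = new byte[0]; // shutdown sentinel, compared by identity

    private final BlockingQueue<byte[]> clearIn;   // stand-in for the app-side logbuffer
    private final BlockingQueue<byte[]> sealedOut; // stand-in for the driver-side logbuffer
    private final SecretKeySpec key;

    EncrypterStage(BlockingQueue<byte[]> clearIn, BlockingQueue<byte[]> sealedOut, byte[] rawKey) {
        this.clearIn = clearIn;
        this.sealedOut = sealedOut;
        this.key = new SecretKeySpec(rawKey, "AES");
    }

    @Override
    public void run() {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            for (long seq = 0; ; seq++) {
                byte[] message = clearIn.take();
                if (message == STOP) {
                    sealedOut.put(STOP);
                    return;
                }
                // Nonce derived from a per-stage counter, so it never repeats for this key.
                byte[] nonce = ByteBuffer.allocate(12).putLong(4, seq).array();
                cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, nonce));
                byte[] sealed = cipher.doFinal(message);
                sealedOut.put(ByteBuffer.allocate(12 + sealed.length).put(nonce).put(sealed).array());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (GeneralSecurityException e) {
            sealedOut.offer(STOP); // unblock the consumer before failing
            throw new IllegalStateException(e);
        }
    }

    /** Pushes messages through one encrypter thread and returns the sealed frames in order. */
    static List<byte[]> runOnce(byte[] rawKey, List<byte[]> messages) {
        BlockingQueue<byte[]> in = new LinkedBlockingQueue<>();
        BlockingQueue<byte[]> out = new LinkedBlockingQueue<>();
        Thread encrypter = new Thread(new EncrypterStage(in, out, rawKey), "encrypter");
        encrypter.start();
        List<byte[]> frames = new ArrayList<>();
        try {
            for (byte[] message : messages) {
                in.put(message);
            }
            in.put(STOP);
            for (byte[] frame = out.take(); frame != STOP; frame = out.take()) {
                frames.add(frame);
            }
            encrypter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return frames;
    }
}
```

Because a single thread owns the key and the nonce counter, ordering is preserved and no locking around the cipher is needed. The cost is that this one thread becomes the throughput ceiling for every buffer it serves.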

phaynes commented 8 years ago

OK - after a bit of fiddling to get an SDC Linux VM that supports AES instructions (KVM doesn't work), I got some high level numbers from the Intel_AESNI_Sample_Library_v1.2 with much better results than Java. The following table shows performance numbers from test results, which no doubt could be improved further based on some of the optimisations mentioned in the paper above. I have simulated 64 byte and 1024 byte messages. Tests were run on a Xeon E5-2660 CPU. Both implementations would need tightening and cross-platform support.

[Attachments, not reproduced here: messages-per-sec, aes-results.txt]

tmontgomery commented 8 years ago

Those numbers look good. Respectable.

One of the advantages of an encrypter/decrypter that is native, taking in a logbuffer on one side and offering to another on the other side is that it isolates the platform specific piece into a very localized area. So, porting should be quite easy.

phaynes commented 8 years ago

I agree an encrypter/decrypter filter process has a number of advantages, not least of which is that the addition of a security module remains the responsibility of the operations team. Security updates can be performed using existing IT processes. An application level design would inevitably require application re-deployments, with a cycle time likely to fall outside the timings needed to correct a vulnerability in a hurry.

On the negative, I see two issues:

  1. A separate process leaves the system open to side channel attacks with data passing in clear text through a conveniently located place in SHM.
  2. Sociability. Deploying Aeron into our data centre we had to deal with multiple complaints of the integration chewing through CPU resources simply to meet latency concerns. This was almost a show stopper for Aeron. A separate process would exacerbate this.

Thus my thoughts for the morning were for a native encrypter / decrypter module that is dynamically loaded by the Aeron client.

Thoughts?

tmontgomery commented 8 years ago

It is a set of tradeoffs.

A separate process is only mildly less secure than another process in terms of the cleartext being in another logbuffer in shm as an attack vector.

A native module that could use a private memory-based logbuffer for encryption/decryption would be slightly more secure.

phaynes commented 8 years ago

Continuing to do initial research and started developing a module design that can be reviewed. It has occurred to me there are two main effort vectors - symmetric encryption and asymmetric multi-cast key exchange.

On the first front, we are starting work on a POC for AES 256 symmetric encryption. For a variety of reasons we are going down a cross platform JNI route with a separate C++ executable. In this way either a standalone or loadable module can be used. Once basic symmetric encryption works, (rightly or wrongly) my thought is that a goal could be a version of the library that works with an enterprise key store as a first increment.

RE: Public Key Exchange. There is obviously quite the literature here that I have been making my way through to better engage people who have built production key exchanges, security modules as well as cryptographers. I will detail thoughts in a design doc.

Any preference for design publication? Does this seem an acceptable way to tackle the problem?

tmontgomery commented 8 years ago

@phaynes sounds good to me. Logical Key Hierarchies (LKH) work fairly well for key distribution/management in large multicast fan out. Rekeying can be done with a single message also. Revocation is also possible, although, tricky to get right.

I've thought that the elegance of EC keying with LKH basics is an untapped way to dramatically improve key management in distributed apps.
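For readers unfamiliar with LKH, the bookkeeping behind it can be sketched without any cryptography. The code below assumes a complete binary key tree stored heap-style (root is node 1, the `n` leaves are nodes `n` to `2n-1`, with `n` a power of two); the class and method names are hypothetical. Each member holds the keys on the path from its leaf to the root, and evicting a member means replacing every key on that path and wrapping each new key under keys the remaining members already hold.

```java
import java.util.ArrayList;
import java.util.List;

// Index-only sketch of a Logical Key Hierarchy; no actual keys are generated.
final class LkhSketch {
    /** Node ids of the keys held by a member (0-based), leaf first, root last. */
    static List<Integer> keysHeldBy(int member, int n) {
        List<Integer> path = new ArrayList<>();
        for (int node = n + member; node >= 1; node /= 2) {
            path.add(node);
        }
        return path;
    }

    /**
     * Rekey plan for evicting a member. Each entry {newKeyNode, wrappingNode} means:
     * "send the new key for newKeyNode encrypted under wrappingNode's current key".
     */
    static List<int[]> rekeyPlan(int evicted, int n) {
        List<int[]> plan = new ArrayList<>();
        int evictedLeaf = n + evicted;
        int child = evictedLeaf;
        for (int node = child / 2; node >= 1; node /= 2) {
            int sibling = child ^ 1;
            // Members under the sibling subtree still hold the sibling's unchanged key.
            plan.add(new int[] {node, sibling});
            // Members under the on-path child use that child's freshly issued key,
            // except at the bottom, where the on-path child is the evicted leaf itself.
            if (child != evictedLeaf) {
                plan.add(new int[] {node, child});
            }
            child = node;
        }
        return plan;
    }

    public static void main(String[] args) {
        int n = 8;
        System.out.println("member 0 holds keys at nodes " + keysHeldBy(0, n));
        for (int[] step : rekeyPlan(0, n)) {
            System.out.println("new key for node " + step[0] + " wrapped under node " + step[1]);
        }
    }
}
```

With 8 members, evicting one yields five wrapped keys (2·log2(n) − 1), all of which can be bundled into one multicast rekey message, whereas a flat scheme would need one wrapped key per remaining member.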

magro commented 7 years ago

What's the current status here, do you plan to work on this within the next months?

mjpt777 commented 7 years ago

No immediate plans. It is in discussion and might happen towards the end of the year.

magro commented 7 years ago

Ok, thanks!

steveturner commented 6 years ago

My apologies that I'm coming completely out of nowhere on this, but the discussion on EC and LKH keying for multicast security with Aeron/Akka sounds particularly interesting. Has anyone worked on any prototypes in this area yet?

tmontgomery commented 6 years ago

We will be working on this right after the new year. The basic infrastructure, that is. i.e. encryption/decryption and a framework for various cyphers. Doing something with EC and LKH could follow on from that if there is interest.

luengnat commented 6 years ago

@tmontgomery anything planned out for this area?

tmontgomery commented 6 years ago

@luengnat still on the list to do as mentioned. Clustering took precedence, but it is stabilizing quite nicely.

QIvan commented 5 years ago

Hi! Is this task still in the todo list? Thanks.

tmontgomery commented 5 years ago

Indeed it is still on the list. And is due up pretty soon, we hope.

kKdH commented 4 years ago

Hey guys, is there any news on this?

mjpt777 commented 4 years ago

@kKdH We are just completing the work and it will be available from Aeron 1.30.0 as a premium feature. We are very happy with the performance and believe it sets a new standard in what is possible. Premium features are available on commercial terms to our support customers.