Closed creker closed 4 years ago
/cc @agl
agl's previously written about this: https://www.imperialviolet.org/2014/06/27/streamingencryption.html.
AEAD
Seal(dst, nonce, plaintext, additionalData []byte) []byte Seal(dst, nonce, plaintext, additionalData []byte) []byte Open(dst, nonce, ciphertext, additionalData []byte) ([]byte, error) Open(dst, nonce, ciphertext, additionalData []byte) ([]byte, error)
@balasanjay I don't think small propability of people misusing API is a justification to artificial limitations. There's always a chance of misuse with crypto. In fact, there's a much bigger chance with nonce provided by the developer, given that GCM is very strict about choosing it. The reasoning behing API design looks more like a personal opinion without clear justification (and I don't think there could be any). The blog post pretty much says that.
With current API you have no other choice but to ditch standard library and use external library to have access to something that should be there in the first place. That doesn't look good for standart library and Go 2 is a good oportunity to fix that. Everything else in the crypto library looks good and does everything you need it to do. AEAD
interface always stood out for me as being the "special one" for no reason.
And I don't propose to remove Seal
and Open
. They're handy and easy to use. Documentation can even say that programmer should use them in every possible case and resort to block mode of operation only if he really needs to.
Just to be clear, I wasn't actually saying anything myself, I was just pointing out that the actual decider (agl) has written about this matter previously, and included a sketch of a design with 16kb chunking and multiple implicit tags to avoid adding modern crypto APIs that return unauthenticated plaintext.
A scatter/gather interface (which allows one-shot operation, but where the expansion due to encryption can be written/read from a different buffer) is a reasonable extension to the AEAD interface that Go doesn't currently have. That allows, for example, in-place encryption while storing the expansion data (i.e. the tag in AES-GCM) into a separate buffer.
(For example, see this API.)
But a fully general interface is dangerous for the reasons referenced above and we don't want that. Code can always do it by avoiding the standard library, but it's ok if the standard library is constraining when you're going off the rails.
I understand the danger. I just don't see it to be enough to constraint the standard library in such a way that it's incompatible with other implementations. This is the only general crypto library where I've seen this done without any alternative.
Look at things like MD5 and DES. Going by the same logic, they shouldn't be in the library. They're not only dangerous but completely broken at this point. But there're implemented and have a comment saying that you shouldn't use them. And I don't see the problem here. Even though there're broken, it's still necessary to provide them for compatibility with other software. The same reasoning applies here - AEAD could be misused but you can always mention that in the documentation. Keep the existing interface for safe and easy tasks and provide another for users who know what they're doing.
As for scatter/gather, that's not the point of my proposal but it's at least something if fully general interface is rejected.
@creker
I understand the danger. I just don't see it to be enough to constraint the standard library in such a way that it's incompatible with other implementations. This is the only general crypto library where I've seen this done without any alternative.
If you look for example at the AWS SDKs you can see that they have the same problem described in the issue. Their SDKs let the user process data before the auth. tag is verified. In fact the only SDK which avoids this pitfall most users are not aware of is the go-sdk. (as far as I know)
Look at things like MD5 and DES. Going by the same logic, they shouldn't be in the library. They're not only dangerous but completely broken at this point.
These algorithms are there for legacy and backward-compatibility reasons. Removing them would be a breaking change. Actually they should have been implemented in x/crypto
- not in the standard library. Roughly speaking: "One mistake does not justify another".
Anyway as mention in Adam's blog post it would only be a valid approach for very few use-cases. IMO it's not the responsibility of the standard library to provide a solution for any use case but it should focus on the common and recommended cases. If you're trying to use an AEAD in streaming mode you should consider an approach like sio
@aead
In fact the only SDK which avoids this pitfall most users are not aware of is the go-sdk. (as far as I know)
I see that and don't view it as an advantage. It's not a kids playground, it's a crypto library. I don't think one should constraint API to such an extent just so that some programmers that don't understand how authenticated ciphers work (like, really don't understand. That's basic stuff) do not make a mistake. If that's the goal here then you really shouldn't be able to choose nonce yourself. That's even more subtle and there're an actual recommendation from NIST for that, unlike the issue here.
One mistake does not justify another
As I said in my post, I don't see it as a mistake. There's a reason for it - compatibility. Being able to use it to communicate with other software. Maybe not exactly like it but the issue is similar to that. Like, you have already implemented protocol that uses GCM for perfectly secure encryption of large payloads but suddenly Go says "no, you're not allowed to do that because somebody somewhere might make a dumb mistake. Go use external C library and mess your project by introducing CGO into it".
the common and recommended cases.
File encryption is common case for AES GCM and I don't see anywhere where GCM is explicitly recommended to be used for small payloads. Again, that looks like a personal opinion to me that for some reason got into general crypto library. There was a mention of BoringSSL - that's exactly where that kind of constraint should be. In application specific internal library and BoringSSL is exactly that.
I see that and don't view it as an advantage. It's not a kids playground, it's a crypto library. I don't think one should constraint API to such an extent just so that some programmers that don't understand how authenticated ciphers work (like, really don't understand. That's basic stuff) do not make a mistake.
About the AWS example: Users use the SDK API without choosing a nonce - the SDK does that (and users may also not be aware of the underlying cipher at all). It's really a high level API. Nevertheless the "bad" practice of returning unauthenticated data shines through the API and may hurt users which just use a high-level API and are not familiar with cryptography at all.
If that's the goal here then you really shouldn't be able to choose nonce yourself. That's even more subtle and there're an actual recommendation from NIST for that, unlike the issue here.
That's a conceptual requirement. If the nonce would be chosen randomly by e.g. rand.Reader
you would not be able to decrypt. I see that there is the possibility of misuse but I don't see a proper solution for this.
There's a reason for it - compatibility.
Well that's a tradeoff. API flexibility vs. misuse resistance. Since I see only very few legitimate uses cases of an AEAD streaming approach I think misuse resistance outweigh API flexibility in this case.
I don't think it's an opinion - it's a design choice.
File encryption is common case for AES GCM and I don't see anywhere where GCM is explicitly recommended to be used for small payloads
As I tried to show with the AWS example it's not that simple. AWS CSE is basically file encryption (objects are just blobs - like files) - and they implemented it in - at least - a dangerous way. As recommended before you can split files into smaller chunks and encrypted them separately instead of the whole file. Further there's the 64 GB bound for AES-GCM - what if files are larger than 64 GB?
Again, I don't see why it's Go's responsibility to support every use case - especially if support exposes an API which leads to bad practice - even tough there are a few legitimate cases.
Anyway I cannot make any decision about this - That's @agl 's responsibility.
/cc @FiloSottile
As a standard library for developers, not cryptographers, I think we have a clear and well-justified (if not perfectly implemented) policy of preventing mistakes that have been commonplace in other ecosystems. I don't think that's up for discussion.
However of course compatibility involves a constant tradeoff, so I'd like to understand how common these use cases are. You mentioned some protocols securely using GCM over large payloads, can you make the example(s) more specific?
About scatter/gather, not opposed, but I'm not sure I see the application well. Most times in Go I see buffers with capacity including Overhead being reused/pooled, and sliced down when not holding ciphertext.
Sorry for leaving this for so long, but based on the discussion above this sounds like a likely decline.
Leaving open for a week for final comments.
No change in consensus, so declined.
Currently
AEAD
interface provides only two functionsThat works for network protocols where messages are usually small. That may work for small files but that doesn't work for larger payloads. GCM provides enough security to use its tag to authenticate large payloads that you wouldn't want to (or couldn't, even) store in memory in its entirety. With the current implementation there's no other way.
I propose to change existing
AEAD
interface or add another one to give the ability to encrypt/decrypt payload block by block. When finished it would produce authentication tag that you may use as you see fit. The tag could be stored at the end of the cipher text implicitly but I don't really like the idea of implementation forcing me to something that isn't required.All implementations that I'm familiar with and has used myself divide into three steps:
The main difference is in finalization:
All of these implementations fit the proposal. My only preference is that the tag should be separate from the payload. The implementation shouldn't write or read it implicitly. Programmer should have the access and freedom to use it as needs to. OpenSSL and mbedtls both implement it that way.