defuse / php-encryption

Simple Encryption in PHP.
MIT License

Write a blog post about our API and the challenges we faced #203

defuse closed this issue 7 years ago

defuse commented 8 years ago

Maybe we should write a paper, similar to the NaCl paper, about our API. We found a way to make file encryption usable, so I think it's worth it. It would document our design choices and perhaps make a good argument for libsodium to support the same API for encrypting/decrypting files.

paragonie-scott commented 8 years ago

That's basically what I was thinking about working towards when I opened #158. :+1:

defuse commented 8 years ago

Typing up a publishable-quality LaTeX paper is too much work for me right now so I'll just blog about it on bqp.io.

defuse commented 8 years ago

Outline:

defuse commented 8 years ago

What's the required API for the CAESAR competition? Does it have that position/purpose parameter that I want?

defuse commented 8 years ago

Here are the requirements for CAESAR: "An authenticated cipher is a function with five byte-string inputs and one byte-string output. The five inputs are a variable-length plaintext, variable-length associated data, a fixed-length secret message number, a fixed-length public message number, and a fixed-length key. The output is a variable-length ciphertext." I suspect I'm re-inventing something that's already been widely discussed here.

defuse commented 8 years ago

Nope, I'm not re-inventing something. CAESAR ciphers only have to have one of the two (public or secret) message numbers. In fact NORX only has one nonce parameter, which does conflate the two different uses. :(

defuse commented 8 years ago

Given the CAESAR requirements on the message number parameters, one could fill the first half of NORX's nonce with random bytes and use the second half as the purpose/position, but the nonce isn't long enough for that.

defuse commented 8 years ago

The ultimate symmetric encryption API:

On input (appseed[arbitrary], key[32], messagenumber[16], plaintext[arbitrary], ad[arbitrary]):
    salt := random(32);
    (akey[32], ekey[32]) := BLAKE2b(
        key=key,
        salt=salt,
        personalbytes=messagenumber,
        message=appseed,
        output_len=64
    );
    ciphertext := SomeFast256BitAEADWithoutInternalKDF(
        authkey=akey,
        enckey=ekey,
        ad=salt || ad,
        message=plaintext,
        nonce=zero
    );
    return salt || ciphertext.

Each application/protocol has its own app seed, which is unique to it and provides domain separation; it's arbitrary-length so you can have as many levels of namespacing as you'd like. The key is a secret for this session. The messagenumber parameter can be used to "position" a message within a protocol or give it a "purpose" (within this session, within this app seed). The plaintext parameter is the message to be encrypted, and the ad parameter is associated data that is authenticated but not encrypted.

defuse commented 8 years ago

I implemented something like the above using ChaCha20 for encryption and BLAKE2 as a MAC with libsodium: https://github.com/defuse/dfcrypt/blob/master/dfcrypt.c#L35
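
For concreteness, here's a minimal, untested sketch of that construction in C with libsodium. It is not dfcrypt's actual code: the function and parameter names are mine, and because BLAKE2b's built-in salt field is only 16 bytes, this version hashes the 32-byte salt as part of the message rather than putting it in the salt slot.

    #include <sodium.h>
    #include <string.h>

    #define SALT_LEN   32
    #define KEY_LEN    32
    #define MAC_LEN    32
    #define MSGNUM_LEN crypto_generichash_blake2b_PERSONALBYTES /* 16 */

    /* out must hold SALT_LEN + plaintext_len + MAC_LEN bytes and must not
       overlap plaintext. Returns 0 on success. */
    int df_encrypt(unsigned char *out,
                   const unsigned char *appseed, size_t appseed_len,
                   const unsigned char key[KEY_LEN],
                   const unsigned char messagenumber[MSGNUM_LEN],
                   const unsigned char *plaintext, size_t plaintext_len,
                   const unsigned char *ad, size_t ad_len)
    {
        unsigned char salt[SALT_LEN];
        unsigned char subkeys[64]; /* akey (MAC) || ekey (encryption) */
        static const unsigned char zero_salt[crypto_generichash_blake2b_SALTBYTES];
        static const unsigned char zero_nonce[crypto_stream_chacha20_NONCEBYTES];
        crypto_generichash_blake2b_state kdf;
        crypto_generichash_state mac;
        unsigned char *ct = out + SALT_LEN;

        if (sodium_init() < 0) {
            return -1;
        }
        randombytes_buf(salt, sizeof salt);

        /* (akey || ekey) := BLAKE2b(key=key, personal=messagenumber,
           message=salt || appseed) */
        crypto_generichash_blake2b_init_salt_personal(&kdf, key, KEY_LEN,
                                                      sizeof subkeys,
                                                      zero_salt, messagenumber);
        crypto_generichash_blake2b_update(&kdf, salt, sizeof salt);
        crypto_generichash_blake2b_update(&kdf, appseed, appseed_len);
        crypto_generichash_blake2b_final(&kdf, subkeys, sizeof subkeys);

        /* ChaCha20 with an all-zero nonce is safe here because ekey is unique
           per message (it depends on the fresh random salt). */
        crypto_stream_chacha20_xor(ct, plaintext, plaintext_len,
                                   zero_nonce, subkeys + 32);

        /* Encrypt-then-MAC: keyed BLAKE2b over salt || ad || ciphertext. A real
           implementation should also bind the lengths of ad and ciphertext. */
        crypto_generichash_init(&mac, subkeys, 32, MAC_LEN);
        crypto_generichash_update(&mac, salt, sizeof salt);
        crypto_generichash_update(&mac, ad, ad_len);
        crypto_generichash_update(&mac, ct, plaintext_len);
        crypto_generichash_final(&mac, ct + plaintext_len, MAC_LEN);

        memcpy(out, salt, SALT_LEN);
        return 0;
    }

Decryption would recompute the subkeys from the received salt, verify the MAC with sodium_memcmp (constant-time), and only then decrypt.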

defuse commented 8 years ago

This should be two separate blog posts:

  1. What's the best API for symmetric encryption and decryption?
  2. Challenges implementing streaming file decryption.

defuse commented 7 years ago

I had some thoughts about formally modeling API security usability. I'm posting them here because the ideas aren't finished enough for a blog post of their own, and they're relevant to this issue.

Let Σ be an alphabet and let P ⊆ Σ* be a language of programs. Let c : P → ℕ be some measure of program complexity (e.g. lines of code, characters, etc.). Let A ⊆ P be an "application", i.e. a subset of programs that would pass some basic acceptance criteria for implementing some human-meaningful program (e.g. unit tests, a human's opinion, etc.). The programs in A may be incorrect in subtle ways and contain security problems. Now let AS ⊆ A be the even smaller subset of programs that are actually secure (but may be incorrect in benign ways).

Definition: A programming language P has strong secure usability for an application A ⊆ P if for all a∈A, either a ∈ AS or c(a) >= c(b) for all b ∈ AS. Informally, insecure programs that pass the basic acceptance criteria are all more complicated than secure programs.
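
To pin the quantifiers down, here is the same definition transcribed into Lean 4 (this formalization and its names are mine; sets are modeled as predicates on P to keep it import-free):

    -- P: programs, c: complexity measure, A: passes the acceptance criteria,
    -- AS: actually secure (AS ⊆ A is assumed, not encoded).
    def StrongSecureUsability {P : Type} (c : P → Nat)
        (A AS : P → Prop) : Prop :=
      ∀ a, A a → (AS a ∨ (∀ b, AS b → c b ≤ c a))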

I bet "strong secure usability" is unattainable (try proving this!) for most (P, A) pairs so we need a weaker definition that replaces "for all b" with some kind of "for most b." I'm not sure of the best way to do that.

The complexity function c should be easily computable from the program (just count the lines or number of characters or whatever) and testing membership in A can also be automated. What we can't do is test for membership in AS (it's NP-hard/undecidable, even if we could define what AS is!). So how is this definition useful?

When we find a vulnerability in a program a ∈ A, we patch it and get a new program a' ∈ A (hopefully the acceptance criteria are improved to catch such vulnerabilities in the future, but we're ignoring that here). We know that a ∉ AS, and by relying on a' in the future we are acting as though we think a' ∈ AS. This gives us some evidence about (P, A)'s secure usability: if c(a') > c(a), then the program had to be made more complicated in order to be fixed, meaning the insecure a was simpler than the secure a', so this is a counterexample against (P, A)'s strong secure usability.

We can actually do some empirical science here: take some known vulnerabilities and find out whether the patches increased or decreased the code's complexity. In the case of C, having to add manual bounds checking indicates poor secure usability. So does having to add string escaping for non-prepared-statement SQL queries. The "goto fail" bug is interesting because the patch is to remove a line of code, which according to these definitions means it's not caused by a usability problem in the underlying language (or that code size is a bad complexity measure).
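
For reference, here is a self-contained analogue of the "goto fail" control flow (not Apple's actual code, which was in SecureTransport's signature verification; this just reproduces the shape of the bug):

    #include <stdio.h>

    /* The duplicated goto is always taken, so the second check is silently
       skipped and err is still 0 from the first check. The patch deletes
       one line. */
    static int verify(int check1, int check2)
    {
        int err = 0;
        if ((err = check1) != 0)
            goto fail;
            goto fail; /* the extra line */
        if ((err = check2) != 0)
            goto fail;
    fail:
        return err;
    }

    int main(void)
    {
        /* check2 "fails" (nonzero), but verify() still returns 0. */
        printf("%d\n", verify(0, 1)); /* prints 0 */
        return 0;
    }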

At some point I should come back and tinker with these definitions. The definition of strong secure usability doesn't actually make use of P, and we should somehow take into account A shrinking because of better testing. To talk about API usability we have to lump everything (including the programming language) into P. It would be nice to separate them somehow so that we can formalize statements like "prepared statements have better secure usability than string escaping."

defuse commented 7 years ago

My intuition for the complexity measure is that it should correlate well with the time the developer spends on the code: it starts at 0 for an empty file and increases as the developer works. I want "vulnerable programs have more complexity" to be a good proxy for "vulnerable programs take more time to write."

The "goto fail" bug hints that we might want to capture something else, too: secure programs should be far (in the hamming distance sense) from vulnerable programs. The extra line could have been added with one key stroke (e.g. accidentally pressing p twice in vim), so in that sense "goto fail" points at poor usability in C. So here empirically we can look at vulnerability patches and check how many of them are very small, the smaller they are counting more against the usability of the language.

A better definition should take both of these into account.

blochberger commented 7 years ago

You could use branching as a measure of complexity. For example, a single statement is not complex, but if you add a condition, then the execution of the program depends on the condition and can take either one branch or the other. Of course, a definition of "branch" would then have to be found.

Additionally, a program using an API that requires the user to manually check inputs before passing them to a function counts as more complex than a program where the input checking is hidden behind the function call: in the first case, additional branching is performed at every place in the source code that calls a function from that API.
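
A hypothetical C illustration of that point (both APIs and all names here are invented):

    #include <stddef.h>

    /* Variant 1: the API trusts the caller; reading buf[0] is undefined
       behavior if buf is NULL or len is 0. */
    int parse_unchecked(const char *buf, size_t len)
    {
        (void)len;
        return buf[0]; /* no internal check */
    }

    int handle_v1(const char *buf, size_t len)
    {
        if (buf == NULL || len == 0) { /* repeated at every call site;
                                          forgetting it is a vulnerability */
            return -1;
        }
        return parse_unchecked(buf, len);
    }

    /* Variant 2: the check lives inside the API, written and audited once. */
    int parse_checked(const char *buf, size_t len)
    {
        if (buf == NULL || len == 0) {
            return -1;
        }
        return buf[0];
    }

    int handle_v2(const char *buf, size_t len)
    {
        return parse_checked(buf, len); /* no extra branching at call sites */
    }

Under the branching measure, every caller of the first API pays the extra branch, so programs built on it are systematically more complex than those built on the second.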

defuse commented 7 years ago

Yeah! I like that!

defuse commented 7 years ago

Here is an antipattern I try to avoid in this library: employing complex reasoning to "wontfix" a potential security bug. Instead, just change the code so that it's obvious there's no problem, unless the added code would be more complicated than the reasoning. It saves security auditors time :)
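
A hypothetical example of what that looks like in practice (the names are invented):

    #include <stddef.h>

    #define BUF_LEN 128

    /* The "wontfix" route would argue: "idx comes from a header field that an
       earlier parsing step already caps below BUF_LEN, so buf[idx] is always
       in bounds." That may be true today and silently false after a refactor.
       Adding one cheap check makes the code obviously safe instead: */
    int read_field(const unsigned char buf[BUF_LEN], size_t idx,
                   unsigned char *out)
    {
        if (idx >= BUF_LEN) { /* obvious to an auditor; no reasoning needed */
            return -1;
        }
        *out = buf[idx];
        return 0;
    }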

defuse commented 7 years ago

I think a conference talk is the right format for this, maybe with an accompanying document to go with the slides. Mathematically modelling usability is neat, but it doesn't really lead to practical advice for developers: the intuitive principle that vulnerable solutions should be more complicated than secure solutions, along with some practical instruction on how to make that true, is all that's needed. So, I'm going to close this issue.