mikejsavage commented 7 years ago

I'm not done yet, but here are my notes so far. I hope this is readable!

intro.html

Should explicitly mention that dictionary coders (i.e. every mainstream compression library) leaks info about the plaintext
Not related to intro.html but perhaps there should be a validation layer that checks things are aligned how they should be/things aren't overlapping when they shouldn't be/etc.

crypto_aead_lock.html crypto_aead_unlock.html crypto_lock.html crypto_unlock.html

Is it ok to use a counter for the nonce?
"Make sure message lengths do not leak secret information." - Again be explicit about dictionary coders
"That length is not authenticated." - I don't fully understand the problem or the solution here. You have to put the length in the authenticated data or it's not secure? Why doesn't monocypher do that for you? It's only 8 bytes.
In the in place encryption example the cipher_text variable is unused, except for the last comment where it is incorrect.
Do you need to call crypto_wipe even when crypto_unlock fails? Can it partially decrypt your data?

crypto_argon2i.html

"It can be used to encrypt private keys or password databases." - "it" is the resulting key?
"malloc() should yield a suitable chunk of memory." - malloc will (not should) return 8 byte aligned addresses on any normal platform. Might be worth explicitly mentioning that.
"The version provided by Monocypher has no threading support, so the degree of parallelism is limited to 1. This is considered good enough for most purposes." - I don't know why you mention this
"This most likely has no practical application but is exposed for the sake of completeness." - TBH I would not expose it in that case
"Use crypto_verify16(3monocypher), crypto_verify32(3monocypher) or crypto_verify64(3monocypher) to compare password hashes to prevent timing attacks, which are feasible against the weakest passwords." - maybe drop the bit after the comma

"The hardness of the computation can be chosen thus:" and the two bullet points are oddly phrased. How about:

To select the nb_blocks and nb_iterations parameters, you should first
decide how long the computation should take. For user authentication,
somewhere between half a second (convenient) and several seconds
(paranoid) is a good starting point, and you should use as much memory
as you can spare.

Since parameter selection depends on your hardware, you will need to
use some trial and error to find the ideal settings. Three iterations
and 100k blocks (one hundred megabytes of memory) is a good starting
point. Adjust nb_blocks first. If using all available memory is not
slow enough, increase nb_iterations.

crypto_blake2b.html crypto_blake2b_final.html crypto_blake2b_general.html crypto_blake2b_general_init.html crypto_blake2b_init.html crypto_blake2b_update.html

"It can authenticate messages with a naive approach." - Naive approach = just calling the functions? Is it still a naive approach if it works and is the intended usage?
"The output hash. Must have at least hash_size free bytes." - You don't normally say "Must have at least x free bytes".
"Use crypto_verify16(3monocypher), crypto_verify32(3monocypher), or crypto_verify64(3monocypher) to compare MACs created this way." - is it ok to use memcmp if you didn't use a key?
"Message to hash. May overlap with the hash argument where present." - What do you mean by "where present"? The hash and message are always present
"the length of the message in bytes." - should start with a capital T for consistency (I went back and looked at the argon2i page, you stop using capitals half down there too. Presumably this comes up elsewhere too)
"The direct interface has 2 functions" - I think it's considered good style to write two etc for small numbers. Doesn't really matter though

"crypto_blake2b(), provided for convenience (calling it is the same as calling crypto_blake2b_general() with a null key and a 64-byte hash)." - how about:

The direct interface has two functions, crypto_blake2b() and
crypto_blake2b_general(). crypto_blake2b() is provided for convenience,
and is identical to calling crypto_blake2b_general() with no key and a
64 bit hash.

crypto_chacha20_H.html

"The input in fills the space reserved for the nonce and counter. It does not have to be random." - huh?
"will be random iff the above holds" - capitalisation inconsistency again. Rather than "iff the above holds", you could say "iff key has enough entropy". It's not much longer
Should this function actually be exposed?

crypto_chacha20_encrypt.html crypto_chacha20_init.html crypto_chacha20_set_ctr.html crypto_chacha20_stream.html crypto_chacha20_x_init.html

TODO

crypto_check.html crypto_sign.html crypto_sign_public_key.html

No block describing the arguments!
TODO

crypto_check_final.html crypto_check_init.html crypto_check_update.html crypto_sign_final.html crypto_sign_init_first_pass.html crypto_sign_init_second_pass.html crypto_sign_update.html

TODO

crypto_key_exchange.html crypto_x25519.html crypto_x25519_public_key.html

I don't understand what crypto_x25519 is for. Is it an optimisation if you want to generate multiple keys? The old manual was more forceful about telling people to not use it unless they understand it.
I think the randomised key exchange part needs to be explained better. Why crypto_lock instead of crypto_sign?
There should probably be a function like "crypto_is_dangerous_key". If you're randomly generating keys for each session it would be nice to be able to check the key you're sending isn't going to be rejected for being malicious.

crypto_lock_auth.html crypto_lock_encrypt.html crypto_lock_final.html crypto_lock_init.html crypto_lock_update.html crypto_unlock_final.html crypto_unlock_update.html

TODO

crypto_memcmp.html crypto_zerocmp.html

Remove these functions? Or update them so they do run in constant time

crypto_poly1305.html crypto_poly1305_final.html crypto_poly1305_init.html crypto_poly1305_update.html

No block describing the arguments!
Not done reading this one yet

crypto_verify16.html crypto_verify32.html crypto_verify64.html

"Standard comparison functions tend to exit as soo as they find a difference" - "as soo" -> "as soon"

how about:

You will often need to compare secrets, or values derived from secrets.
Standard comparison functions (like memcmp) tend to exit when they find
the first difference, leaking information through timing differences.

"provide comparison functions whose timing is independent from the content of their input." - maybe: "provide comparison functions that run in constant time"

crypto_wipe.html

"Secrets and values derived from them should stay in memory for the shortest possible amount of time..." - how about:

crypto_wipe() securely erases sensitive data in memory.

Sensitive data (such as cryptographic keys or secret plaintexts) should
be erased from memory as early as possible, to minimise the window in
which it can be leaked.

Size argument should maybe be called secret_size.

ghost commented 7 years ago

Loup, forget about an "imminent" release. We'll be here for a while and for good reason.

First of all, thank you very much for going through this. You spotted a lot of things that went flew over everybody else's heads.

About dictionary coders in general, does that leak still exist if the compressed data including decompressed length is encrypted?

Note I won't be mentioning dictionary coder stuff from here on out anymore.

Please put the TODO items in a new issue comment once we've sorted these out or it'll be completely impossible to figure out what has and what hasn't been done.

intro

Not related to intro.html but perhaps there should be a validation layer that checks things are aligned how they should be/things aren't overlapping when they shouldn't be/etc.

Alignment: Is there any standardized way to check for that? In other words, is there a way to check alignment that is not implementation dependent (such as __attribute__)?

Overlapping: Probably possible to check, but I'm not sure about the performance hit if you process many small messages. See https://blogs.msdn.microsoft.com/oldnewthing/20170927-00/?p=97095 for some notes about pointer arithmetic and range checks. This may or may not break pretty bad on some embedded platforms.

I'd love @secworks to chip in on this part in particular.

crypto_lock

Is it ok to use a counter for the nonce?

Should be answered in the argument documentation for nonce: "A 24-byte number, used only once with any given session key. It does not need to be secret or random, but it does have to be unique." A counter is a 24-byte number that is used only once with any given session key that is unique.

"That length is not authenticated." - I don't fully understand the problem or the solution here. You have to put the length in the authenticated data or it's not secure? Why doesn't monocypher do that for you? It's only 8 bytes.

If you need a length field in the plaintext, you need to put it into the authenticated data. Otherwise, an attacker could modify the length field and cause a buffer overflow because your code worked with a tampered length field.

Consider a protocol where all messages have a fixed length or where you have a known good trailing character (after you check the MAC, obviously, else there's no known good). You may not particularly care about the length, especially not eight bytes of length for many small messages where you could've just used one or two bytes, too. YMMV, though. Loup may likely disagree with me here, even.

In the in place encryption example the cipher_text variable is unused, except for the last comment where it is incorrect.

+1

Do you need to call crypto_wipe even when crypto_unlock fails? Can it partially decrypt your data?

The model for crypto_unlock is this:

crypto_unlock()
Check return value. a. if the return value is non-zero (-1, actually), the message MAC mismatched and the message should never be touched by code. crypto_unlock has not even started decrypting because it checks the MAC first. You still need to wipe your key, but there will be no plaintext to wipe. b. if the return value is zero, the message MAC checks out and the entire message has been decrypted. You can trust the decrypted data.

This is probably a bit unclear in the current text; fixes welcome.

crypto_argon2i

"It can be used to encrypt private keys or password databases." - "it" is the resulting key?

"It" is the argon2i function. That is, you can use it to derive a key from a passphrase for private keys in asymmetric crypto, think GPG passphrases, or for password storage. This part is not clear enough, I agree.

"malloc() should yield a suitable chunk of memory." - malloc will (not should) return 8 byte aligned addresses on any normal platform. Might be worth explicitly mentioning that.

glibc docs: "The address of a block returned by malloc or realloc in GNU systems is always a multiple of eight (or sixteen on 64-bit systems)" (emphasis mine)

OpenBSD malloc(3): "The allocated space is suitably aligned (after possible pointer coercion) for storage of any type of object."

Assuming any particular alignment from malloc is questionable. Assuming meaningful alignment makes sense only from a practical purpose (the devs of the OS have to eat their own dog food). The C standard makes no guarantees about alignment or suitability of a block of memory returned by malloc. The wording in this case is about as cautious as I'd expect.

"The version provided by Monocypher has no threading support, so the degree of parallelism is limited to 1. This is considered good enough for most purposes." - I don't know why you mention this

Readers may already be familiar with argon2i and thus expect to be able to tune the parallelism parameter. This needs to be clarified here because it's confusing when you expect the parallelism parameter to be there but Monocypher doesn't actually allow for that. Additionally, it's important to know what the value is set to for interoperability. Interoperability is a particular concern for particular password storage.

"This most likely has no practical application but is exposed for the sake of completeness." - TBH I would not expose it in that case

See https://github.com/P-H-C/phc-winner-argon2/issues/151 -- There are probably applications but they're fairly niche? I'm on the fence with this one. If someone comes up with a really clever application for it, I don't want to people to be in the situation of having to forgo it, on the other hand it'd simplify invocation by a lot.

"Use crypto_verify16(3monocypher), crypto_verify32(3monocypher) or crypto_verify64(3monocypher) to compare password hashes to prevent timing attacks, which are feasible against the weakest passwords." - maybe drop the bit after the comma

+1, the comparison speed really shouldn't matter that much.

"The hardness of the computation can be chosen thus:" and the two bullet points are oddly phrased. How about: [...]

The sentences are fairly run-on in the first paragraph. I also adjusted "100k" in the second paragraph to be 100000 again for copy-pasteability and being less surprising to non-native English speakers.

To select the nb_blocks and nb_iterations parameters, it should first be
decided how long the computation should take.
For user authentication, somewhere between half a second (convenient)
and several seconds (paranoid) is a good starting point.
The computation should use as much memory as can be spared.

Since parameter selection depends on your hardware, some trial and error
will be required in order to determine the ideal settings.
Three iterations and 100000 blocks (that is, one hundred megabytes of
memory) is a good starting point.
Adjust
.Fa nb_blocks
first.
If using all available memory is not slow enough, increase
.Fa nb_iterations .

crypto_blake2b

"It can authenticate messages with a naive approach." - Naive approach = just calling the functions? Is it still a naive approach if it works and is the intended usage?

Indeed. The wording may be a bit questionable. Background: SHA-2 and other common hashing functions require a special construction called HMAC to use them as message authentication functions. However, BLAKE2b does not require that. Just calling the function with a suitably long key parameter turns it into a valid message authentication function.

"The output hash. Must have at least hash_size free bytes." - You don't normally say "Must have at least x free bytes".

* Must be a buffer of at least hash_size bytes?

"Use crypto_verify16(3monocypher), crypto_verify32(3monocypher), or crypto_verify64(3monocypher) to compare MACs created this way." - is it ok to use memcmp if you didn't use a key?

If you didn't use a key, it's not a MAC. You can use memcmp in this case. May need clarification.

"Message to hash. May overlap with the hash argument where present." - What do you mean by "where present"? The hash and message are always present

s/ where present//

Should probably also be double checked that it's "overlap", not just "point at the same address".

"the length of the message in bytes." - should start with a capital T for consistency (I went back and looked at the argon2i page, you stop using capitals half down there too. Presumably this comes up elsewhere too)

+1, good catch

"The direct interface has 2 functions" - I think it's considered good style to write two etc for small numbers. Doesn't really matter though

I just skimmed over mdoc(7) and Practical UNIX Manuals. I thought I'd read something about always writing digits, but I guess I was wrong.

Should be corrected to "two". "Small numbers" is generally one through twelve, inclusive. Should probably be ignored where we're talking about return values.

"crypto_blake2b(), provided for convenience (calling it is the same as calling crypto_blake2b_general() with a null key and a 64-byte hash)." - how about: [...]

Suggestion works for me.

crypto_chacha20_H

"The input in fills the space reserved for the nonce and counter. It does not have to be random." - huh?

This only makes sense if you're familiar with Chacha20 itself. Most likely needs rewording to clarify it's about the Chacha20 block setup.

"will be random iff the above holds" - capitalisation inconsistency again. Rather than "iff the above holds", you could say "iff key has enough entropy". It's not much longer

(Context: It's in the example)

+1 for the capitalization.

It's not much longer, but it'll be long enough that it'll be longer than 80 characters when displayed on a terminal, so you'd need to have a multi-line comment.

Should this function actually be exposed?

Design decision; deferring. IMO no. Not sure why it's there.

crypto_sign, crypto_poly1305

No block describing the arguments!

You're very right about that. Not sure if it really needs one. We'll probably be better off revisiting these once you're past the TODO.

crypto_key_exchange

I don't understand what crypto_x25519 is for. Is it an optimisation if you want to generate multiple keys? The old manual was more forceful about telling people to not use it unless they understand it.

It's the raw underlying primitive. crypto_key_exchange is what you get when you crypto_x25519 and then crypto_chacha20_H the result from crypto_x25519. We should probably tell people not to touch the thing in the normal case.

I think the randomised key exchange part needs to be explained better. Why crypto_lock instead of crypto_sign?

Because crypto_sign uses asymmetric cryptography. In asymmetric cryptography, you have a secret key (aka private key) and a public key. You publish the public key to everyone who should be able to verify your messages. But you keep the secret key a secret so that nobody but you can sign messages. Think of it like this: A government signs its laws with asymmetric cryptography. You want everybody to verify the laws being untampered, so everybody gets the public key. You want nobody but the government to sign the laws, so only the government gets the secret key. The crypto_key_exchange family of functions does the opposite of that: The result of a key exchange is a single, identical key between all participants in the key exchange.

crypto_lock uses symmetric cryptograhy. In symmetric cryptography, you have one secret key (ignoring details about authentication here). You do not want anybody to have the keys other than the communication participants, but the participants all need the same key to communicate.

Confusingly, you need an asymmetric key pair for crypto_key_exchange to generate a shared symmetric key.

Did that clear it up? Got any concrete idea how to improve the page with that new knowledge?

There should probably be a function like "crypto_is_dangerous_key". If you're randomly generating keys for each session it would be nice to be able to check the key you're sending isn't going to be rejected for being malicious.

There's a total of seven keys. That's odds of seven in 115792089237316195423570985008687907853269984665640564039457584007913129639936. These odds are negligible. In fact, if this random generation happens, whoever runs into it is the luckiest person alive, probably.

That said, you do raise a point: All-0 is one of the broken keys and the only way to get that is if your random number generator has brain damage (or /dev/urandom got symlinked to /dev/null). It's less a key validity check than a RNG compromise check, however. Not sure what to do about this.

crypto_memcmp

The functions have been removed, as it says on the manual page. However, the manual page itself has been kept around so that people who use an old version of Monocypher can (1) notice they're on an old version of Monocypher, and (2) have a clearly marked upgrade path.

Updating for constant time is not an option. #38 has proven fairly impossible to solve. OpenBSD exposes timingsafe_memcmp in the stdlib, but it's an outlier.

crypto_verify16

"Standard comparison functions tend to exit as soo as they find a difference" - "as soo" -> "as soon"

+1

how about: [...]

Not a fan of actively talking to the reader with "you". Rest of the changes to the first paragraph are okay.

"provide comparison functions whose timing is independent from the content of their input." - maybe: "provide comparison functions that run in constant time"

Technically, they're not constant-time. The execution time depends on the length parameter (16, 32, 64). Though given the length parameter is now hardcoded into the function, +1 for the change.

crypto_wipe

"Secrets and values derived from them should stay in memory for the shortest possible amount of time..." - how about: [...]

No objections other than maybe taking out the paragraph there, but that's very debatable and I have no strong feelings either way.

Size argument should maybe be called secret_size.

I have no strong feelings either way.

LoupVaillant commented 7 years ago

Loup, forget about an "imminent" release.

Indeed. I love what I'm seeing so far, this will be worth it. Now my 2 cents (I haven't commented on what I already agree with):

dictionary coders

I believe compression is indeed responsible for the recovery of encrypted audio through traffic analysis. I agree we should mention them explicitly. If I got that correctly, the length of a compressed output more or less reveals how much information the message contains. Note that not compressing the message at all may not help —the actual message length is still revealed. I think the most important point here is that variable-length encodings, of any kind, reveal more information than fixed-length encodings. I'm not sure how to say it best yet.

About dictionary coders in general, does that leak still exist if the compressed data including decompressed length is encrypted?

Encrypted data is indistinguishable from random. Entropy is maximised, you cannot compress that. If you find a way to reliably compress encrypted data, that's a proof the cipher is broken beyond repair —you basically need to recover the plain text to compress the stream.

The main problem with compression occurs when you send many messages. I can see only three solutions to mitigate this:

Packing several messages into one, and send that. That way you reveal less lengths. This increases latency.
Padding the messages. After the compression. That way each revealed length contains less information. This increases bandwidth.
Use a fixed-length encoding. That way lengths reveal nothing other than communication is being performed. This has to sacrifice either bandwidth or quality, assuming fixed-length encodings have a worse compression rate.

Validation layer

Monocypher's policy is that the user is responsible for providing correct inputs. No validation is being done. This mostly comes from the C language: the simplest interfaces are unsafe. I don't see a way around that. I could have added error codes, but then users would feel the need to check them, complicating their code further. That said, Monocypher does try pretty hard to be robust. Most buffers can overlap arbitrarily, as checked by the test suite.

I do think however we could increase safety when binding Monocypher to other languages. C++ and OCaml will probably have some type based compile time guarantees, Lua and Python will have runtime checks.

Authenticated encryption

It is okay to use a counter for the nonce. It's just a little harder. We just have to make sure to increment that counter, and also not using the same stream of counters if we're doing dual channel communication.

The reason the length of the additional data is not automatically authenticated is because it makes the absence of additional data equivalent to a zero-length additional data. This ensures authenticated encryption of both kinds (with and without additional data) are compatible with each other. This also makes the whole thing simpler —less room for me to botch the construction, easier incremental interface for the user.

I have estimated that the odds of misuse are minimal: messages and parts thereof have to have either an implicit size (fixed or not), or they must encode their own size. The only mistake one could make is to rely on the (insecure) transport protocol to communicate the size. But even that is rarely a problem, since the size of the whole message (additional data and secret message) is authenticated. Problems only occur when we send the additional data separately, and make the mistake of relying on the transport protocol to determine its length. In all other cases, authenticating the length at the Monocypher level is just redundant.

Argon2i

malloc() will yield a suitable chunk of memory indeed. The standard doesn't guarantee any alignment, but we don't care as long as we can put a uint64_t array in there.

I agree having to append 0, 0, 0, 0 at the end of the argument list is a bummer. I insist we keep them however. There are legitimate (though niche) cases for both a key and additional data, and having the user concatenate them himself is error prone. Had I designed this myself, I would probably hash all the inputs in a single Blake2b hash, then run the rest of Argon2 on that. That way users could have taken advantage of the streaming interface of Blake2b to hash several buffers without copying anything. I didn't, so conforming to the existing design is easier.

How about this:

argon2i_general(hash, 32, work_area, pass, 100000, 3, pass_size, salt, 16, 0, 0, 0, 0);
argon2i        (hash, 32, work_area, pass, 100000, 3, pass_size, salt, 16);

That way we're consistent with Blake2b's naming convention. It breaks compatibility, but the compiler will provide a suitable error message (too many arguments).

Blake2b

The naive approach is concatenating the session key and the message, then hash that. That is, compute H(key|message), and transmit that MAC. Problem is, older hashes reveal their own internal state in the result, so an attacker can use that information to compute H(key|message|forgery), and transmit that instead. Blake2b is immune to this attack. Thanks to that, it can (and does) use the naive approach for its "keyed mode".

memcmp() and the like should only be used to compare non-secrets. If the hash itself is secret, or the hash input was secret, best be safe and use crypto_verify*(). The manual probably needs some clarification.

HChacha20

The reason why I expose this is because we need it for the key_exchange() function.

Monocypher has 2 layers: low level functions, that regular users should not use (Chacha20, HChacha20, Poly1305, and X25519); and high level functions meant for the users. The idea is to implement all high-level functionality on top of low level functions. Advanced users should be able to reconstruct the high-level stuff from the low-level primitives. Without HChacha20, they couldn't reconstruct crypto_key_exchange(), so I've decided to expose it.

Another policy could be to not expose low-level primitives at all. I don't think this would be a good idea: what if some functionality is missing? And I have an example from a few days ago: random streams. Some users may want to use a user-space random number generator despite all the warnings. I don't know, maybe there's a legitimate use case. Well, Monocypher doesn't have an RNG with proper key erasure. But users can build that on top of crypto_chacha20_stream() if they know what they're doing. We could also add it to Monocypher itself, but that could be seen as an endorsement, so I'd rather not.

Now I'd definitely consider omitting omit low-level primitives in language bindings.

Key exchange

We should probably tell people not to touch [X25519] in the normal case.

It is indeed a low-level function, the manual should have stated that clearly.

I do see a legitimate (though niche) use case: dual channel. Instead of using HChacha20 to hash the shared secret, use Blake2b. With a 64 bytes hash, you get two keys, to facilitate dual channel communication: having each party use its own key facilitates nonce management (when nonces are counters).

Keep in mind that public keys are not generated randomly. Only the private key is. And the private key isn't a point on the curve, it's a scalar. Genuine public keys are always on the curve, by construction (it's the base point times the private key). I haven't checked, but I'm pretty sure the evil keys are all outside the curve. As such, they would be mathematically impossible to generate by accident. Now if you have hardware faults…

Crypto wipe

secret_size indeed. I don't like it, but I reckon it's more consistent with the rest of the manual.

ghost commented 7 years ago

It breaks compatibility, but the compiler will provide a suitable error message (too many arguments).

We're breaking compatibility in this general area by removing crypto_memcmp anyway. We can afford breakign argon2i's invocation while we're at it.

+1 for the proposed interface.

Nothing else to tack on.

mikejsavage commented 7 years ago

re the dictionary coder thing, LZ compression basically has two modes of operation:

send some new data (literals)
send some data we have already seen (matches)

So if we wanted to compress the string "hello world", there are no repeated substrings long enough to make it worth sending a match and the whole thing would be sent as a literal. If instead we compressed "hello hello", that would get sent as [literal:hello ][match: offset = -6, length = 5]. "hello hellp" would get sent as [literal: hello ][match:offset -6, length = 4][literal:p], etc. It's important to note that matches use fewer bytes than literals (or it wouldn't be compressing!)

The issue with mixing LZ + encryption is that it enables a chosen plaintext attack. The example I know is that if an attacker can force the client to send HTTPS requests to the server (e.g. by using javascript to repeatedly submit a form), the request contains the user's session cookie, and the attacker controls the request body. Then the attacker can keep sending random request bodies until the compressed + encrypted data becomes a few bytes shorter, meaning the request body matches something that appeared earlier in the request, preferably the user's session cookie.

I guess similar attacks exist for other compression algorithms but it's by far the easiest with LZ, and every mainstream compressor is LZ + something else.

Will respond to the rest and do some more reading this evening

LoupVaillant commented 7 years ago

Okay, dictionary coders are more serious than I thought. Definitely needs its own subsection, either in the introduction or in the authenticated encryption page.

mikejsavage commented 7 years ago

(also mostly not commenting on things I agree with)

Re CuleX:

This is probably not the right place to talk about extra code, but with the validation layer thing I meant something like having a compile time flag that makes monocypher abort if you get things wrong.

Not a fan of actively talking to the reader with "you". Rest of the changes to the first paragraph are okay.

The manual isn't totally consistent about that atm so I wasn't sure. At a glance it looks like general advice talks about the user, specific advice and instructions talks about you

crypto_blake2b

Must be a buffer of at least hash_size bytes?

My point was that the manual doesn't say this kind of thing anywhere else (or if it does it's not consistent)

eg from argon2i, hash Buffer for the output hash., only the docs for hash_size mention the relationship between hash and hash_size.

crypto_key_exchange

I was talking about the paragraph under security considerations, where it talks about forward secrecy:

Users who want forward secrecy need to generate temporary public keys, send them to one another, (use crypto_lock(3monocypher) to authenticate them), and compute a shared secret with those temporary keys.

I thought randomised key exchange would look something like this:

Both sides know each other's public keys
Both sides generate a new random key pair
Both sides sign the new public keys with their pre shared public keys and send them to each other
Both sides can authenticate the new public keys
Both sides do key_exchange with the new public keys

but the bit about crypto_lock suggests something like this:

Both sides do key_exchange with the pre shared public keys
Both sides generate new random key pairs
Both sides send the new public keys to each other, using crypto_lock with the key derived earlier
key_exchange again

Is there any reason to prefer one over the other? Seems like the latter uses an extra pair of messages

crypto_memcmp

OpenBSD exposes timingsafe_memcmp in the stdlib, but it's an outlier.

Yeah that's what I was thinking of. I suppose the big difference there is that OpenBSD is shipped precompiled so they can make sure it works on the platforms they target.

Re Loup:

Monocypher has 2 layers: low level functions, that regular users should not use (Chacha20, HChacha20, Poly1305, and X25519); and high level functions meant for the users.

I see!

I haven't checked, but I'm pretty sure the evil keys are all outside the curve. As such, they would be mathematically impossible to generate by accident.

That's good to know, maybe worth mentioning in the manual

mikejsavage commented 7 years ago

Alright here's the pages I didn't do yesterday:

crypto_unlock

text_size argument is not documented!

crypto_chacha20_encrypt.html crypto_chacha20_init.html crypto_chacha20_set_ctr.html crypto_chacha20_stream.html crypto_chacha20_x_init.html

No args description block
crypto_chacha20_init probably shouldn't be exposed? It's only used inside chacha20_x_init
"The recommended initialisation routine is crypto_chacha20_x_init()." -> "crypto_chacha20_x_init is recommended over crypto_chacha20_init."
"Once the encryption is done, the context is updated to allow subsequent calls of crypto_chacha20_encrypt() on the rest of the message, if any." -> "You may call crypto_chacha20_encrypt repeatedly with the same context struct to encrypt a message incrementally."
Should that be called crypto_chacha20_update instead? (low level API so probably doesn't matter)
"The plain_text and cipher_text may point to the same buffer for in-place encryption." - drop the first "The"
"The plain_text pointer is allowed to be NULL (0)" -> maybe drop the "(0)", maybe rephrase as "plain_text can be NULL"
"Decryption is the same as encryption." - may be worth explaining this a little more, how about "Since XOR is its own inverse, decryption is the same operation as encryption."
"To Decrypt the cipher_text," - lose the capital D
From the examples, "/ The plain text likely needs to be processed before you wipe it /", maybe add crypto_wipe(plain_text) below that
"Encryption chunk by chunk (same as simple encryption):" -> "Chunk by chunk encryption" or "Incremental encryption" (you say incremental on other pages)
"In place encryption (same results as simple encryption):" - this example is probably unnecessary
"Encryption by jumping around (don't do that):" - don't include it as an example then!
Major nitpicking, in the code you have if (plain_text != 0) and if (plain_text) in crypto_chacha20_encrypt

crypto_check.html crypto_sign.html crypto_sign_public_key.html

No block describing the arguments!
"computes a public key from the specified secret key. That key must be random." -> "The secret key must be random."
"The key pairs generated here are not the same as those used for crypto_key_exchange(3monocypher). Do not use the same private key for both purposes." - might be worth including a similar note in the key_exchange manual
"To ascertain the origin of a secret message, use crypto_key_exchange(3monocypher) instead." - can you sign an encrypted message to achieve the same thing, but without a roundtrip?
"crypto_check() returns zero for legitimate messages and -1 for forgeries." - maybe write 0 since it's a return value (you say return 0 elsewhere)
The security considerations paragraph seems insane and not worth mentioning!

crypto_check_final.html crypto_check_init.html crypto_check_update.html crypto_sign_final.html crypto_sign_init_first_pass.html crypto_sign_init_second_pass.html crypto_sign_update.html

No args block (it does say "The arguments are the same as described for the direct interface on crypto_sign(3monocypher).". If you want to keep that I would reword it slightly, "The arguments are the same as the direct interface described in crypto_sign.")
"This incremental interface allows splitting up decryption and authentication messages too large to fit into a single buffer" - "This incremental interface allows you to sign or verify messages too large to fit in a single buffer."
Maybe drop the "or to handle individual pieces of data located in different buffers"
The instructions might need to be a bit more detailed. Particularly "x pass proper" and "Update with crypto_check_update".
"returns zero" again
In the first example, qfor -> for

crypto_lock_auth.html crypto_lock_encrypt.html crypto_lock_final.html crypto_lock_init.html crypto_lock_update.html crypto_unlock_final.html crypto_unlock_update.html

"These functions are variants of ..." - I would be explicit and say "These functions provide an incremental interface to ..."
"This incremental interface allows splitting up decryption and authentication of messages too large to fit into a single buffer" again
"The arguments are the same as described for the direct interface on crypto_lock" again
"This interface uses four steps:" -> "Incremental encryption is done in four steps:"
"crypto_lock_encrypt() encrypts or decrypts data without authenticating it. It is meant as a building block." - should that be in another man page?
"Corruption can happen because of transmission errors, programmer error, or attacker interference. Always check the return value." - you don't say anything like this on the pubkey crypto page at least
Maybe wipe the plaintext in the decryption example.
In place encryption example is probably unnecessary

LoupVaillant commented 7 years ago

Excellent, thank you! I agree with almost all your proposed changes, so this will be simple.

I can propose 2 workflows:

I can make the changes you suggested myself, one commit per page. Then you can comment, either on the patch or here. Then I make (hopefully) final adjustments to the page. I can work on a different page while you comment on an older one.
You can make the changes you suggested yourself, then I merge your pull request and make my final adjustments. More work for you, especially if you're not familiar with man page editing, or if you don't like my 72 characters OCD. You may like the visibility, though.
Edit: or you could just feel content and leave the rest to us. You certainly deserve to feel content :smile:.

Which do you prefer?

In the meantime, my comments:

Validation layer

I meant something like having a compile time flag that makes monocypher abort if you get things wrong.

This would complicate the source code, and would only be partial. I'm against it.

But.

I am not against a mechanism that would not touch the source code. With some macro magic, we could replace all calls to crypto_* with calls to crypto_validation_* functions, and have them perform lots of tests, some of which would be meant for Valgrind, or sanitisers such as MSan or UBSan. Doing this properly is likely to be a significant undertaking. I don't feel like doing it myself any time soon. I would however accept contributions.

Usage of "you"

The manual doesn't match my usual writing style. I'm usually direct, personal, and pretty familiar. The manual is meant to be more neutral and "professional" looking. Inconsistencies naturally remain.

The style I try to settle on is:

Avoid using the word "you", except when it's awkward.
Use the imperative form when providing unambiguous directives.

(Fun fact: the above list follows its own advice.)

Buffer sizes

Variably sized buffers should all be commented the same way. I personally vote for using the buffer_size argument to document the size. For instance:

hash: Output hash.
hash_size: Size of the output hash.

I will let you two override me if you agree on some other formulation.

Key exchange vs signatures

There are 2 cases to consider, depending on whether Alice and Bob knows each others X25519 public keys, or if they know each other's EdDSA public keys. In both cases, they only need 3 messages total to establish a secure channel.

Let's start with X25519. Alice already has an alice_sk, alice_pk x25519 key pair, and Bob knows alice_pk. Bob already has a bob_sk, bob_pk x25519 key pair, and Alice knows bob_pk.

Alice randomly generate a tmp_alice_sk.
Alice calls crypto_x25519_public_key(tmp_alice_pk, tmp_alice_sk).
Alice calls crypto_key_exchange(tmp_key, tmp_alice_sk, bob_pk).
Alice generates an all-zero 24-byte nonce.
Alice use crypto_aead_lock() to authenticate the ephemeral public key.
Alice sends mac and tmp_alice_pk

Bob receives Alice's messages. After authentication, Bob knows he got a genuine message from Alice, or a replay from an attacker.
Bob does the same as Alice, except he'll be using Alice's ephemeral public key. Now that both parties have generated an ephemeral key, they have an ephemeral shared symmetric key to exchange messages.
Bob uses the ephemeral shared symmetric key to authenticate his own ephemeral public key, which he then sends to Alice.

Alice receives Bob's message. She knows it came from Bob, and it wasn't a replay (it used the ephemeral shared secret).
Alice authenticates her temporary public key again, this time with the ephemeral shared key. She then sends the result to Bob. Warning: she must use a different nonce than Bob.

Bob receives Alice's message, an now he knows it's not a replay. The secure channel has been established, they can now communicate.

The same can be done with signatures. The generation of ephemeral public keys is the same, they just sign it with EdDSA instead of the key_exchange + crypto_lock above.

Sleep time. I will respond to the rest tomorrow

LoupVaillant commented 7 years ago

evil public key

A quick search couldn't confirm that evil public keys cannot be generated the legitimate way, but the original paper seems to assume it's self evident (and thus doesn't confirm anything :angry:). I'm still tempted to pretend they cannot, because the chances are really negligible even if they can. Something like "public keys generated with crypto_x25519_public_key() are always contributory" or something.

misc

crypto_chacha20_init should be exposed, so users can use regular Chacha20 if they know what they're doing (Chacha20 is slightly faster to initialise), or for compatibility with some protocol (I'm thinking IETF). It's also a standard primitive, on top of which crypto_chacha20_x_init() is implemented.

I've considered crypto_chacha20_update, then decided against it. First, I need to differentiate it from crypto_chacha20_stream, and second, there is no crypto_chacha20_final (its only purpose would be to wipe the context, and we have crypto_wipe() for that).

Encryption by jumping around is meant to demonstrate crypto_chacha20_set_ctr, which does have legitimate, but harder to demonstrate, uses. Maybe I should write that much?

Hardware faults can happen for real with non-hardened smart cards or USB keys if you insert into an untrusted power source (which it always is, that's why the dongle does its own encryption in the first place). Though those who implement that are likely to be aware of the issues to begin with. The consequences are also dire: full recovery of a long term private key, that's the worst catastrophe possible. While I'm not sure about being worth mentioning, that paragraph isn't as insane as it sounds. Maybe we could shorten it?

The only reason to check the public key is when you're not sure about contributory behaviour. Public keys you already know and trust don't have to be checked. Also, failure to check for contributory behaviour is not that bad: on a 1 on 1 conversation, the worst thing that can happen is that you broadcast your responses instead of writing back to the attacker. Message authentication code however are different for each message, and do need to be checked each and every time if you don't want to expose yourself to full blown forgeries. It's also pretty easy to forget to check the MAC, since you don't even need that to carry on. That said, I do agree we should put a few more words in the security consideration section of key exchange.

mikejsavage commented 7 years ago

I don't mind making PRs

Will go through the pages in the same order as listed in my comments here

LoupVaillant commented 7 years ago

Wonderful. Until the next PR, then.

LoupVaillant commented 7 years ago

Merged #59, #60, and #61, thanks! I'll enact my comments myself so you can concentrate on the other pages. Good job. :+1:

mikejsavage commented 7 years ago

Thanks

Will try to knock out the rest this weekend

On 23 Nov 2017, at 2:15 am, Loup Vaillant notifications@github.com wrote:

Merged #59, #60, and #61, thanks! I'll change some last couple words tomorrow so you can concentrate on the other pages. Good job. 👍

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.

LoupVaillant commented 7 years ago

Nice!

mikejsavage commented 7 years ago

I got sick this weekend! But I made some new PRs tonight.

Still TODO:

crypto_key_exchange
crypto_poly1305
crypto_lock_auth
adding missing args blocks?
checking capitalisation consistency in args blocks and example comments
checking the examples
second pass over everything

LoupVaillant commented 7 years ago

Ah, hope you're better. I'll review your PRs tomorrow, thanks!

LoupVaillant commented 7 years ago

(And yes, we should add the missing arg blocks. I'll let you decide if you want to do it.)

LoupVaillant commented 7 years ago

All merged, thanks! Also thanks @CuleX for the review. I would have missed what you spotted.

mikejsavage commented 6 years ago

PRs up for incremental AEAD and Poly1305

mikejsavage commented 6 years ago

btw are the man pages 70 chars wide or 72?

I've been using 72 and rewrapping things without thinking about it. I took a closer look and it looks like you're using 70

ghost commented 6 years ago

I couldn't find any authoritative answer, but I've personally always rolled with 72. It just frequently happened that I had to line break earlier.

mikejsavage commented 6 years ago

let's see what Loup has to say about that one

crypto_key_exchange PR is up too

LoupVaillant commented 6 years ago

I'll look the PRs tomorrow, thanks.

About 70 vs 72 columns, I don't care much. I use the default settings of Emacs' auto-fill mode, which happens to be set at 70 columns (which I didn't know). 72 are fine, don't bother wrapping at 70 (And as @CuleX hinted, it gives the same result in many cases anyway.)

I'll think about changing my settings though, I've always though of 72 as the canonical width.

LoupVaillant commented 6 years ago

Okay, all merged. I'll change a couple lines myself this week-end. Looks like we'll be able to wrap this up before Christmas!

ghost commented 6 years ago

@LoupVaillant @mikejsavage About Argon2i, I just remembered that the implementation in Monocyphers diverges from the broken reference implementation (non-HTTPS link).

Should we put that into the crypto_argon2i man page? It may affect interoperabilty with other systems.

On the topic of Argon2i: Should we consider Argon2id over Argon2i? I think it wasn't on the radar when work on Monocypher started, and now it's a thing apparently. From the RFC draft:

(p. 3) Argon2 has one primary variant: Argon2id, and two supplementary variants: Argon2d and Argon2i.

(p. 13) We recommend the following procedure to select the type and the parameters for practical use of Argon2.

Select the type y. If you do not know the difference between them or you consider side-channel attacks as viable threat, choose Argon2id.

LoupVaillant commented 6 years ago

Actually, Monocypher does not diverge. I have chosen to be bug-compatible. The authors of Argon2 estimate the bug has a negligible impact, and their arguments made sense.

I am aware of Argon2id, and I don't trust it much. If there are side channel attacks, the "d" part of Argon2id is broken, and the rest is easier to brute force. It's sort of a middle ground between full side channel protection and mathematical safety.

I could implement Argon2d and Argon2id later, it wouldn't take much code.

ghost commented 6 years ago

Noted about the bug compatibility. At this point it might be easier to just change the spec to match reality.

@LoupVaillant It seems to me that Monocypher is an opinionated library. I only brought up Argon2id because the RFC seems to heavily push it and I'm not a domain expert on any of these algorithms. Argon2i already appears to be a sane default. I'm not sure if there's much benefit to be had from just having more functions for the sake of having them.

LoupVaillant commented 6 years ago

At this point it might be easier to just change the spec to match reality.

The devs promised that much, but it doesn't show up on the latest spec yet.

I'm not sure if there's much benefit to be had from just having more functions for the sake of having them.

Neither am I, which is why I'll only add them if there's explicit demand.

ghost commented 6 years ago

Maybe they forgot about it or something? You may want to nudge again once it's been >6 months.

~~Docs feedback, issue 58: Actually just a catch-all.~~

LoupVaillant commented 6 years ago

OK, I'm about to close this issue as "good enough" and release the next version of Monocypher.

Any last words?

mikejsavage commented 6 years ago

I can look over your recent changes tonight or tomorrow, would still like to do another read through everything but fair enough if you don't want to wait for that

mikejsavage commented 6 years ago

not been that many changes since my last one, I'll take a look at them tonight

ghost commented 6 years ago

@LoupVaillant I'd say "hold it" for @mikejsavage's second pass in that case.

Besides, it's holiday season. Landing on HN with a new release is going to be a complete crapshoot, especially since everyone's memeing on 34c3 now, too. The timing would be rather unfortunate.

LoupVaillant commented 6 years ago

I agree, I'll wait for @mikejsavage's feedback (thank you for being available, by the way).

As for the HN submission, I was thinking I could hold it until I set up monocypher.org (properly, with TLS).

LoupVaillant commented 6 years ago

Oh, and if any of you feel like doing a last read through everything, I will also wait for that. While I am personally tired of working on this manual, I don't mind some more waiting at all.

ghost commented 6 years ago

Not going to touch it anymore. I've got text revision burnout since completing my thesis.

mikejsavage commented 6 years ago

I made some small comments on various commits, which i now realise is a big pain in the ass

will package them up into a pr tomorrow, then try to read through the rest again

LoupVaillant commented 6 years ago

A PR would be wonderful, thanks. I basically agree with all your suggestions, by the way.

mikejsavage commented 6 years ago

args blocks are somewhat inconsistent between "The length of the output hash in bytes." and "The length of the output .Fa hash in bytes."

I think it should either by the first or "The length of .Fa hash in bytes." but i don't care that much

any opinions?

edit: i am changing "The length of the output .Fa hash in bytes" to not have the .Fa and leaving the rest as is

mikejsavage commented 6 years ago

btw why isn't the message arg to blake2b/blake2b_general const?

ghost commented 6 years ago

args blocks are somewhat inconsistent between "The length of the output hash in bytes." and "The length of the output .Fa hash in bytes."

I think it should either by the first or "The length of .Fa hash in bytes." but i don't care that much

IMO, always use .Fa if it fits.

btw why isn't the message arg to blake2b/blake2b_general const?

Because I was apparently half-asleep when I wrote it. const may be off with the actual prototypes at other points, now that I think about it. Scary thought. I'll go ahead and give function prototypes another pass.

mikejsavage commented 6 years ago

last arg to void crypto_chacha20_stream is called size in the header, stream_size in the manual

ghost commented 6 years ago

last arg to void crypto_chacha20_stream is called size in the header, stream_size in the manual

@LoupVaillant This probably requires a code fix.

(deleted comment) const may be off with the actual prototypes at other points, now that I think about it. Scary thought. I'll go ahead and give function prototypes another pass.

I am doing that now fyi

I see you deleted the comment. I'll assume that means you're not doing it?

mikejsavage commented 6 years ago

chacha20 examples use sizeof for buffer sizes, other pages don't

should we fix the examples to only use one or the other?

mikejsavage commented 6 years ago

I see you deleted the comment. I'll assume that means you're not doing it?

yeah sorry I misunderstood. I am checking the prototypes match what the man pages say

ghost commented 6 years ago

chacha20 examples use sizeof for buffer sizes, other pages don't

should we fix the examples to only use one or the other?

I'm certain that we've agreed some time ago to avoid sizeof for the examples. I don't particularly feel like searching the issue that was decided in. Could've also been some comment on a commit.

yeah sorry I misunderstood. I am checking the prototypes match what the man pages say

Here are the ones I found, carry on after those:

index af4984a..569c50b 100644
--- a/doc/man/man3/crypto_blake2b.3monocypher
+++ b/doc/man/man3/crypto_blake2b.3monocypher
@@ -14,16 +14,16 @@
 .Ft void
 .Fo crypto_blake2b
 .Fa "uint8_t hash[64]"
-.Fa "uint8_t *message"
+.Fa "const uint8_t *message"
 .Fa "size_t message_size"
 .Fc
 .Ft void
 .Fo crypto_blake2b_general
 .Fa "uint8_t *hash"
 .Fa "size_t hash_size"
-.Fa "uint8_t *key"
+.Fa "const uint8_t *key"
 .Fa "size_t key_size"
-.Fa "uint8_t *message"
+.Fa "const uint8_t *message"
 .Fa "size_t message_size"
 .Fc
 .Ft void
@@ -34,7 +34,7 @@
 .Fo crypto_blake2b_general_init
 .Fa "crypto_blake2b_ctx *ctx"
 .Fa "size_t hash_size"
-.Fa "uint8_t *key"
+.Fa "const uint8_t *key"
 .Fa "size_t key_size"
 .Fc
 .Ft void
diff --git a/doc/man/man3/crypto_lock_init.3monocypher b/doc/man/man3/crypto_lock_init.3monocypher
index c9371c4..61bfd35 100644
--- a/doc/man/man3/crypto_lock_init.3monocypher
+++ b/doc/man/man3/crypto_lock_init.3monocypher
@@ -21,8 +21,8 @@
 .Ft void
 .Fo crypto_lock_auth
 .Fa "crypto_lock_ctx *ctx"
-.Fa "const uint8_t *ad"
-.Fa "size_t ad_size"
+.Fa "const uint8_t *message"
+.Fa "size_t message_size"
 .Fc
 .Ft void
 .Fo crypto_lock_update
diff --git a/doc/man/man3/crypto_poly1305.3monocypher b/doc/man/man3/crypto_poly1305.3monocypher
index a25cebc..6b76378 100644
--- a/doc/man/man3/crypto_poly1305.3monocypher
+++ b/doc/man/man3/crypto_poly1305.3monocypher
@@ -30,7 +30,7 @@
 .Ft void
 .Fo crypto_poly1305_final
 .Fa "crypto_poly1305_ctx *ctx"
-.Fa "const uint8_t mac[16]"
+.Fa "uint8_t mac[16]"
 .Fc
 .Sh DESCRIPTION
 Poly1305 is a one-time message authentication code.

mikejsavage commented 6 years ago

lock_auth args are called ad/ad_size in the manual, message/message_size in the header

ghost commented 6 years ago

@mikejsavage Fix the manual's definitions for lock_auth. Authentication is for messages, ad is for AEAD constructions. CC @LoupVaillant

mikejsavage commented 6 years ago

crypto_lock_encrypt seems like a dangerous function!

it's got the same signature as crypto_lock_update, has no warning in the header, and is actually grouped in the middle of the normal functions in the header while lock_update is at the bottom

EDIT: let's just make a list of action points for loup once this is done so he doesn't have to trawl through all of this

mikejsavage commented 6 years ago

crypto_lock_auth is the function that incrementally adds additional data, so I think the manual is right here

LoupVaillant / Monocypher

Docs feedback #58

intro

crypto_lock

crypto_argon2i

crypto_blake2b

crypto_chacha20_H

crypto_sign, crypto_poly1305

crypto_key_exchange

crypto_memcmp

crypto_verify16

crypto_wipe

dictionary coders

Validation layer

Authenticated encryption

Argon2i

Blake2b

HChacha20

Key exchange

Crypto wipe

crypto_blake2b

crypto_key_exchange

crypto_memcmp

Validation layer

Usage of "you"

Buffer sizes

Key exchange vs signatures

evil public key

misc