LoupVaillant / Monocypher

An easy to use, easy to deploy crypto library
https://monocypher.org
Other
580 stars 80 forks source link

Printing out the mac reveals the plain text when compiled with gcc (HIGHLY SEVERE) #273

Closed quadroli closed 2 months ago

quadroli commented 2 months ago

All the code:

#include <string.h>
#include "monocypher.h"

#define CIPHER_SIZE 40

int main(void){
  uint8_t my_message[] = "I am a very cool boy";
  size_t text_size = strlen((char *)my_message);
  uint8_t cipher_buffer[CIPHER_SIZE];
  uint8_t mac[16];
  uint8_t key[] = {0x60, 0x3d, 0xeb, 0x10, 0x15, 0xca, 0x71, 0xbe, 
          0x2b, 0x73, 0xae, 0xf0, 0x85, 0x7d, 0x77, 0x81,
                  0x1f, 0x35, 0x2c, 0x07, 0x3b, 0x61, 0x08, 0xd7, 
          0x2d, 0x98, 0x10, 0xa3, 0x09, 0x14, 0xdf, 0xf4};

  uint8_t nonce[]  = {0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7, 
             0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff, 
             0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7, 0xa8};

  crypto_aead_lock(cipher_buffer, mac, key, nonce, NULL, 0, my_message, text_size);
  printf("My message encrypted: %s\n my mac %s\n of size %lu\n", cipher_buffer, mac, strlen((char *) mac));
  return 0;
}

Using gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04) PS: with clang 16, this does not happen

quadroli commented 2 months ago

It seems to be dependent on the length of the string because with uint8_t my_message[] = "I am a very"; this does not happen

wamserma commented 2 months ago

This is not related to Monocypher, but a classic example of CWE-170: Improper Null Termination leading to CWE-126: Buffer Over-read.

The printf call given uses %s to insert the MAC, but %s expects a null-terminated string. In other words, it will read characters from the mac array until it encounters a zero byte. If none is present in the array, characters from beyond the array will be taken, from whatever comes next on the stack.

A short example to showcase this (without using monocypher):


#include <stdint.h>
#include <stdio.h>

// Demo for CWE-170 (https://cwe.mitre.org/data/definitions/170.html)

int main(void){
  uint8_t my_message[] = "I am a very cool boy";  // the string is implicitly null-terminated
  uint8_t mac[4] = {'a', 'b', 'c', 'd'}; 

  printf("My mac %s\n", mac);

  // add a null in the middle of the string and print again
  my_message[6] = 0x00;
  printf("My mac %s\n", mac);

  return 0;
}
quadroli commented 2 months ago

Thank you for shedding more light on this, I did not know. But I had a hunch it may be compiler dependent.

Given the severity of the repurcursions of this in such a setup, as illustrated, what would be an adequate way to mitigate the problem ? I can only imagine ensuring both the mac and cipher text are null terminated.

Also, out of curiosity, how comes clang isn't affected ? Have they given this issue more thought ? Or could it be by fluke ?

wamserma commented 2 months ago

With the simplified example above I get consistent behaviour for GCC 13 and Clang 16. (I did not take care of any flags.)

You can mitigate this somewhat by ensuring all buffers you write to (mac, cipher_buffer) are at least one byte longer than the expected amount of data and that the last byte of the buffer is initialized to 0.

Better, thou, to fix the root cause: confusing uint8_t arrays with Strings. The code above will also happily output non-printable characters from the encrypted message or the MAC. It will also prematurely end output of the encrypted message or the MAC if they contain a null byte.

Again, all this is in no way related to Monocypher, just C being C. I think this issue can be marked as resolved.

quadroli commented 2 months ago

Fair enough, I'm satisfied with the answer as well, however I would like to leave it open for a bit to hear what comments the maintainers may have on the matter.

LoupVaillant commented 2 months ago

Hi, thank you @quadroli for taking the time to fill out the report, and @wamserma for answering while I'm busy.

So, I can confirm the reported problem here is on the C language and the incorrect use of Monocypher's API. The short of it is, C is unsafe, and using pretty much any C API, including Monocypher, is unsafe too. Worse, there's nothing we can do to fix it.

My philosophy with Monocypher is to provide the simplest API that I can, then trust the programmer with it. Most functions for instance return void, because they are guaranteed to work with correct input. It communicates to the user that they do not need to check errors that will never happen, but the flip side is that the onus is on them to use the function correctly.

We could try another philosophy where Monocypher tries to catch programmer errors like using a null pointer instead of a real one. I gave up on this approach, because there are 2 classes of errors a C API can never catch: providing an invalid non null pointer (either one outside the bounds of a buffer, or one that has been freed, or an uninitialised pointer); and providing a pointer to a buffer that is too small (thus causing my functions to overrun said buffer). Merely catching null pointers, which would crash the program in most systems anyway, didn't seem worth the cost of incentivise users to uselessly check return values.

Now your error falls into a third category: reading past a buffer Monocypher filled, using something other than Monocypher, namely printf(3). This one is definitely not on me. :-) Though I can see why you made the mistake: you may have expected, perhaps coming from a more dynamic language, that the MAC would be encoded in a hexadecimal string or similar. It's not, Monocypher works in binary data only. If you want hexadecimal values you'll have to do the conversion yourself. That's what I do in my test suite.

One last thing: the reason different compilers behave differently is because this is Undefined Behaviour. That is, behaviour for which the standard imposes no requirements, so anything could happen, including leaving a critical Remote Code Execution vulnerability. In practice this mostly mean that different compilers, or different compiler settings, will give you various incomprehensibly different results whenever your program hits UB.