Please fix Incorrect curr_pos Updates and Length Handling in ChaCha20_Poly1305::aead_encrypt for Accurate MAC Data Construction

Detailed Analysis:

Incorrect Updating of curr_pos in MAC Data Construction

In the ChaCha20_Poly1305::aead_encrypt function, you're constructing the data to be authenticated (MAC data) for Poly1305. The curr_pos variable is used to track the current position in the mac_data buffer. However, there are issues with how curr_pos is being updated, leading to incorrect offsets and possibly including uninitialized data in the MAC computation. Code Snippet:

cpp

size_t curr_pos = 0; memcpy(mac_data, AAD, AAD_len); memset(mac_data + (curr_pos += AAD_len), 0x00, padding1); memcpy(mac_data + (curr_pos += padding1), outputCipher, textLen); memset(mac_data + (curr_pos += textLen), 0x00, padding2); memcpy(mac_data + (curr_pos += padding2), &AAD_len, 8); memcpy(mac_data + (curr_pos += 8), &textLen, 8);

Issues Identified:

Double Incrementing curr_pos:
    In each line, curr_pos is being incremented and then used as the starting point for the next operation. However, the way you're incrementing curr_pos leads to it being incremented before the operation instead of after, causing it to skip positions.
    For example, curr_pos += AAD_len increments curr_pos by AAD_len, but then you increment it again in the next line with curr_pos += padding1, effectively skipping over data.

Incorrect Positioning:
    The incorrect updates to curr_pos can cause the lengths (AAD_len and textLen) to be copied into the wrong positions in the mac_data buffer.
    This may result in the MAC data being constructed differently each time, especially if there's uninitialized or random data in the unused buffer space.

Incorrect Handling of Length Fields (AAD_len and textLen)

Size of size_t: The size_t type can vary in size between platforms (e.g., 4 bytes on 32-bit systems and 8 bytes on 64-bit systems). When you copy AAD_len and textLen into the mac_data buffer using memcpy, you're not guaranteeing that exactly 8 bytes are being copied, and the extra bytes may contain garbage values.

Endianness: The lengths need to be represented as 64-bit little-endian unsigned integers. If your platform uses big-endian byte order or if you're not correctly formatting the lengths, this can lead to inconsistencies.

Solution:

Correct the Updating of curr_pos

Modify the way curr_pos is updated to ensure that it accurately reflects the current position in the mac_data buffer after each operation. Rewritten Code:

cpp

size_t curr_pos = 0;

// Copy AAD memcpy(mac_data + curr_pos, AAD, AAD_len); curr_pos += AAD_len;

// Add padding after AAD if (padding1 > 0) { memset(mac_data + curr_pos, 0x00, padding1); curr_pos += padding1; }

// Copy ciphertext memcpy(mac_data + curr_pos, outputCipher, textLen); curr_pos += textLen;

// Add padding after ciphertext if (padding2 > 0) { memset(mac_data + curr_pos, 0x00, padding2); curr_pos += padding2; }

// Copy AAD_len as a 64-bit little-endian integer uint64_t aad_len_le = (uint64_t)AAD_len; memcpy(mac_data + curr_pos, &aad_len_le, 8); curr_pos += 8;

// Copy textLen as a 64-bit little-endian integer uint64_t text_len_le = (uint64_t)textLen; memcpy(mac_data + curr_pos, &text_len_le, 8); curr_pos += 8;

@manoirx It'll be better if you show a simple test here showing the current wrong output result of the operation/code/function you're talking about and what should be the correct output result and give a minimal reproducible code of that test (so that others can also test it much easier).

You can always make your own pull requests for this fix if you like, any contribution is welcome. And if you do so please add the test described above.

as for the endianess, big endiean is not supported as I explicitly described in the readme file.

@manoirx Upon examining your code, I found that while the solution you provided is not incorrect, but it is unnecessary because there is nothing wrong with the original code. All operations within parentheses are evaluated first before being applied to the outer expression.

TLDR: If you compare the original code and your code you will see that it actually does the same thing but yours is unoptimized due to conditions (if-blocks).

Double Incrementing `curr_pos`:

You mentioned that curr_pos is being incremented twice, leading to skipped positions. However, this is not the case. Each line increments curr_pos once and then uses the updated value as the starting point for the next operation. For example:

curr_pos += AAD_len

increments curr_pos by AAD_len, and then the next line uses this updated value. The increments do not skip over data.

Incorrect Positioning:

You also pointed out that incorrect updates to curr_pos could result in copying the lengths (AAD_len and textLen) into the wrong positions in the mac_data buffer. However, there is no such issue in the original code. The operations are performed in sequence and do not misplace any data.

Padding and Optimization:

The code you provided essentially does the same thing as the original, but it is less optimized due to the unnecessary if blocks you've added. These conditional checks are not required because memset is defined to set a block of memory to a value for a given size n. When n = 0 (i.e., padding size 0), memset performs a no-op (no operation). Thus, there's no need to check if n > 0 before calling memset. You can refer to section 7.21.1 of the C99 standard or section 7.24.1 of the C11 standard for more details.

As for your other issues

In the readme file it was explicitly sated that this library only support the following:

x86_64 or aarch64 architecture

(only 64 bit computer)

little endian system

(only little endian systems)

The others are not supported (yet). Although any contributions to support 32-bit systems and big endian systems are welcome.

Line By Line Comparison:

Let us assume the following value:

// AAD_len = 3
// padding1 = 0
// textLen = 9
// padding2 = 1

Original Code

reminder: All operations within parentheses are evaluated first before being applied to the outer expression

The value of curr_pos on each line is shown in the right side of the line of code as comment.

size_t curr_pos = 0;

memcpy(mac_data, AAD, AAD_len);                                   // curr_pos = 0
memset(mac_data + (curr_pos += AAD_len), 0x00, padding1);         // curr_pos = 3 - this line is a no-op since padding1 is 0
memcpy(mac_data + (curr_pos += padding1), outputCipher, textLen); // curr_pos = 3
memset(mac_data + (curr_pos += textLen), 0x00, padding2);         // curr_pos = 12
memcpy(mac_data + (curr_pos += padding2), &AAD_len, 8);           // curr_pos = 13
memcpy(mac_data + (curr_pos += 8), &textLen, 8);                  // curr_pos = 21

Your Code

size_t curr_pos = 0;

// Copy AAD
memcpy(mac_data + curr_pos, AAD, AAD_len);          // curr_pos = 0
curr_pos += AAD_len;                                // curr_pos = 3

// Add padding after AAD
if (padding1 > 0) {                                 // block skipped due to padding1 = 0
        memset(mac_data + curr_pos, 0x00, padding1);       
        curr_pos += padding1;
}

// Copy ciphertext
memcpy(mac_data + curr_pos, outputCipher, textLen); // curr_pos = 3
curr_pos += textLen;                                // curr_pos = 12

// Add padding after ciphertext
if (padding2 > 0) {
        memset(mac_data + curr_pos, 0x00, padding2);        // curr_pos = 12
        curr_pos += padding2;                               // curr_pos = 13
}

// Copy AAD_len as a 64-bit little-endian integer
uint64_t aad_len_le = (uint64_t)AAD_len;
memcpy(mac_data + curr_pos, &aad_len_le, 8);        // curr_pos = 13
curr_pos += 8;                                      // curr_pos = 21

// Copy textLen as a 64-bit little-endian integer
uint64_t text_len_le = (uint64_t)textLen;
memcpy(mac_data + curr_pos, &text_len_le, 8);       // curr_pos = 21
curr_pos += 8;                                      // add is not needed here

As you can see in this comparison in each line of memsets and memcpys the value of the curr_pos is just the same down the line.

mrdcvlsc / ChaCha20-Poly1305