Closed manoirx closed 3 weeks ago
@manoirx It'll be better if you show a simple test here showing the current wrong output result of the operation/code/function you're talking about and what should be the correct output result and give a minimal reproducible code of that test (so that others can also test it much easier).
You can always make your own pull requests for this fix if you like, any contribution is welcome. And if you do so please add the test described above.
as for the endianess, big endiean is not supported as I explicitly described in the readme file.
@manoirx Upon examining your code, I found that while the solution you provided is not incorrect, but it is unnecessary because there is nothing wrong with the original code. All operations within parentheses are evaluated first before being applied to the outer expression.
TLDR: If you compare the original code and your code you will see that it actually does the same thing but yours is unoptimized due to conditions (if-blocks).
curr_pos
:You mentioned that curr_pos
is being incremented twice, leading to skipped positions. However, this is not the case. Each line increments curr_pos
once and then uses the updated value as the starting point for the next operation. For example:
curr_pos += AAD_len
increments curr_pos
by AAD_len
, and then the next line uses this updated value. The increments do not skip over data.
You also pointed out that incorrect updates to curr_pos
could result in copying the lengths (AAD_len
and textLen
) into the wrong positions in the mac_data
buffer. However, there is no such issue in the original code. The operations are performed in sequence and do not misplace any data.
The code you provided essentially does the same thing as the original, but it is less optimized due to the unnecessary if
blocks you've added. These conditional checks are not required because memset
is defined to set a block of memory to a value for a given size n
. When n = 0
(i.e., padding size 0), memset
performs a no-op (no operation). Thus, there's no need to check if n > 0
before calling memset
. You can refer to section 7.21.1 of the C99 standard or section 7.24.1 of the C11 standard for more details.
In the readme file it was explicitly sated that this library only support the following:
x86_64 or aarch64 architecture
(only 64 bit computer)
little endian system
(only little endian systems)
The others are not supported (yet). Although any contributions to support 32-bit systems and big endian systems are welcome.
Let us assume the following value:
// AAD_len = 3
// padding1 = 0
// textLen = 9
// padding2 = 1
Original Code
reminder: All operations within parentheses are evaluated first before being applied to the outer expression
The value of curr_pos
on each line is shown in the right side of the line of code as comment.
size_t curr_pos = 0;
memcpy(mac_data, AAD, AAD_len); // curr_pos = 0
memset(mac_data + (curr_pos += AAD_len), 0x00, padding1); // curr_pos = 3 - this line is a no-op since padding1 is 0
memcpy(mac_data + (curr_pos += padding1), outputCipher, textLen); // curr_pos = 3
memset(mac_data + (curr_pos += textLen), 0x00, padding2); // curr_pos = 12
memcpy(mac_data + (curr_pos += padding2), &AAD_len, 8); // curr_pos = 13
memcpy(mac_data + (curr_pos += 8), &textLen, 8); // curr_pos = 21
Your Code
size_t curr_pos = 0;
// Copy AAD
memcpy(mac_data + curr_pos, AAD, AAD_len); // curr_pos = 0
curr_pos += AAD_len; // curr_pos = 3
// Add padding after AAD
if (padding1 > 0) { // block skipped due to padding1 = 0
memset(mac_data + curr_pos, 0x00, padding1);
curr_pos += padding1;
}
// Copy ciphertext
memcpy(mac_data + curr_pos, outputCipher, textLen); // curr_pos = 3
curr_pos += textLen; // curr_pos = 12
// Add padding after ciphertext
if (padding2 > 0) {
memset(mac_data + curr_pos, 0x00, padding2); // curr_pos = 12
curr_pos += padding2; // curr_pos = 13
}
// Copy AAD_len as a 64-bit little-endian integer
uint64_t aad_len_le = (uint64_t)AAD_len;
memcpy(mac_data + curr_pos, &aad_len_le, 8); // curr_pos = 13
curr_pos += 8; // curr_pos = 21
// Copy textLen as a 64-bit little-endian integer
uint64_t text_len_le = (uint64_t)textLen;
memcpy(mac_data + curr_pos, &text_len_le, 8); // curr_pos = 21
curr_pos += 8; // add is not needed here
As you can see in this comparison in each line of memset
s and memcpy
s the value of the curr_pos
is just the same down the line.
Detailed Analysis:
In the ChaCha20_Poly1305::aead_encrypt function, you're constructing the data to be authenticated (MAC data) for Poly1305. The curr_pos variable is used to track the current position in the mac_data buffer. However, there are issues with how curr_pos is being updated, leading to incorrect offsets and possibly including uninitialized data in the MAC computation. Code Snippet:
cpp
size_t curr_pos = 0; memcpy(mac_data, AAD, AAD_len); memset(mac_data + (curr_pos += AAD_len), 0x00, padding1); memcpy(mac_data + (curr_pos += padding1), outputCipher, textLen); memset(mac_data + (curr_pos += textLen), 0x00, padding2); memcpy(mac_data + (curr_pos += padding2), &AAD_len, 8); memcpy(mac_data + (curr_pos += 8), &textLen, 8);
Issues Identified:
Incorrect Handling of Length Fields (AAD_len and textLen)
Size of size_t: The size_t type can vary in size between platforms (e.g., 4 bytes on 32-bit systems and 8 bytes on 64-bit systems). When you copy AAD_len and textLen into the mac_data buffer using memcpy, you're not guaranteeing that exactly 8 bytes are being copied, and the extra bytes may contain garbage values.
Endianness: The lengths need to be represented as 64-bit little-endian unsigned integers. If your platform uses big-endian byte order or if you're not correctly formatting the lengths, this can lead to inconsistencies.
Solution:
Modify the way curr_pos is updated to ensure that it accurately reflects the current position in the mac_data buffer after each operation. Rewritten Code:
cpp
size_t curr_pos = 0;
// Copy AAD memcpy(mac_data + curr_pos, AAD, AAD_len); curr_pos += AAD_len;
// Add padding after AAD if (padding1 > 0) { memset(mac_data + curr_pos, 0x00, padding1); curr_pos += padding1; }
// Copy ciphertext memcpy(mac_data + curr_pos, outputCipher, textLen); curr_pos += textLen;
// Add padding after ciphertext if (padding2 > 0) { memset(mac_data + curr_pos, 0x00, padding2); curr_pos += padding2; }
// Copy AAD_len as a 64-bit little-endian integer uint64_t aad_len_le = (uint64_t)AAD_len; memcpy(mac_data + curr_pos, &aad_len_le, 8); curr_pos += 8;
// Copy textLen as a 64-bit little-endian integer uint64_t text_len_le = (uint64_t)textLen; memcpy(mac_data + curr_pos, &text_len_le, 8); curr_pos += 8;