eddelbuettel / digest

R package to create compact hash digests of R objects
https://eddelbuettel.github.io/digest
111 stars 47 forks source link

AES CBC mode: add support for padding #185

Closed daattali closed 1 year ago

daattali commented 1 year ago

When encrypting or decrypting in CBC mode, the input must have a length of a multiple of 16. This is quite restrictive. Many online tools and other libraries provide an option to pad, such as PKCS. It would be useful if AES provided padding as well.

eddelbuettel commented 1 year ago

Sure. Do you feel like you could propose a PR doing so for AES?

daattali commented 1 year ago

I can probably write that , but I'm a noob when it comes to encryption, I didn't know what AES or padding was until a few hours ago. Are there other padding methods, or is the one I mentioned the main one? Do you want that be done automatically if the input length is not a multiple of 16?

eddelbuettel commented 1 year ago

AES was a contributed. I personally do not use it much.

Don't worry too much about encryption as the digest package goes out of its way stating its aim is hashing.

daattali commented 1 year ago

Actually I won't be adding this, because I'm not even sure the CBC encryption is working correctly.

I'm getting different results with this package vs {openssl} or online tools. All other tools are in agreement, but digest::AES seems to cut the result too short. Example:

msg <- charToRaw("0123456789012345")
key <- charToRaw("ABCDEFGHIJKLMNOPQRSTUV==")
iv <- charToRaw("61e605afd0fd15f3")

print(as.raw(openssl::aes_cbc_encrypt(data = msg, key = key, iv = iv)))

print(digest::AES(mode = "CBC", key = key, IV = iv)$encrypt(msg))

Output:

[1] dc 5c d2 89 5d 4d c2 26 2c 5a 2d 58 42 a0 38 29 08 f0 5e c0 37 c3 27 54 50 a9 48 b9 e9 1d 83 c8
[1] dc 5c d2 89 5d 4d c2 26 2c 5a 2d 58 42 a0 38 29

I tried other messages and the issue still happens. Perhaps I'm not using the function correctly, but I'm just going to use {openssl} even though it was initially my second choice because this package is so light on dependencies

eddelbuettel commented 1 year ago

Whatever works for you. I am fairly certain that what we claim are reference values are in fact reference values. Eg from the start of the manual page and unit test we have (for FIPS 197, the first one covered)

plaintext       <- hextextToRaw("00112233445566778899aabbccddeeff")

aes128key       <- hextextToRaw("000102030405060708090a0b0c0d0e0f")
aes128output    <- hextextToRaw("69c4e0d86a7b0430d8cdb78070b4c55a")

and in the NIST publication https://nvlpubs.nist.gov/nistpubs/fips/nist.fips.197.pdf you see under Section C.1 as example for 'AEX-128' the same plaintext and key on the bottom of page 40 and then as 'rount[10].output' the output vector we compare against. I am pretty sure it is the same for CBC mode so I invite you to check the manual page, the test setup and the corresponding reference.

daattali commented 1 year ago

I do see that this example, which uses "ECB" mode, seems to be correct. But from my tinkering with this function, I can't get "CBC" mode to give me the same answer that I'm expecting, that several other tools are showing. Here's another example where I try to simplify as much as possible by having the key, iv, and text all the same:

text <- "0123456789ABCDEF"
rawtext <- charToRaw(text)
(digest::AES(rawtext, "CBC", rawtext))$encrypt(rawtext)

This package outputs "9d 2c da 90 1b 68 2d 33 59 70 9a 5a b2 41 96 24"

For some reason when I try the same using {openssl} or https://www.devglan.com/online-tools/aes-encryption-decryption or https://www.javainuse.com/aesgenerator I get a vector twice as long: "9d 2c da 90 1b 68 2d 33 59 70 9a 5a b2 41 96 24 21 25 33 3b df 54 01 32 17 9a c0 de 79 e1 83 7a"

From what I can tell this is a bug. But because I don't have any experience with encryption and 0 domain knowledge, I don't want to falsely make an accusation, so I'll assume that perhaps I'm just doing something wrong. I don't have the expertise to dive deep into this and figure out if it's really a bug or not. So I'll just leave it at this.

eddelbuettel commented 1 year ago

If openssl does what you need, you should certainly use it. The AES part was contributed. and it is not something I am deeply familiar with. Maybe the output needs to be longer -- I would not know. Note, though for CBC too use some reference input and output. That passes the bar for me. Anyway....

daattali commented 1 year ago

I will be using openssl, but for completeness, I had to figure it out... https://github.com/eddelbuettel/digest/pull/186