DaGenix / rust-crypto

A (mostly) pure-Rust implementation of various cryptographic algorithms.
Apache License 2.0

How to encrypt / decrypt big string using Blowfish ECB? #393

Closed danielrs closed 8 years ago

danielrs commented 8 years ago

I need to encrypt / decrypt a String of unknown size using Blowfish. However, the only functions provided in the module are encrypt_block and decrypt_block, which only work on &[u8] slices of 8 bytes.

I'm guessing there's an interface for using all symmetric block ciphers? I'm having trouble finding examples.

adam-frisby commented 8 years ago

Convert your string to bytes?

https://doc.rust-lang.org/std/string/struct.String.html#method.as_bytes

danielrs commented 8 years ago

@pix64 I apologize if I didn't explain myself correctly. What I mean is that Blowfish's encrypt_block and decrypt_block only work on blocks of 8 bytes. If my String is more than 8 bytes long, then those methods won't work, as they wouldn't be able to encrypt the whole thing.

adam-frisby commented 8 years ago

You have to split your string's bytes into chunks of 8 bytes. Slices have a .chunks(n) iterator which will give you slices of at most n bytes. Then join the outputs of encrypt_block on each chunk. Since each block has to be exactly 8 bytes, you may need to pad the last chunk.

Since you're encrypting a UTF-8 string, I would assume you would be able to pad with zeros (null characters). When you decrypt the ciphertext, you will know the String ends at the first null byte.
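
The chunk-and-pad approach described above can be sketched roughly as follows. This is only an illustration: `encrypt_chunks` and `toy_block` are hypothetical names, and `toy_block` is a placeholder XOR that stands in for the real cipher call so the sketch has no external dependencies. It is not secure.

```rust
/// Blowfish operates on 8-byte blocks.
const BLOCK_SIZE: usize = 8;

/// Zero-pads `input` to a multiple of `BLOCK_SIZE`, passes each block to
/// `encrypt_block` (a stand-in for the real cipher call), and joins the
/// resulting blocks into one buffer.
fn encrypt_chunks<F>(input: &[u8], mut encrypt_block: F) -> Vec<u8>
where
    F: FnMut(&[u8], &mut [u8]),
{
    // Copy the input and pad the last chunk with zero bytes.
    let padded_len = (input.len() + BLOCK_SIZE - 1) / BLOCK_SIZE * BLOCK_SIZE;
    let mut padded = input.to_vec();
    padded.resize(padded_len, 0);

    // Process one block at a time and append each output block.
    let mut output = Vec::with_capacity(padded_len);
    for chunk in padded.chunks(BLOCK_SIZE) {
        let mut block = [0u8; BLOCK_SIZE];
        encrypt_block(chunk, &mut block);
        output.extend_from_slice(&block);
    }
    output
}

/// Hypothetical placeholder "cipher": XORs every byte with 0x5a.
/// It only exists to make the sketch runnable; it provides no security.
fn toy_block(src: &[u8], dst: &mut [u8]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d = *s ^ 0x5a;
    }
}
```

With rust-crypto, the closure would presumably wrap the real call, along the lines of `|src, dst| blowfish.encrypt_block(src, dst)`. Note that zero padding is only unambiguous if the plaintext itself never contains a null byte, which UTF-8 does not guarantee.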

danielrs commented 8 years ago

Just got encryption working:

//! Encryption and Decryption using Blowfish with ECB mode.

use crypto::blowfish::Blowfish;
use crypto::symmetriccipher::{BlockEncryptor, BlockDecryptor};

pub fn encrypt(key: &String, input: &String) -> String {
    let blowfish = Blowfish::new(key.as_bytes());
    let block_size = <Blowfish as BlockEncryptor>::block_size(&blowfish);

    // Input and output bytes
    let input_len = round_len(input.len(), block_size);
    let mut input = input.as_bytes().to_vec();
    input.resize(input_len, 0);

    let mut output : Vec<u8> = Vec::with_capacity(input_len);
    unsafe { output.set_len(input_len); }

    // Encrypts input and saves it into output
    for (ichunk, mut ochunk) in input.chunks(block_size).zip(output.chunks_mut(block_size)) {
        blowfish.encrypt_block(&ichunk, &mut ochunk);
    }

    // Generates hex representation of output
    let mut hex_output = String::with_capacity(output.len() * 2);
    for i in 0..output.len() {
        hex_output.push_str(&format!("{:02x}", output[i]));
    }

    hex_output
}

/// Rounds the given len so that it contains blocks
/// of the same size.
fn round_len(len: usize, block_size: usize) -> usize {
    let remainder = len % block_size;
    if remainder == 0 {
        len
    }
    else {
        len + block_size - remainder
    }
}

I will try to implement decryption in a while. Just to confirm before closing: rust-crypto doesn't do this kind of encryption already, right? I feel like I might be reinventing the wheel.

adam-frisby commented 8 years ago

I don't have intimate knowledge of rust-crypto, but it seems to focus on crypto primitives. Another library should probably be built on top of it to provide functions for common-case crypto operations (encrypting Strings, etc.).

danielrs commented 8 years ago

Here's the implementation I came up with, for anyone interested:

//! Encryption and Decryption using Blowfish with ECB mode.

use std::ffi::OsString;

use crypto::blowfish::Blowfish;
use crypto::symmetriccipher::{BlockEncryptor, BlockDecryptor};

const PADDING_BYTE: u8 = 2;

/// Returns the encrypted input using the given key.
pub fn encrypt(key: &str, input: &str) -> String {
    let bytes = cipher_with(key.as_bytes(), input.as_bytes(), |blowfish, from, mut to| {
        blowfish.encrypt_block(&from, &mut to);
    });

    // Generate hexadecimal representation of `bytes`.
    let mut output = String::with_capacity(bytes.len() * 2);
    for b in bytes {
        output.push_str(&format!("{:02x}", b));
    }
    output
}

/// Returns the decrypted input using the given key.
pub fn decrypt(key: &str, hex_input: &str) -> OsString {
    use std::u8;
    use std::str;
    use std::ffi::OsStr;
    use std::os::unix::ffi::OsStrExt;

    let mut input = Vec::with_capacity(hex_input.len());
    for chunk in hex_input.as_bytes().chunks(2) {
        // We already now that the chunk is utf-8 compilant as it comes
        // from a &str.
        let fragment = unsafe { str::from_utf8_unchecked(chunk) };
        let byte = u8::from_str_radix(fragment, 16).unwrap_or(0);
        input.push(byte);
    }

    let mut bytes = cipher_with(key.as_bytes(), &input, |blowfish, from, mut to| {
        blowfish.decrypt_block(&from, &mut to);
    });
    if let Some(index) = bytes.iter().position(|&b| b == PADDING_BYTE) {
        // Drop the first PADDING_BYTE and everything after it.
        bytes.truncate(index);
    }

    OsStr::from_bytes(&bytes).to_owned()
}

/// Divides the input into blocks and processes each block with the given closure.
fn cipher_with<F>(key: &[u8], input: &[u8], mut func: F) -> Vec<u8>
    where F: FnMut(Blowfish, &mut [u8], &mut [u8]) {

    let blowfish = Blowfish::new(key);
    let block_size = <Blowfish as BlockEncryptor>::block_size(&blowfish);

    // Input and output bytes
    let input_len = round_len(input.len(), block_size);
    let mut input = input.to_vec();
    input.resize(input_len, PADDING_BYTE);

    let mut output : Vec<u8> = Vec::with_capacity(input_len);
    unsafe { output.set_len(input_len); }

    // Runs `func` on each input block, writing the result into the matching output block
    for (mut ichunk, mut ochunk) in input.chunks_mut(block_size).zip(output.chunks_mut(block_size)) {
        func(blowfish, ichunk, ochunk);
    }

    output
}

/// Rounds the given len so that it contains blocks
/// of the same size.
fn round_len(len: usize, block_size: usize) -> usize {
    let remainder = len % block_size;
    if remainder == 0 {
        len
    }
    else {
        len + block_size - remainder
    }
}

#[cfg(test)]
mod tests {
    use super::{encrypt, decrypt};
    use std::ffi::OsString;

    struct Test {
        key: String,
        plain_text: String,
        cipher_text: String,
    }

    fn get_test_vector() -> Vec<Test> {
        vec![
            Test {
                key: "R=U!LH$O2B#".to_owned(),
                plain_text: "è.<Ú1477631903".to_owned(),
                cipher_text: "4a6b45612b018614c92c50dc73462bbd".to_owned(),
            },
        ]
    }

    #[test]
    fn encrypt_test_vector() {
        for test in get_test_vector() {
            let cipher_text = encrypt(&test.key, &test.plain_text);
            assert_eq!(test.cipher_text, cipher_text);
        }
    }
}

wchargin commented 3 years ago

Passersby thinking of copy-pasting these code snippets should be aware of some safety issues:

    let mut output : Vec<u8> = Vec::with_capacity(input_len);
    unsafe { output.set_len(input_len); }

This violates the safety contract of set_len, which specifies that the elements at old_len..new_len must be initialized. Just because your code is not directly reading them does not make this safe. You may want to read docs for MaybeUninit or @ralfjung’s posts.
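
As a rough sketch of a safe alternative, the output buffer can be allocated already zero-initialized and then overwritten block by block. The function names below are hypothetical, and a plain copy stands in for the cipher call so the example has no dependencies.

```rust
/// Allocates an output buffer of `len` bytes with every element set to 0,
/// so all elements are initialized before anything touches them.
fn zeroed_output(len: usize) -> Vec<u8> {
    vec![0u8; len]
}

/// Fills the output block by block, mirroring the zip-over-chunks loop in the
/// snippets above. A plain copy stands in for `encrypt_block` here.
/// `block_size` must be non-zero, because `chunks` panics on 0.
fn copy_blocks(input: &[u8], block_size: usize) -> Vec<u8> {
    let mut output = zeroed_output(input.len());
    for (src, dst) in input.chunks(block_size).zip(output.chunks_mut(block_size)) {
        // Both buffers have the same total length, so paired chunks match in size.
        dst.copy_from_slice(src);
    }
    output
}
```

No `unsafe` block is needed in this version, and the loop body is otherwise the same shape as the original.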

        for chunk in hex_input.as_bytes().chunks(2) {
            // We already now that the chunk is utf-8 compilant as it comes
            // from a &str.
            let fragment = unsafe { str::from_utf8_unchecked(chunk) };

This is not accurate, because not all substrings of valid UTF-8 bytes are themselves valid UTF-8. This code could be chopping a code point in half: e.g., on input "aéi", it violates the safety contract of from_utf8_unchecked. This may be safe if hex_input really is only ASCII, as would be the case if it were the output of encrypt(...). But this is an argument to a pub fn, so no such assumptions are safe, and, in any case, presumably the point is to decrypt data given by the user.
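
One way to decode the hex input without `unsafe` is to work on the bytes directly and reject anything that is not a hex digit. This is a minimal sketch; `decode_hex` is a hypothetical helper, not part of rust-crypto, and returning `None` on bad input is a design assumption (the original code substitutes 0 instead).

```rust
/// Decodes a hex string into bytes.
/// Returns `None` if the length is odd or any byte is not an ASCII hex digit.
fn decode_hex(hex: &str) -> Option<Vec<u8>> {
    let bytes = hex.as_bytes();
    if bytes.len() % 2 != 0 {
        return None;
    }
    bytes
        .chunks(2)
        .map(|pair| {
            // `to_digit(16)` yields None for anything outside 0-9, a-f, A-F,
            // including bytes that belong to a multi-byte UTF-8 sequence.
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        })
        .collect()
}
```

Because the function never builds a `&str` from a partial byte range, an input such as "aéi" is simply rejected instead of triggering undefined behavior.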

        unsafe { str::from_utf8_unchecked(&bytes).to_owned() }

I might be missing something, but I don’t see how this could possibly be safe. We’ve just finished decrypting some arbitrary user-provided ciphertext (an argument to a pub fn) into bytes, so clearly it could contain any arbitrary byte-sequence (of appropriate length), not just valid UTF-8. For example, suppose that hex_input == "2b10455c044c2212" and key == "testtest"; then bytes == vec![0x99, 0x99].
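
A checked conversion avoids the problem by reporting invalid UTF-8 instead of assuming it away. The sketch below uses the standard library's `String::from_utf8`; the wrapper name `bytes_to_string` is hypothetical.

```rust
use std::string::FromUtf8Error;

/// Converts decrypted bytes into a `String`, returning an error when the
/// bytes are not valid UTF-8 (for example after decrypting with a wrong key).
fn bytes_to_string(bytes: Vec<u8>) -> Result<String, FromUtf8Error> {
    String::from_utf8(bytes)
}
```

With the example above, `bytes_to_string(vec![0x99, 0x99])` yields an `Err`, which the caller can handle however it sees fit.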

(There are also two correctness bugs with the padding handling, but they don’t cause soundness violations, as far as I can tell.)
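
The comment does not spell out which bugs it means. One commonly used scheme that makes padding unambiguous is PKCS#7-style padding, where the value of every padding byte equals the number of padding bytes added. The sketch below is a swapped-in technique, not what the snippets above do, and the function names are hypothetical.

```rust
/// Appends between 1 and `block_size` padding bytes, each holding the pad length.
/// A full extra block is added when `data` is already block-aligned.
fn pkcs7_pad(data: &[u8], block_size: usize) -> Vec<u8> {
    assert!(block_size > 0 && block_size < 256);
    let pad_len = block_size - data.len() % block_size;
    let mut out = data.to_vec();
    out.resize(data.len() + pad_len, pad_len as u8);
    out
}

/// Strips the padding, returning `None` if it is malformed.
fn pkcs7_unpad(data: &[u8], block_size: usize) -> Option<&[u8]> {
    let pad_len = *data.last()? as usize;
    if pad_len == 0
        || pad_len > block_size
        || pad_len > data.len()
        || data.len() % block_size != 0
    {
        return None;
    }
    let (body, padding) = data.split_at(data.len() - pad_len);
    // Every padding byte must carry the same value as the last one.
    if padding.iter().all(|&b| b as usize == pad_len) {
        Some(body)
    } else {
        None
    }
}
```

Unlike a fixed padding byte, this round-trips plaintext that happens to contain the padding value itself, such as the bytes `[1, 2, 3]`.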


The intended moral of this comment is to please be careful with unsafe code, especially when posting it for public consumption, and only introduce unsafe code when you can prove your code correct and demonstrate a measurable performance improvement. There are safe and fast ways to write this code: e.g., it’s easy for the kernel to give you zero-initialized pages, so vec![0; input_len] is probably fast enough. And if you really need to push it faster, you can use tools like MaybeUninit to do so more safely.