RustCrypto / hashes

Collection of cryptographic hash functions written in pure Rust
1.87k stars 251 forks source link

Extend `Digest` trait to support inputting data from a `Read`-able object #22

Closed Wallacoloo closed 7 years ago

Wallacoloo commented 7 years ago

The current Digest trait looks like this:

pub trait Digest: Input + FixedOutput {
    type OutputSize: ArrayLength<u8>;
    type BlockSize: ArrayLength<u8>;
    fn input(&mut self, input: &[u8]);
    fn result(self) -> GenericArray<u8, Self::OutputSize>;
}

A common pattern, at least for me, is to take a hash for an entire file. This can be done something like:

let file = File::open(...).unwrap();
let mut hasher = Sha256::default();

// Read blocks of the file into memory and pass them into the hasher.
let mut buffer: [u8; 512] = Default::default();
loop {
    let bytes_read = file.read(&mut buffer)?;
    hasher.input(&buffer[..bytes_read]);
    if bytes_read != buffer.len() {
        break;
    }
}

let hash = hasher.result();

I think this pattern may be common enough to justify extending the Digest trait with a read_from method, and providing a default implementation that is used across all implementors:

pub trait Digest: Input + FixedOutput {
    fn read_from(&mut self, from: &mut std::io::Read) -> Result<(), std::io::Error> {
        // Read blocks into memory until EOF (or io error), and pass into the `input` function.
    }
}

This would replace the middle 8 lines I wrote in the above example with what in my opinion is a more ergonomic function call, and it generalizes to any type that implements std::io::Read (e.g. sockets or Tcp streams).

Does this seem like a good idea? I'd be happy to implement it if those more invested in RustCrypto than myself approve.

Wallacoloo commented 7 years ago

I guess this method would actually go under the Input trait in the digest crate - not Digest. In any case, I see the Crates.io links are outdated & digest is hosted at https://github.com/RustCrypto/traits now. Should I move the issue there?

newpavlov commented 7 years ago

You are right that such method is nice to have. Initially I thought about implementing Write trait, but I was hesitant to add it, as it would be incompatible with no_std and I didn't want to add std feature to every digest crate. But thanks to your issue it seems I found a good way to do it.

I've added DigestReader trait to the digest crate behind std feature gate. In contrast to your proposal it's an additive feature, so while digest crates do not use std feature your can enable it in Cargo.toml:

[dependencies]
blake2 = "0.5.1"
digest = {version="0.5.1", features=["std"]}

And then use DigestReader in your code:

let mut hash = Blake2b::default();
hash.read_from(&mut file).unwrap();
let result = hash.result();
println!("{:x}\t{}", result, path);

As afterthought I am thinking it's also worth to add a convenience function which will look like this:

pub fn digest_reader<D>(source: &mut io::Read)
    -> io::Result<GenericArray<u8, D::OutputSize>>
    where D: Default + DigestReader + FixedOutput
{
    let mut digest: D = Default::default();
    digest.read_from(source)?;
    Ok(digest.fixed_result())
}

UPD: I've added it in the digest v0.5.2.

Wallacoloo commented 7 years ago

Ok, I'm using the digest_reader function in my own crate now. I don't have any tests for this portion yet, but it's compiling alright!

newpavlov commented 7 years ago

BTW you can write your code shorter without unnecessary copy:

&Some(hash) => {
    let mut file = File::open(path).expect("Failed to open file");
    let result = digest_reader::<Sha256>(&mut file).expect("Failed to read data from file");
    hash == result.as_slice()
}

Also in the next version of digest I will add digest_reader to the trait, so we could use Sha256::digest_reader(..) without turbofish operator. Should've done it initially, but forgot about it for some reason...

Wallacoloo commented 7 years ago

@newpavlov Thanks for the tip. FWIW, this portion of the code is tested now, so the DigestReader trait seems to be working as expected.