froydnj opened this issue 4 years ago
Possible in theory: the `digest` crate's `Update` trait supports incremental input:
https://docs.rs/digest/0.9.0/digest/trait.Update.html
"This method can be called repeatedly, e.g. for processing streaming messages."
Sample code: https://rust-lang-nursery.github.io/rust-cookbook/cryptography/hashing.html
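The cookbook example linked above reads a file in chunks and feeds each chunk to the hasher. The same incremental pattern, sketched here in Python with `hashlib` for illustration (the Rust `digest` API works the same way, via repeated calls to `update`):

```python
import hashlib

def streaming_sha256(chunks):
    """Feed data to the hasher incrementally instead of buffering it all."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)  # may be called repeatedly, like digest::Update
    return h.hexdigest()

# Incremental hashing produces the same digest as one-shot hashing.
data = [b"preprocessor ", b"output ", b"in ", b"pieces"]
assert streaming_sha256(data) == hashlib.sha256(b"".join(data)).hexdigest()
```

The chunk boundaries don't affect the result, which is what makes hashing-as-you-go safe to interleave with collecting the output.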
Background on streaming vs. batch processing, and pipes vs. files:
https://thenewstack.io/the-big-data-debate-batch-processing-vs-streaming-processing/
https://stackoverflow.com/questions/1512933/when-should-i-use-gccs-pipe-option
In practice, this only makes sense for large files, say 100 MByte and more.

Small files for comparison:
# 1 MByte
time dd if=/dev/urandom bs=1024 count=$((1024 * 1)) status=none | sha256sum - >/dev/null
real 0m0.015s
# 10 MByte
time dd if=/dev/urandom bs=1024 count=$((1024 * 10)) status=none | sha256sum - >/dev/null
real 0m0.100s
# 100 MByte
time dd if=/dev/urandom bs=1024 count=$((1024 * 100)) status=none | sha256sum - >/dev/null
real 0m0.900s
We currently buffer up the preprocessor output, do a bunch of other stuff, and only then run the output through our hash function. I don't know whether we can actually run the hash function over the output as we collect it, but if we can, it might be somewhat faster, since each chunk would still be warm in cache when it is hashed (?).
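A sketch of the idea, again in Python with hypothetical names (`chunk_source` stands in for however the preprocessor output actually arrives): update the hasher as each chunk is collected, instead of hashing the full buffer afterwards.

```python
import hashlib

def collect_and_hash(chunk_source):
    """Collect preprocessor output while hashing it incrementally.

    chunk_source is a hypothetical iterable of output chunks. The buffer
    is still assembled for later use, but the hash is updated on the fly,
    while each chunk is presumably still in cache.
    """
    hasher = hashlib.sha256()
    buffered = bytearray()
    for chunk in chunk_source:
        buffered.extend(chunk)
        hasher.update(chunk)  # hash the chunk while it is hot in cache
    return bytes(buffered), hasher.hexdigest()

output, digest = collect_and_hash([b"line one\n", b"line two\n"])
```

This changes where the hashing work happens, not how much of it there is, so any win would come from better cache locality and overlap with I/O rather than from doing less hashing.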