mozilla / sccache

Sccache is a ccache-like tool. It is used as a compiler wrapper and avoids compilation when possible. Sccache has the capability to utilize caching in remote storage environments, including various cloud storage options, or alternatively, in local storage.
Apache License 2.0

consider running the hash computation on preprocessor output as it's produced #788

Open froydnj opened 4 years ago

froydnj commented 4 years ago

We currently buffer up the preprocessor output, do a bunch of other stuff, and only then do we run the output through our hash function. I don't know if we can actually run the hash function on the output as we collect it, but if we can, it might be somewhat faster because the data is already in cache somewhere (?).

milahu commented 2 years ago

possible in theory

https://docs.rs/digest/0.9.0/digest/trait.Update.html

the docs for `Update::update` say: "This method can be called repeatedly, e.g. for processing streaming messages."

sample code https://rust-lang-nursery.github.io/rust-cookbook/cryptography/hashing.html
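To make the idea concrete, here is a minimal sketch of hashing output chunk by chunk while it is still being collected. It is not sccache's code: `Fnv1a` and `collect_and_hash` are hypothetical names, and the 64-bit FNV-1a hash is only a dependency-free stand-in for whatever digest sccache really uses. With the `digest` crate, the `hasher.update(...)` call would presumably be the `Update::update` method linked above.

```rust
use std::io::{self, ErrorKind, Read};

/// Stand-in streaming hash (64-bit FNV-1a), chosen only so the sketch has no
/// external dependencies. It is NOT suitable as a real cache key.
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf2_9ce4_8422_2325) // FNV-1a 64-bit offset basis
    }

    /// Feed another chunk into the hash; can be called repeatedly.
    fn update(&mut self, chunk: &[u8]) {
        for &byte in chunk {
            self.0 ^= u64::from(byte);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        }
    }

    fn finalize(&self) -> u64 {
        self.0
    }
}

/// Read `source` (e.g. a preprocessor's stdout) to the end, hashing each chunk
/// as it arrives while also keeping a copy of the bytes.
/// Returns (collected output, hash of that output).
fn collect_and_hash<R: Read>(mut source: R) -> io::Result<(Vec<u8>, u64)> {
    let mut hasher = Fnv1a::new();
    let mut collected = Vec::new();
    let mut chunk = [0u8; 8192];
    loop {
        let n = match source.read(&mut chunk) {
            Ok(0) => break, // end of stream
            Ok(n) => n,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Hash the chunk right away, while it is likely still in CPU cache.
        hasher.update(&chunk[..n]);
        collected.extend_from_slice(&chunk[..n]);
    }
    Ok((collected, hasher.finalize()))
}
```

Calling `collect_and_hash(child_stdout)` gives the same hash as buffering everything first and hashing afterwards, so the buffer is still available for the later steps. Whether this is actually faster would need measuring.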

streaming vs batching, pipes vs files ...
https://thenewstack.io/the-big-data-debate-batch-processing-vs-streaming-processing/
https://stackoverflow.com/questions/1512933/when-should-i-use-gccs-pipe-option

in practice, this only makes sense for large files, say 100 MByte and more

small files (each timing covers the whole `dd | sha256sum` pipeline):

# 1 MByte
time dd if=/dev/urandom bs=1024 count=$((1024 * 1)) status=none | sha256sum - >/dev/null 
real    0m0.015s

# 10 MByte
time dd if=/dev/urandom bs=1024 count=$((1024 * 10)) status=none | sha256sum - >/dev/null 
real    0m0.100s

# 100 MByte
time dd if=/dev/urandom bs=1024 count=$((1024 * 100)) status=none | sha256sum - >/dev/null 
real    0m0.900s