lukechampine / blake3

An AVX-512 accelerated implementation of the BLAKE3 cryptographic hash function
MIT License

How to compute the Sum512 of a stream of data without fixed known length #20

Open gianalbertochini opened 5 months ago

gianalbertochini commented 5 months ago

Hello,

I’m curious to know if it’s possible to compute the hash of a data stream without knowing its length in advance, and to do this without storing the entire data in RAM. Ideally, the hash should be updated incrementally.

I'm a beginner, so this may be a simple question. I understand that to compute the hash, the data needs to be divided into chunks of 1024 bytes.

To put it in simpler terms, I want to write a Hash class that has a method void HashBytes(*byte). This method would take an arbitrary number of bytes as input and maintain a “partial hash” in memory, which is updated incrementally every time N bytes arrive from the stream.

Another method, byte[64] Close(void), would return the 512-bit hash as an array of 64 bytes, representing the entire received stream.

for example:

hash = New Hash()
hash.HashBytes(byte[] "This is the first array of byte.")
hash.HashBytes(byte[] " <A VERY LONG STRING>.") // Here can be added even few MiB of data
hash.HashBytes(byte[] "") //Nothing is added
hash.HashBytes(byte[] " This is the second")
hash.HashBytes(byte[] ".") // Just 1 byte is added
byte[64] result0 = hash.close()

byte[64] result1 = Sum512(byte[] "This is the first array of byte. <A VERY LONG STRING>. This is the second.")

result0 should be equal to result1

Is this possible, and if so, how can I do it?

Many thanks

lukechampine commented 5 months ago

Sure, here's how to do that in Go:

h := blake3.New(64, nil) // 512-bit output, no key
h.Write([]byte("This is the first array of byte."))
h.Write([]byte("<A VERY LONG STRING>"))
result := h.Sum(nil)
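To make the "result0 should be equal to result1" requirement concrete, here is a minimal sketch of a wrapper with the HashBytes/Close shape from the question. It is not part of the blake3 package: StreamHash, newHasher, hashChunks and oneShot are hypothetical names, and SHA-512 from the standard library stands in for BLAKE3 so the example runs without extra dependencies. The assumption is that the value returned by blake3.New(64, nil) can be used wherever a hash.Hash is expected, in which case only newHasher and oneShot would need to change.

```go
package main

import (
	"crypto/sha512"
	"encoding/hex"
	"fmt"
	"hash"
)

// StreamHash is a hypothetical wrapper mirroring the HashBytes/Close
// interface from the question. It works with any hash.Hash.
type StreamHash struct {
	h hash.Hash
}

// newHasher returns the stand-in hash (SHA-512, also 64 bytes of output).
// Assumption: with the blake3 package this would be blake3.New(64, nil).
func newHasher() hash.Hash {
	return sha512.New()
}

// HashBytes feeds one more piece of the stream into the running state.
// Passing an empty slice is allowed and leaves the state unchanged.
func (s *StreamHash) HashBytes(p []byte) {
	s.h.Write(p) // the hash.Hash contract says Write never returns an error
}

// Close returns the hex-encoded digest of everything written so far.
func (s *StreamHash) Close() string {
	return hex.EncodeToString(s.h.Sum(nil))
}

// hashChunks pushes the chunks through a StreamHash one at a time.
func hashChunks(h hash.Hash, chunks ...string) string {
	s := &StreamHash{h: h}
	for _, c := range chunks {
		s.HashBytes([]byte(c))
	}
	return s.Close()
}

// oneShot hashes the whole input in a single call, for comparison.
func oneShot(data string) string {
	sum := sha512.Sum512([]byte(data))
	return hex.EncodeToString(sum[:])
}

func main() {
	result0 := hashChunks(newHasher(),
		"This is the first array of byte.",
		" <A VERY LONG STRING>.",
		"", // nothing is added
		" This is the second",
		".", // just 1 byte is added
	)
	result1 := oneShot("This is the first array of byte. <A VERY LONG STRING>. This is the second.")
	fmt.Println(result0 == result1) // prints true
}
```

Only the hash's fixed-size internal state is kept between calls, so memory use does not grow with the length of the stream, and how the input is split into writes does not affect the result.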

In Go, we typically use the io.Reader and io.Writer interfaces when working with lots of data. For example, if your data is stored in a file, you could do this:

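f, err := os.Open("path/to/file")
if err != nil {
    log.Fatal(err) // don't hash a file that failed to open
}
defer f.Close()
h := blake3.New(64, nil)
h.Write([]byte("This is the first array of byte.")) // written before the file contents
io.Copy(h, f) // stream the file contents into the hash
result := h.Sum(nil)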

This works because io.Copy streams data from an io.Reader (in this case, f, the file) to an io.Writer (in this case, h, the hash).
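The same pattern applies to any io.Reader whose length is not known up front, such as a network connection or a pipe. The sketch below illustrates this with error handling included. It uses SHA-512 from the standard library as a stand-in for BLAKE3 so that it runs without extra dependencies, and hashStream, chunkedReader and hashString are hypothetical helper names, not part of any library.

```go
package main

import (
	"crypto/sha512"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// hashStream reads r until EOF and returns the hex-encoded digest.
// io.Copy moves the data through a fixed-size buffer, so the whole
// stream is never held in memory at once.
func hashStream(r io.Reader) (string, error) {
	h := sha512.New() // stand-in; the thread uses blake3.New(64, nil) here
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// chunkedReader simulates a stream that arrives in several pieces.
func chunkedReader(parts ...string) io.Reader {
	readers := make([]io.Reader, 0, len(parts))
	for _, p := range parts {
		readers = append(readers, strings.NewReader(p))
	}
	return io.MultiReader(readers...)
}

// hashString hashes a complete string in one call, for comparison.
func hashString(s string) string {
	sum := sha512.Sum512([]byte(s))
	return hex.EncodeToString(sum[:])
}

func main() {
	stream := chunkedReader(
		"This is the first array of byte.",
		" <A VERY LONG STRING>.",
		" This is the second.",
	)
	streamed, err := hashStream(stream)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	whole := hashString("This is the first array of byte. <A VERY LONG STRING>. This is the second.")
	fmt.Println(streamed == whole) // prints true
}
```

Checking the error from io.Copy matters here: if the read fails partway through, the digest would otherwise silently cover only part of the stream.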

Hope this helps!