dvc94ch opened 3 years ago
Interesting use case. So the idea is that you're maintaining an outboard tree in the pre-finalized state (so the nodes are laid out in post-order). But from time to time you want to finalize a snapshot of the tree, which can be used to extract slices from the content?
Yes. Well, currently I just clone the encoder, finalize the clone, and write the finalized state into a db. I still haven't implemented saving/restoring the encoder in an unfinalized state, which is what I'd need to survive a restart.
I'm currently working around the problem above by sending signed hashes along with each slice. This increases the protocol overhead, so it would be nice if it weren't necessary: it requires at least a 32-byte hash, a 64-byte signature, and an 8-byte stream identifier.
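For concreteness, here is a minimal sketch of what that per-slice header could look like, assuming a fixed little-endian layout and an Ed25519-style 64-byte signature; the struct and field names are purely illustrative, not part of bao:

```rust
/// Illustrative wire header sent alongside each extracted slice.
/// 8-byte stream id + 32-byte root hash + 64-byte signature = 104 bytes
/// of overhead per update, which is the cost mentioned above.
struct SignedHead {
    stream_id: u64,       // identifies the append-only stream
    root_hash: [u8; 32],  // bao/blake3 root hash of the stream so far
    signature: [u8; 64],  // producer's signature over (stream_id, root_hash)
}

impl SignedHead {
    fn to_bytes(&self) -> [u8; 104] {
        let mut out = [0u8; 104];
        out[..8].copy_from_slice(&self.stream_id.to_le_bytes());
        out[8..40].copy_from_slice(&self.root_hash);
        out[40..].copy_from_slice(&self.signature);
        out
    }
}
```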
Actually this only works if the one writing to the stream is honest. Weird things could happen if two hashes are signed where one isn't later than the other. I guess since the stream is append-only I'd need to ensure that the new hash identifies a superset of the old data. But for now I can assume that the data producer doesn't equivocate on hash updates; in this scenario we're just making sure that we can fetch slices safely from replicas.
Well, verifying signatures is very slow and completely kills throughput, so I'm not very happy with that solution. (After caching already-verified hashes it's not that bad.)
So basically, performant replicated blake streams require two missing features:
- extracted slices should verify correctly against an old "snapshot hash"
- save/restore of the encoder state: currently I rehash the entire stream on startup, which is OK for small streams; maybe something smarter could be done (rough sketch of the current approach below)
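As a rough illustration of that rehash-on-startup workaround, assuming the full stream contents are still on disk and using the bao crate's `outboard` encoding (bao 0.12-style API; the `restore` name is just for this sketch):

```rust
use std::fs;
use std::io;

/// Illustrative startup path: with no way to persist the unfinalized
/// encoder, rebuild the outboard encoding (and root hash) by rehashing
/// the whole stream. Fine for small streams, but O(n) work on every restart.
fn restore(path: &str) -> io::Result<(bao::Hash, Vec<u8>)> {
    let content = fs::read(path)?;
    let (outboard, hash) = bao::encode::outboard(&content);
    Ok((hash, outboard))
}
```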
A status update: the append-only authenticated replicated streams top out at around 300 MB/s locally, dominated by blake3 hashing, and sync on localhost runs at up to 220 MB/s [0].
One thing to be aware of for benchmarking is that Bao isn't currently taking advantage of any of the fancy SIMD optimizations inside the blake3 crate. There's no technical reason it couldn't, but since Bao needs access to the guts of the hash tree, I've currently only exposed a very simple API for doing what it needs to do. I'll probably make the available API richer in the future, and another thing that might happen here is that Bao might switch to a default "chunk group size" that's bigger than 1 chunk (1 KiB), which might allow for some API simplifications.
> Weird things could happen if two hashes are signed where one isn't later than a previous hash.
This sort of thing is tricky to prevent in general. Say you have a 10 MB file, and I send you two disjoint slices of it. If the file hasn't changed at all in between extracting those two slices, then verifying they come from the same file is trivial: they just need to have the same root hash. But if the file might've been appended to in-between, I don't think there's an easy way to verify that a "later" root hash is "properly later", other than just downloading the entire file and looking at the bits yourself.
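To make the "same root hash" case concrete, here's a sketch using bao's slice APIs (argument order as in bao 0.12, which may differ in other versions; the offsets and lengths are arbitrary):

```rust
use std::io::{Cursor, Read};

fn main() -> std::io::Result<()> {
    // One fixed file; both disjoint slices verify against the same root hash.
    let input = vec![0x42u8; 10_000_000];
    let (encoded, hash) = bao::encode::encode(&input);

    for &(start, len) in &[(0u64, 65_536u64), (5_000_000, 65_536)] {
        // The sender extracts a slice from the combined encoding...
        let mut extractor =
            bao::encode::SliceExtractor::new(Cursor::new(&encoded), start, len);
        let mut slice = Vec::new();
        extractor.read_to_end(&mut slice)?;

        // ...and the receiver verifies it against the unchanged root hash.
        let mut decoder =
            bao::decode::SliceDecoder::new(Cursor::new(&slice), &hash, start, len);
        let mut out = Vec::new();
        decoder.read_to_end(&mut out)?;
        assert_eq!(&out[..], &input[start as usize..(start + len) as usize]);
    }
    Ok(())
}
```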
I could imagine defining some sort of very custom protocol where you figure out exactly which subtrees have remained the same, and which ones have grown. But this seems like a relatively large amount of complexity for an extremely specific use case. Could you tell me more about the application you're working on? Maybe there's an easier way to get the guarantees you need?
I'm trying to build a generic abstraction that is useful for a large set of p2p applications. The Dat (now called Hypercore) protocol uses some kind of merkle-tree-based streams, although I haven't looked into it much, as there isn't a good Rust implementation. The other popular approach is content-addressed blocks linked via hashes; IPFS, and blockchains in general, use some form of content-addressed blocks. After having dealt with such blocks and IPFS for a few years, I think they have quite a few downsides and no upsides.
In particular, at Actyx we deal with streams of events, which we turn into blocks using banyan trees so that there is a large amount of branching and we can sync them efficiently. Then these blocks go into a db, the db writes them to a file (basically turning them into a stream again), and the file system turns that back into blocks. This whole process is more complicated than it needs to be and requires things like garbage collection (since the blocks are content-addressed and deduplicated), etc. After claiming for a while that I could build something better and faster, a coworker said I should code it up during my vacation.
One reason to use blocks with a maximum size is to prevent DoS attacks, since someone could otherwise send an endless random stream of data. With bao we can extract slices and fetch them from multiple peers, which was pretty much the only advantage of blocks. (At a block size of, say, 1 MB, I don't think there is much practical deduplication of completely unrelated data.)
> But if the file might've been appended to in-between, I don't think there's an easy way to verify that a "later" root hash is "properly later", other than just downloading the entire file and looking at the bits yourself.
If the file size is known in advance and the originator is trusted, can't appended data be verified like so?
I'm assuming there'd need to be some padding added to each additional file to maintain tree encoding consistency. Not sure if that's handled automatically in this scenario.
> Use hydrated digest state to add a new file
Ah, if you've seen all the data locally, then something like this might work. I think my comments above were assuming that you didn't get to see all the data that was appended, so you can't independently verify the root hash. But it's been a while, so it's a little hard to remember.
> All data can be proven to exist remotely
I'm not sure you can prove that the remote still has the data, if that's important to you. The remote might also be following a similar de/hydrating strategy and only storing intermediate encoder states. But again, maybe I'm not totally following.
@oconnor663 Sorry to dig up an ancient thread, but I did a little experimentation with the approach I outlined. While a proof of replication of all remote data isn't possible, a probabilistic guarantee could be had if the verification task were randomized and distributed over time. Check out my local implementation (I'll add networking when I get the time): https://github.com/FuzzrNet/Forage
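To make the probabilistic-audit idea concrete, here's a rough sketch (not the Forage code): the verifier picks a random offset, asks the remote for a bao slice at that offset, and checks it against the root hash it already holds. The `fetch_slice` transport is hypothetical, and the bao argument order is as in 0.12.

```rust
use std::io::{Cursor, Read};

const AUDIT_LEN: u64 = 65_536; // arbitrary challenge size

/// Hypothetical transport: ask the remote for the bao slice covering
/// `start..start + len` of the content identified by `hash`.
fn fetch_slice(hash: &bao::Hash, start: u64, len: u64) -> Vec<u8> {
    unimplemented!("network layer goes here")
}

/// One audit round: verify a randomly chosen slice against the trusted
/// root hash. Repeated at random offsets over time, this gives probabilistic
/// (not absolute) confidence that the remote still holds the data.
fn audit_once(hash: &bao::Hash, content_len: u64) -> std::io::Result<()> {
    // Assumes content_len > AUDIT_LEN for simplicity.
    let start = rand::random::<u64>() % (content_len - AUDIT_LEN);
    let slice = fetch_slice(hash, start, AUDIT_LEN);
    let mut decoder =
        bao::decode::SliceDecoder::new(Cursor::new(slice), hash, start, AUDIT_LEN);
    let mut out = Vec::new();
    decoder.read_to_end(&mut out)?; // fails if the slice doesn't match the hash
    Ok(())
}
```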
One thing is that I hash each file separately, and I'd actually really like to be able to append data to a single file and update its Bao hash, since that would also make it easier to build a remotely replicable append-only log datastore. This, combined with encryption and BCH ECC, could make for a very dependable way to secure important data for a very long time.
Currently I have an API that looks roughly like this:
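(The trait below is only an illustrative sketch of that shape, pieced together from the append/snapshot/slice workflow described in this thread; the name and exact signatures are assumptions.)

```rust
// Illustrative only: rough shape of the stream API being described.
trait BlakeStream {
    /// Append bytes to the (unfinalized) stream.
    fn append(&mut self, data: &[u8]) -> std::io::Result<()>;

    /// Take a snapshot: finalize a copy of the current tree and return
    /// the root hash that peers can sync against.
    fn snapshot(&mut self) -> std::io::Result<bao::Hash>;

    /// Extract a slice that a peer can verify against `hash`, which must
    /// be a snapshot hash this stream has produced.
    fn extract_slice(&self, hash: &bao::Hash, start: u64, len: u64) -> std::io::Result<Vec<u8>>;
}
```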
Now if the new hash hasn't reached the peer yet and it requests the data with the old hash, the updated outboard-encoded bao will fail to produce a slice that the client can verify. Is this something that could be supported by bao? Alternatively, I can probably design my protocol a little differently.