Open sachaaaaa opened 5 years ago
Yes, as discussed previously. The title was just to remind us.
Hash is not enough to prove a node holds the whole message. Has to be a range into the data (index to a single byte?).
A salted hash is enough to prove a node holds the whole message.
We might want to survey / reuse / modify distributed database like Apache Cassandra / Scylla for how they maintain data integrity across different nodes.
Amazon DynamoDB and Apache Cassandra use it during the data replication process. These No-SQL distributed databases use Merkle trees to control discrepancies.
https://www.simplilearn.com/tutorials/blockchain-tutorial/merkle-tree-in-blockchain
https://distributeddatastore.blogspot.com/2013/07/cassandra-using-merkle-trees-to-detect.html
See also https://dbdb.io/ https://15721.courses.cs.cmu.edu/spring2023/project2.html
The subtle difference is, most distributed database assume replicate storage nodes are honest, (they might fail but they don't cheat), while in Oxen network we need to assume replicate storage nodes might cheat.
To prevent cheating, we could consider modifing "Merkle Tree" into a "Salted Merkle Tree", where the verifier specifies a token range and generates a random nonce to challenge the testee, the testee needs to append the nonce in all messages (or data partitions) in the specific token range, calculates the merkle tree and returns it to the verifier.
There are some tricky potential race conditions hard to handle though.
Hash is not enough to prove a node holds the whole message. Has to be a range into the data (index to a single byte?).