Why do we need a Merkle root?

lurais commented 1 year ago

What if we just collect the confirmed PoH? Can we then be sure that our transaction is on chain once we finish checking the PoH?

lurais commented 1 year ago

And why do we need the DA (data availability) for a tiny node?

anoushk1234 commented 1 year ago

@lurais I encourage you to read our whitepaper https://www.tinydancer.io/whitepaper

lurais commented 1 year ago

@anoushk1234 I read it, and I am curious why we should use Data Availability Sampling. Isn't a Merkle root alone enough to confirm the state of the transaction?

anoushk1234 commented 1 year ago

> @anoushk1234 I read it, and I am curious why we should use Data Availability Sampling. Isn't a Merkle root alone enough to confirm the state of the transaction?

There is currently no Merkle root for the block state.

lurais commented 1 year ago

@anoushk1234 Thanks. I mean, if we had a Merkle root for the block state, would Data Availability Sampling still be necessary?

dubbelosix commented 1 year ago

@lurais sampling is necessary to enable full nodes to raise fraud proofs. a full node can only raise a fraud proof if it first has the complete block, verifies it, and then sees that it's fraudulent. even if you have a full state commitment, without DAS a light client cannot be sure that a corrupted supermajority has not withheld data from the minority.

lurais commented 1 year ago

@dubbelosix you mean that an honest node can provide a proof that exposes the cheating of the other nodes? And what if the others provide a fake proof even though they don't have the real block data? How can a light node find an honest node?

dubbelosix commented 1 year ago

@lurais light clients form a p2p network on their own as well as with the full nodes. this is necessary for propagating DAS failures and fraud proofs as well as for propagating the samples to the eclipsed full node (since a corrupt supermajority can always isolate a full node and not publish the block to it). if a full node raises a fraud proof and one light client gets it, it gets gossiped to other light clients as well within the dispute time delay (or challenge period).

in a fraud proof system, every client waits out the challenge period before finalizing a block. in this case, all the malicious supermajority does is sign a block containing an invalid state transition and propagate that block to other malicious nodes (but not to the honest nodes). light clients will perform DAS, and if the DAS check fails, they will not finalize the block, so the malicious nodes have no choice but to give them the samples. the light client p2p network then disseminates the samples and ensures that the honest nodes get enough to reconstruct the block. now an honest full node can verify the block and raise the fraud proof. if the fraud proof reaches the light clients within the challenge period, they will not finalize the invalid block, and will halt.

lurais commented 1 year ago

@dubbelosix so the light client just disseminates the samples to the whole network to ensure that an honest full node can reconstruct the block? So, when the light client wants to confirm a transaction, it requests Merkle proofs to confirm that the transaction is contained in a block. And then it uses DAS to ensure that the block data is available, since the honest node has provided the same info?

lurais commented 1 year ago

What's more, the caption "Figure 2: If we want to validate S15 we can use proof {S11,S4,S3} which should compute to S1 if the shred is valid" does not seem to match Figure 2?

dubbelosix commented 1 year ago

@lurais

> the light client just disseminates the samples to the whole network to ensure that an honest full node can reconstruct the block

yes. full nodes send samples to light clients. upon successful DAS, a light client doesn't do anything. but if one of the samples it requested isn't sent, then it propagates a message to its peers, e.g. "I did not get a response for sample 12 from block 94516", in which case other light clients will also request that specific sample (12 in this case).
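The "missing sample" alarm described above can be sketched as a tiny in-memory gossip. This is only an illustration: the `MissingSample` and `LightClient` names, and the idea of recording a re-request in a list, are hypothetical and not taken from tinydancer's code. Real clients would talk over a network and actually re-query full nodes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MissingSample:
    """Hypothetical notice: 'I did not get sample `index` from block `block`'."""
    block: int
    index: int


@dataclass
class LightClient:
    name: str
    peers: list = field(default_factory=list, repr=False)
    seen: set = field(default_factory=set)
    rerequested: list = field(default_factory=list)

    def on_notice(self, notice: MissingSample) -> None:
        if notice in self.seen:
            return  # already handled; this stops the gossip from looping
        self.seen.add(notice)
        # Stand-in for asking full nodes for the same sample again.
        self.rerequested.append(notice)
        for peer in self.peers:
            peer.on_notice(notice)


# Three clients in a line: a <-> b <-> c
a, b, c = LightClient("a"), LightClient("b"), LightClient("c")
a.peers, b.peers, c.peers = [b], [a, c], [b]

# Client a timed out on sample 12 of block 94516 and raises the alarm.
a.on_notice(MissingSample(block=94516, index=12))
```

After the call, every client reachable from `a` has recorded a re-request for that sample, even `c`, which is not directly connected to `a`.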

> So, when the light client wants to confirm a transaction, it requests Merkle proofs to confirm that the transaction is contained in a block

yes. a full node provides the Merkle proof up to the root, which the light client can verify.
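For readers unfamiliar with Merkle proofs, here is a minimal sketch of what "verify the proof up to the root" means. It uses plain SHA-256 over a binary tree and duplicates the last node on odd levels. Those are assumptions made for illustration; this does not reproduce Solana's or tinydancer's actual hashing scheme, and all function names are hypothetical.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every tree level: leaf hashes first, the root level last."""
    level = [sha256(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pair the odd node out with itself
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """What a full node would send: (sibling hash, sibling_is_right) per level."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # odd level: the node was paired with itself
        proof.append((level[sibling], index % 2 == 0))
        index //= 2
    return proof


def verify_proof(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """What a light client does: rehash up the path and compare with the root."""
    node = sha256(leaf)
    for sibling, sibling_is_right in proof:
        node = sha256(node + sibling) if sibling_is_right else sha256(sibling + node)
    return node == root


txs = [b"tx0", b"tx1", b"tx2", b"tx3", b"tx4"]
levels = build_levels(txs)
root = levels[-1][0]
proof_for_tx2 = make_proof(levels, 2)
```

The light client only needs `root` (from a header it already trusts), the transaction bytes, and the short proof; it never downloads the other transactions.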

> And then it uses DAS to ensure that the block data is available, since the honest node has provided the same info

no. DAS is a precondition for this. a light client first gets the header containing the merkle roots. once it does that it verifies consensus on the root. if consensus check passes, it verifies DAS. once DAS passes, it waits for the challenge period to finalize the block. transaction inclusion is only checked against roots that already satisfy consensus + DAS

while it's waiting, it also participates in the p2p network and responds to requests from full nodes and other light nodes. this part is critical, because it is what ensures that honest minorities have the block in case dishonest majorities are withholding it from them.

the purpose of DAS is simply to ensure that honest full nodes have all the data they need in order to raise a fraud proof. if any light client's sampling fails, then it has good reason to suspect that data is being withheld. without DAS, you have no way of knowing whether all honest nodes have the data and are capable of reconstructing the block.
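The ordering of checks described above (consensus, then DAS, then the challenge period, with inclusion checked only against roots that passed the first two) can be sketched as a small decision function. The type and field names are hypothetical and the real client logic is certainly more involved; this only illustrates the order of the gates.

```python
from dataclasses import dataclass
from enum import Enum, auto


class BlockStatus(Enum):
    NOT_FINALIZED = auto()  # consensus or DAS check did not pass
    HALTED = auto()         # a fraud proof arrived
    WAITING = auto()        # checks passed, challenge period still running
    FINALIZED = auto()


@dataclass
class BlockChecks:
    consensus_ok: bool      # the header's root has the required votes
    das_ok: bool            # every requested sample came back and verified
    fraud_proof_seen: bool  # a valid fraud proof was received via gossip
    seconds_waited: float


def block_status(checks: BlockChecks, challenge_period_s: float) -> BlockStatus:
    if checks.fraud_proof_seen:
        return BlockStatus.HALTED
    if not (checks.consensus_ok and checks.das_ok):
        return BlockStatus.NOT_FINALIZED
    if checks.seconds_waited < challenge_period_s:
        return BlockStatus.WAITING
    return BlockStatus.FINALIZED


def may_check_inclusion(checks: BlockChecks) -> bool:
    """Inclusion proofs are only meaningful against roots past consensus + DAS."""
    return checks.consensus_ok and checks.das_ok


status = block_status(
    BlockChecks(consensus_ok=True, das_ok=True, fraud_proof_seen=False, seconds_waited=30),
    challenge_period_s=60,
)
```

Here `status` is `BlockStatus.WAITING`: both checks passed, but only 30 of the 60 seconds the user chose to wait have elapsed.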
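To get a feel for why a handful of samples gives a light client good reason to trust availability, here is a back-of-the-envelope sketch. It assumes samples are drawn uniformly and independently, and that an attacker must withhold at least some fraction of the erasure-coded pieces to make the block unrecoverable. That fraction depends on the coding scheme, so the values used below are illustrative assumptions, not Solana or tinydancer parameters.

```python
import math


def miss_probability(withheld_fraction: float, samples: int) -> float:
    """Chance that every one of `samples` random draws hits an available piece,
    i.e. that the client fails to notice the withholding."""
    return (1.0 - withheld_fraction) ** samples


def samples_needed(withheld_fraction: float, confidence: float) -> int:
    """Smallest sample count whose miss probability is at most 1 - confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - withheld_fraction))


# Illustrative assumption: the attacker has to withhold half of the pieces.
p_miss = miss_probability(withheld_fraction=0.5, samples=10)
n_for_99 = samples_needed(withheld_fraction=0.5, confidence=0.99)
```

The miss probability shrinks geometrically with the number of samples, and many light clients sampling independently make undetected withholding even less likely.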

lurais commented 1 year ago

> the purpose of DAS is simply to ensure that honest full nodes have all the data they need in order to raise a fraud proof

@dubbelosix How long is the challenge period? Is it the same as in optimistic rollups? And is there really a full node in Solana, given that all the transaction data is too large to store on a single node? And do all light clients have to reach consensus on the check result, which is why a light client has to respond to requests from full nodes and other light nodes? Is that right?

dubbelosix commented 1 year ago

> How long is the challenge period? Is it the same as in optimistic rollups?

no, it's subtly different from rollups. for light clients, it's actually up to the user to determine how long they want to wait before finalizing. for a $1 transaction they might decide to wait 1 min, but for a $100k txn they might want to wait a day or so. it can be configurable because, unlike ORs, there's no global state settlement.

> And is there really a full node in Solana, given that all the transaction data is too large to store on a single node?

full nodes always store all the data, so nothing really changes for them. it's too much data for light clients to store, hence sampling (+ preventing the eclipse attack). outside of Solana, a DA layer can scale horizontally as more nodes are added; this cannot happen if every node stores all the data. hence Celestia, EigenDA etc. have nodes in the DA layer storing only samples. but for a full-fledged layer 1 to support light clients, sampling is useful if you require fraud proofs before finalizing blocks.

long story short: the issue is not the size of the blocks but rather not wanting light clients to download full blocks. this is the reason DAS is needed.

> And do all light clients have to reach consensus on the check result

this is a more complicated question. for Solana, and for what tinydancer wants to do, there is no consensus among light clients. they just share information so that anything shady (fraud, data withholding) is propagated throughout the network like an alarm.

dubbelosix commented 1 year ago

@lurais would recommend reading this paper: https://arxiv.org/abs/1809.09044 (whatever tinydancer is doing is more or less a subset of the mechanism described there, with other changes to make it more suitable for Solana)

lurais commented 1 year ago

> determine how long they want to wait before finalizing

@dubbelosix Many users cannot be sure how long they should wait before finalizing, so should we use zero-knowledge proofs instead of a challenge period?

dubbelosix commented 1 year ago

> Many users cannot be sure how long they should wait before finalizing, so should we use zero-knowledge proofs instead of a challenge period?

yeah, you're right. that's a very good point for UX: it'll be complicated if a user has to choose, but certain defaults can be set to make it more seamless. definitely something to consider.

validity proofs would be ideal, but they're pretty complicated and even the closest solutions are still experimental / being tested. even with bpf -> risc0, there are native transactions (system, vote etc.) that need to be proven. it's a complete project by itself :) fraud proving was chosen because it's easier to implement in the shorter term. in the longer term, we would need something like the zkEVM for Solana (like a zkSVM). zkEVM has been worked on for a long time now, but even that's still not fully feature complete.

but it would def be awesome if someone was working on it or trying it out

tinydancer-io / tinydancer

Why do we need a Merkle root? #19