Closed raychu86 closed 3 months ago
@ghostant-1017 Could you confirm that this addresses your finding - https://github.com/AleoHQ/snarkOS/issues/3226?
@ghostant-1017 Could you confirm that this addresses your finding - #3226?
Of course! I will test it out.
Hey @raychu86 , This PR indeed addresses the issue #3226.
However, during my review process, I identified another potential DOS attack vector. It only requires a minor modification on top of the existing changes, I haven't conducted practical tests yet.
The main logic is as follows:
update_dag
function, we insert the certificate into the subdag.BatchCertificate
can contain a maximum of 200 signatures. This means that attackers can continuously send invalid Blocks(different round far in the future), causing honest validator nodes to insert meaningless BatchCertificates
into memory. Eventually, this could lead to a memory overflow.I can probably confirm the effectiveness of this attack. Please recheck whether this attack is valid.I won't submit another report for this, I hope you consider this potential DOS attack when assessing the severity of this report.
Thanks for flagging this @ghostant-1017!
I think the attack described may be possible, there's one thing to consider: the sync logic only takes blocks from BlockResponse
messages/events. I believe we check (when receiving a BlockResponse
) that we requested the specific block, and then clear this request from the cache (meaning a future BlockResponse
for the same block will not be allowed in).
If my understanding is correct, this suggests we could have an attacker send this block once, but then it no longer works.
Perhaps one thing we could do is have sync_storage_with_block
check at the beginning that:
// Return early if this block has already been processed.
if self.ledger.contains_block_height(block.height()) {
return Ok(());
}
Hey, @howardwu
Maybe the protection is not enough.
Assume the latest_height = 10000
, in this situation, the attacker only generate block at height=10001
and broadcast.
I think it's possible to generate blocks at height=10002
, height=10003
and even higher blocks. And the invalid blocks will always pass the height-contain check.
Thanks for the feedback. Wouldn't it require at least 2f+1
private keys to be compromised in order to create a consecutive exploit? We reference the committee_lookback
to ensure the next subdag is well-formed. I suppose the attacker would need to have compromised a substantial set of the existing validators in order to pull this off. Let me know if you see another angle.
It seems that the self.update_dag
will be called before quorum check.
self.update_dag
will be called at: https://github.com/AleoHQ/snarkOS/pull/3232/files#diff-f82e0c9ecc0d4ff1e73050db6da1dc5909f4f8c5afc98d3effd254a440f079e3L355
quorum_check will be called at: https://github.com/AleoHQ/snarkOS/pull/3232/files#diff-f82e0c9ecc0d4ff1e73050db6da1dc5909f4f8c5afc98d3effd254a440f079e3R425
@howardwu Correct me if I'm wrong.
Yes, the idea is that the check in update_dag and subsequently here both enforce availability or quorum threshold. So wouldn't the attacker need to compromise at least f private keys from the committee to allow for the node to sync the block?
(Sorry if I missed the point)
I think Ghostant is saying we are calling update_dag
before performing any quorum/availability checks. So if malicious nodes send a faulty block with garbage certificates, we add it to storage before performing any quorum checks.
I believe our mitigation for this is in the BlockSync::construct_request
where we handle the dishonest case.
@ghostant-1017 let us know if our understanding of your concern is correct.
@raychu86 Yes, you get me, that's what I mean.
So you mean the BlockSync::construct_request
will handle this case? I will take a look into it.
@raychu86 @howardwu I don't think BlockRequest::construct_request is secure in this scenario. In the honest check within construct_request, the attacker is the only peer with a height greater than all other nodes, it will pass all checks.
Do you agree? I'm not entirely certain about my idea without a PoC. I'd be willing to attempt a PoC to validate it if necessary.
@ghostant-1017 I agree that this could be an issue, and we should have some more mitigating checks to prevent this. However this attack existed prior to this PR, so I'm of the opinion we try to mitigate it in a separate PR after this is merged.
I don't believe it is quite as straightforward to address in this PR, because we need to perform update_dag
and sync_certificate_with_block
in order to even perform the availability threshold checks.
In fact, I believe that this is a more fundamental issue with receiving blocks, as this attack using malicious blocks could be done to clients and validators performing the non-BFT sync. Bitcoin has this issue as well, but i believe they get around this by doing chain-reorgs and rollbacks.
Couple ideas for mitigation:
X
other sync nodes has the block before requesting.Let us know if you agree or have other insights.
@raychu86
Yes! Our views have aligned now.
One more suggestion: Maybe we can check the certificate's round is not too far when update_dag
?
And I'm trying finding a way to simplify the syncing process. I'm not quite sure, but I will share with you if I'm ready.
Motivation
This PR updates the final stage of validator syncing by adding an additional quorum check. Instead of blindly adding the block to ledger, we now check that each block's leader certificate reached quorum based on the certificates in the subsequent block.
This solves 2 problems:
Naive validator sync
X
) without performing any quorum checks.X
through the standard non-sync procedure.X
.The change now skips adding the latest block because we can't guarantee the block meets the quorum. So we will process the block through consensus instead of the sync process.
Malicious validator sync
The malicious sync case is defined in a hackerOne finding: https://github.com/AleoHQ/snarkOS/issues/3226
The new quorum checks should ensure that we are not processing blocks blindly.