autonomys / subspace

Subspace Network reference implementation
https://subspace.network
378 stars 243 forks source link

[Critical] Attackers may craft invalid domain bundle/execution receipt with excessive fraud proof size, making it impossible to submit a fraud proof on consensus chain #2425

Open jakoblell opened 9 months ago

jakoblell commented 9 months ago

Issue description

When a bundle/execution receipt with an invalid execution result is constructed, someone will have to generate a fraud proof for it and submit it by dispatching a submit_fraud_proof call on the consensus chain. For the case of an invalid extrinsic execution result, the fraud proof will be of type InvalidStateTransitionProof and this fraud proof will contain one or two storage proofs (InvalidStateTransitionProof.proof and InvalidStateTransitionProof.execution_phase.ApplyExtrinsic.extrinsic_proof).

If the failing extrinsic is doing a lot of storage access, this will lead to large storage proofs, possibly to an extent where the fraud proof exceeds the consensus chain MAX_BLOCK_LENGTH = 5 MiB. If this is the case, honest domain operators may know that the block is invalid but they can never successfully dispatch a submit_fraud_proof for it.

In order to limit the amount of storage changes, the 2D weight system should be used with an appropriate limit for the proof_size weight in order to ensure that it is always possible to submit a fraud proof when needed. As of now the MAXIMUM_BLOCK_WEIGHT is configured to Weight::from_parts(u64::MAX, u64::MAX), so there is effectively no limit at all.

Risk

If an invalid bundle/execution receipt cannot be discarded via a fraud proof due to this, it will eventually be considered final. If the storage root after executing the invalid bundle contains outgoing transfer messages minting SSC out of thin air, these transfers will be accepted by the consensus chain or sibling domains, effectively allowing an attacker to mint arbitrary amounts of SSC.

Mitigation suggestion

Use an appropriate limit for the proof_size weight (with some safety margin to the consensus chain block length) to ensure that it is always possible to submit a fraud proof for an invalid bundle. Extra care needs to be taken to ensure that no calls can lead to a proof size larger than the predicted value from the weight calculation since this may directly lead to a situation where an invalid execution receipt is accepted since nobody can submit a fraud proof for it.

nazar-pc commented 9 months ago

I think there might be an issue for this already and there was some progress done on this, which wasn't quite finished

vedhavyas commented 9 months ago

This issue tracks the same case described here - https://github.com/subspace/subspace/issues/1596

We partially solved it for using frontiers POV size ratio. Corresponding code is here - https://github.com/subspace/subspace/blob/main/domains/runtime/evm/src/lib.rs This ensures on the EVM side the gas limit is split between storage access and compute. Currently set to 1/4 of the gas limit. On the substrate end, since team writes the pallets, we de-prioritzed this until we have add benchmarks to each pallet used and come back if we there would be a scenario where this could happen.

Having said that, I'm curious if there is a scenario you have come up with

jakoblell commented 8 months ago

Can provide an update on the domain block weight limits? As of now we still have MAXIMUM_BLOCK_WEIGHT: Weight = Weight::from_parts(u64::MAX, u64::MAX) and without an actual limit it can't be guaranteed that a fraud proof will fit into the conensus chain block size limit.

vedhavyas commented 8 months ago

@jakoblell We are aware of this and tracked here #2387. We will prioritse this change soon and post a follow-up pr with proper extrinsics limits. Thank you!

dariolina commented 5 months ago

@vedhavyas is the issue not fixed between #2568 and #2651 by setting up bundle weight limits that in turn limit extrinsics?

vedhavyas commented 5 months ago

@jakoblell For substrate based chains, we ensure the total weight of all transactions in a given Bundle are at most BundleWeightLimit. More here - https://github.com/subspace/subspace/blob/main/domains/client/domain-operator/src/domain_bundle_proposer.rs#L238. This should not allow FP storage proofs to blow up since the all the reads are accounted in individual tx_weight.

One thing note here is that if the txn weights are outdated or wrong, this would be a problem since actual execution weight could be completely different. This may not be a problem for Runtimes we provide like EVM or AutoID but poses a bigger issue if and when we allow custom domain runtimes where they could completely give wrong weight for a given transaction. In such case, this issue will come back where honest operator wont be able to submit a FP.

Ideal solution would be to get the actual execution weight for a given transaction and then use it in BundleWeightLimit. But getting such a would require us to execute the txn, collect the actual weight. This would be in-efficient but necessary unfortunately.

Use an appropriate limit for the proof_size weight (with some safety margin to the consensus chain block length) to ensure that it is always possible to submit a fraud proof for an invalid bundle.

This might not be necessary if we get the actual execution weight but simply limiting the proof size will not solve the issue since proof sizes can go over anyway and fraud proof will not submitted due to size limits

jakoblell commented 5 months ago

https://github.com/subspace/subspace/issues/2425#issuecomment-2111948667

@jakoblell For substrate based chains, we ensure the total weight of all transactions in a given Bundle are at most BundleWeightLimit. More here - https://github.com/subspace/subspace/blob/main/domains/client/domain-operator/src/domain_bundle_proposer.rs#L238.

To me it looks like this is just in the domain block/bundle production code used by honest domain operators - but this weight limit isn't actually enforced on chain. A malicious domain operator can trivially patch out this check and submit an invalid and overweight domain block. This can lead to a situation where a fraud proof cannot be submitted due to the size of the required storage proof.

In order to prevent that, the domain runtime should directly reject blocks exceeding a reasonable weight limit (including a proof size weight). Enforcing a block weight limit on a substrate-based chain is typically implemented via frame_system::CheckWeight based on the BlockWeights configuration, which is currently still configured to u64::MAX:

Use an appropriate limit for the proof_size weight (with some safety margin to the consensus chain block length) to ensure that it is always possible to submit a fraud proof for an invalid bundle.

This might not be necessary if we get the actual execution weight but simply limiting the proof size will not solve the issue since proof sizes can go over anyway and fraud proof will not submitted due to size limits

Well there may be some gaps/inaccuracies in the proof size weighting but in theory it should be sufficient to limit the required size for the storage proof required in a fraud proof. In practice there should be some safety factor between the allowable domain block proof size weight and the consensus chain block length limit so that a fraud proof will always fit into the limit.

vedhavyas commented 5 months ago

To me it looks like this is just in the domain block/bundle production code used by honest domain operators - but this weight limit isn't actually enforced on chain. A malicious domain operator can trivially patch out this check and submit an invalid and overweight domain block. This can lead to a situation where a fraud proof cannot be submitted due to the size of the required storage proof.

Correct. This should be handled with this Fraud proof - https://github.com/subspace/subspace/pull/2768/files

In order to prevent that, the domain runtime should directly reject blocks exceeding a reasonable weight limit (including a proof size weight). Enforcing a block weight limit on a substrate-based chain is typically implemented via frame_system::CheckWeight based on the BlockWeights configuration, which is currently still configured to u64::MAX

We have set the Max limit for the Domain block because we have seen ExhaustResources during the Domain block import. Ideally, we want to ensure all the domain transactions in Consensus bundles should be executed on Domain. We are okay if the domain block import is slower but nonetheless executed. In order to avoid the excessive size coming from each extrinsics, we have introduced the POV size for EVM specifically and then we introduced MAX_BUNDLE_WEIGHT on consensus chain. One regression introduced with MAX domain Block limit is max extrinsic weight. We have an issue to fix the max_extrinsic_weight on domains while keeping the MAX domain block weight.

Once this above issue is fixed, there is reduced weight for each extrinsic and operators use that to calculate Domain Bundle weight. If the malicious operator do skip it, fraud proof will come into picture to slash that specific operator.

Let me know if that make sense @jakoblell

vedhavyas commented 5 months ago

@jakoblell I have a PR up here - https://github.com/subspace/subspace/pull/2801 That should fix the limits for proof size for domains