Open c4-submissions opened 1 year ago
bytes032 marked the issue as primary issue
141345 marked the issue as sufficient quality report
Invalid. Operator should limit AA validation in gas. Other concerns like that which makes some state changes in validation doesn’t really affect anything rather than infra.
miladpiri (sponsor) disputed
@miladpiri this seems to have validity in:
This could arguably be attributed to the fact that validation is non-view / has side effects
Are you disputing because the account is still paying for costs even if their tx is marked as failed?
I think after thinking about this further, the finding is mostly about the semantics of a failed and a succesful transaction
The tx is not a reverted tx, it's simply not a succesful one, meaning that:
I'd like to hear both sides for the sponsor @miladpiri and the Warden
As of now I'd have to downgrade to QA, but will refrain from doing so, will send to PJQA with a note that I'd like to hear both sides
I see this report as duplicate of https://github.com/code-423n4/2023-10-zksync-findings/issues/93 Both discussing about execution (state change) during validation.
GalloDaSballo marked the issue as selected for report
Hi @GalloDaSballo,
I agree with both of your points:
- Gas cost was paid
One can only escape the gas fee payment by reverting the bootloader during the account's validation or payment step which would in turn revert all the state changes.
This is a separate vulnerability, that is useful for escaping the gas fee on legitimately failed transactions, and is covered by #520.
- State was changed
It's worth to add that those state changes are not limited to the affected account but could also include state changes in external contracts/accounts and DeFi protocols if they were interacted with. Like in a normal successful transaction.
This is unexpected and breaks users' trust if such state changes are not reverted on a transaction which is shown as failed.
Let's continue by answering the sponsor's comments and elaborating further on the impacts:
Sponsor comment 1:
Operator should limit AA validation in gas. Other concerns like that which makes some state changes in validation doesn’t really affect anything rather than infra.
The dispute reason seems like a good mitigation measure, but it's insufficient to mitigate the present vulnerability due to the following reasons (where each reason keeps the vulnerability open on its own):
callAccountMethod(...)
function passes all the available gas to the called account method, e.g. validation or payment. Furthermore, the gas spent during account validation & payment is only checked against the overall transaction gas limit, therefore allowing to abuse the vulnerability as initially described in the report. Sponsor comment 2:
I see this report as duplicate of #93
Both discussing about execution (state change) during validation.
The present issue is not about a custom account implementation per se, but rather about the bootloader being vulnerable and therefore allowing a malicious custom account to cause the given impacts, therefore not OOS.
Although the mentioned issue seems to identify the core problem, it's lacking on development and discussion of impacts, especially regarding a runnable PoC which shows that state-preserving interactions with external contracts are indeed possible.
Nevertheless, it's fully up to the judge if the mentioned issue eventually gets duplicated to the present one or not.
Impact:
As a consequence of this vulnerability, complex state-modifying contract interactions can be executed within a custom account's validation and/or payment methods. Those are not limited to infra and could be interactions with DeFi protocols for instance. The PoC demonstrates a state-modifying interaction with an external contract.
However, all this can be done while the account's "real" execution step purposefully reverts.
Therefore, the transaction will be shown as failed by the Era Node, by the zkSync Era Block Explorer and by any Web3 provider that queries the node, while its state changes are preserved, effectively disguising the transaction.
Node output - Successful transaction (state changes preserved) shown as FAILED + gas usage:
21:31:50 [INFO] Executing 0x44253570df4ed1888231bef2acc9aa08109db192737d43d0de8018ba2a584994
21:31:50 [INFO] ┌─────────────────────────┐
21:31:50 [INFO] │ TRANSACTION SUMMARY │
21:31:50 [INFO] └─────────────────────────┘
21:31:50 [INFO] Transaction: FAILED
21:31:50 [INFO] Initiator: 0x094499df5ee555ffc33af07862e43c90e6fee501
21:31:50 [INFO] Payer: 0x094499df5ee555ffc33af07862e43c90e6fee501
21:31:50 [INFO] Gas - Limit: 1_686_370 | Used: 1_477_936 | Refunded: 208_434
Test output:
DefaultAccount tests
executeTransaction
√ successfully executed and state changed, although finalized as failed (92ms)
(Please revisit the existing PoC for more details)
Such a state vs. transaction behaviour should not be possible in any case (EVM compatibility) and opens the gate for scammers.
Malicious actors could e.g. trick victims into thinking a transfer of funds was not successful or try to cover their traces using seemingly failed transactions that still lead to successful state changes.
Two further points of thought:
Thanks and have a nice day, everyone!
It is intended behavior. The point of AA is that generally validation can include any logic. EIP4337 for instance allows that as well (any limitations on the validation step are purely done offchain and not guaranteed by the protocol). So it is just limited to infra. It is more of a question of how to display such txs in the explorer.
ERC-4337 itself requires to restrict call
/delegateCall
/staticcall
during the validation step which would make the present vulnerability impossible.
On zkSync Era it's intended to improve UX (https://era.zksync.io/docs/reference/concepts/account-abstraction.html#extending-eip4337) and thereby the present vulnerability is introduced by:
It is allowed to
call
/delegateCall
/staticcall
contracts that were already deployed.
Consequently, this also allows to circumvent the storage access limitation during validation by calling into another contract where the data can be accessed normally, see also PoC where the state/storage of another contract is modified.
Alternatively, one could also follow the exploit path through the payment step or a custom paymaster in order to create seemingly failed transactions with preserved state changes. More details are provided in this comment.
Nevertheless, if the sponsor sees this as intended and a won't fix, I don't wanna argue against that. But I hope this does not impact the validity of the submission considering the shown impacts.
Edit: Since it seems to be the consensus in this contest that the operator cannot be trusted, I further recommend to mitigate the present issue within the bootloader, i.e. on-chain, see Recommended Mitigation Steps
for my ideas.
Provide some facts in the operators code:
If the gas cost by validation is greater than this, your tx can be just blocked on the rpc sandbox, even can't enter the main node.
The AA can is allowed to call/delegateCall/staticcall contracts. But it can't read / write any storage doesn't belong to it. And it also can't emit any events or read the context of the current block, such as gasleft or timestamp. Unless, its calling KECCAK256, msgValueSimulator or the bridge token contracts, which is saved in the blockchain database by tracing bridge events.
The paymaster is not subject to these limits. If the paymaster needs to add these restrictions, I'm sure they can just be removed.
The last fact: if the tx fails in the account validation, it will be just blocked on the rpc sandbox, even can't enter the main node.
Thank you for providing those facts and confirming that the paymaster is not subject to those limitations, which keeps the vulnerability fully open (see my first comment).
Nevertheless, it's worth to mention that in spite of those limitations that should be in place, the PoC shows that another contract's storage can be written during the validation step.
The last fact: if the tx fails in the account validation, it will be just blocked on the rpc sandbox, even can't enter the main node.
Also agree with this one. However, the validation/payment step is meant to succeed in order to create "ghost transactions".
While I have made a sizeable effort to find scenarios in which a sidestep, or a broken invariant would happen
I believe that the finding is QA as these are sideeffects of having custom code to manage an account
The sandboxing shown by further comments shows how the account could be used to store extra data, e.g. some oracle data, while the main goal would be storing nonces
This is important and worth flagging
However, that data would be paid for by processing gas costs
And that data would be available only to the custom account, which ultimately defeats the purpose of storing it on chain in the first place
I also considered the return vaule of the bootloader indicating a false success while state changes happened but I also believe that:
Due to this, I believe that the best categorization for the finding is QA and invite the warden to investigate further ways to escape the sandbox which may warrant a higher severity
GalloDaSballo changed the severity to QA (Quality Assurance)
Hi @GalloDaSballo, thank you for inviting me to elaborate further on escaping the sandbox, since I feel like we are still talking past each other on this particular issue as well as #520.
In addition to my main comments (#1 and #2), the PoC of the original submission itself proves that the sandbox is escaped:
CallMe
contract and transferring funds to it from within the validation step of a custom account.CallMe
contract is totally unrelated to the custom account and could also be a DeFi protocol's contract. CallMe
contract and transferred funds are still there --> success. This state/data is available to anyone.This runnable PoC indisputably shows that state changes / fund transfers are preserved on "failed" transactions affecting unrelated contracts, not only the custom account itself. This was not prevented by restrictions or sandboxing and we have not seen any proof yet that invalidates this runnable PoC.
In the end, we have the state and balance of an unrelated contract modified by a failed transaction.
Or in other words: Sandboxing is not an issue because after the sandboxed validation step succeeds, it's still executed "for real" by the bootloader on-chain.
Throughout the comments, it became evident that the validation step is subject to restrictions while it was confirmed that such restrictions do not apply during the payment step, therefore I supplied an addition to my PoC in the first comment where the above scenario is also proven for the payment step.
If requested, I could also supply another PoC where the above scenario is done via a custom paymaster.
I really appreciate the judge digging into this and I could go on reiterating previous facts from another point of view. However, I feel like another wall of text from my side will only lead to misunderstanding each other further. Therefore I respectfully point to my main comments (#1 and #2) since, from my point of view, they sufficiently discuss impacts, PoC via payment step and ineffectiveness of restrictions.
Thank you!
Edit: Summary:
Edit 2: On Ethereum the present issue would not be possible, since EIP-4337 disallows access to any information that could introduce differences between sandboxing and actual execution during the validation step.
Assume 2 accounts A & B modify the state of contract C during validation:
From EIP-4337:
In addition, a UserOperation is only allowed to access data related to this sender address: Multiple UserOperations should not access the same storage, so that it is impossible to invalidate a large number of UserOperations with a single state change.
This scenario would be possible on zkSync Era due to the present vulnerability.
Hey @MarioPoneder,
I want to clarify a bit on the AA design:
The current bootloader in scope does not implement a mechanism that allows the operator to specifically limit gas usage on account validation or payment. Please call me out, if I overlooked it.
Operator can only include the transaction that went through the validation and payment step, it is enforced by the bootloader implementation. When operator creates a batch it runs validation of the transaction for X amount of gas. If more gas is used then operator just reject the transaction. So, the way how operator includes the transaction in the batch is the protection here.
As a consequence of this vulnerability, complex state-modifying contract interactions can be executed within a custom account's validation and/or payment methods. Those are not limited to infra and could be interactions with DeFi protocols for instance. The PoC demonstrates a state-modifying interaction with an external contract.
This is how AA works, yes, account can make state changes in the validation step while the execution step failed, all the storage changes will be shown on the block explorer properly (if not it is infra problem as was mentioned before).
Although transaction shown as failed by Web3 provider, Era Node log and zkSync Block Explorer --> Inconsistencies, EVM equivalence?
AA is an add-on feature, so all custom accounts may have specific/new behaviour. This is not EVM equivalence inconsistency since all other contracts behave exactly the same as on EVM, while only AA users/devs need to understand the new behaviour.
Lastly, thanks for having the great thread here for both warden and judge. But with all due respect I can't see the security issue here. State changes can happen in both validation/payment and execution step, failed transaction may interact with other contracts and its intended behaviour that doesn't break any specific EVM in-execution invariants. I feel this is QA as for lack of impact.
I would like more information on specifically why / how the sandbox is avoided, afaict only "trusted" slots can be touched, and those would be empty with the exception of the slots of the AA contract
Specifically want details here, as my counterpoint to the POC is that it's not using the prod setup so validation that the rust code would do is ignored
Hi @GalloDaSballo,
thank you for pointing this out. I was assuming that the Era Test Node is also running the check_user_restrictions_vm_latest function, since I could not find any indication that it's skipped. Also there seems to be no restriction of storage writes, only storage reads. Therefore, I'd need a production test setup to verify if my initial PoC would fail there (any help is appreciated).
However, I now agree that arbitrary transactions within the account's validation/payment step theoretically should not work, because the check_user_restrictions_vm_latest
function is invoked on UserTxValidation which includes an account's validation/payment step according to the bootloader hooks #1 and #2.
However, those restrictions do not apply when using a custom paymaster, i.e. PaymasterTxValidation:
For now, we support only validations for users.
See also the corresponding bootloader hooks.
As already mentioned in earlier comments, the described impacts (full execution of transaction while transaction shown as failed, i.e. state/balance changes of unrelated contracts) can be realized via a custom paymaster and thereby circumventing the sandbox.
This is possible because also the paymaster gets the full transaction data.
Please give me 12h from now to provide a PoC for the paymaster case, otherwise please let me know if you don't want/need a further PoC.
Thanks everyone for having this interesting discussion.
Edit:
The PoC is done!
Please follow the PoC steps of the initial submission, just use this CustomAccount.sol which also implements IPaymaster
and is therefore its own paymaster. Here the transaction is executed within validateAndPayForPaymasterTransaction(...)
(implemented at the bottom of CustomAccount.sol
) where no sandbox restrictions apply, see discussion above.
Furthermore, please include the paymaster info in the EIP-712 transaction of the initial test case:
txRequest.customData = {
...txRequest.customData,
customSignature: "0x1234", // custom signature, see CustomAccount._isValidSignature(...)
paymasterParams: {
paymaster: customAccount.address,
paymasterInput: "0x8c5a3445" // IPaymasterFlow.general.selector
}
};
With those changes, the PoC succeeds again and would not be subject to operator/sandbox restrictions in production.
Please note that no separate paymaster contract is needed here. It's still just a custom account where the payment and additional validation is handled via its paymaster methods and thereby fully avoiding the sandbox restrictions. Consequently, complex state-modifying interactions with other contracts can take place here while still showing the transaction as failed.
Edit 2: @GalloDaSballo
In the mean time I thought a lot about how to mitigate this one, since it's quite easy to circumvent the account's validation/payment sandbox restrictions by making it its own paymaster.
Furthermore, the paymaster is intended to make some kind of DeFi transactions, therefore one cannot impose meaningful restrictions there.
However, the bootloader could at least enforce that the paymaster and custom account are not the same contract, or enforce that the paymaster is not an account.
This way an arbitrary "ghost" transaction could not have msg.sender = address(customAccount)
, but only msg.sender = address(paymaster)
which is not the same anymore as executing the transaction through the account.
Hey @MarioPoneder, appreciate the effort you put into this report!
The whole discussion is around the fact that failed transactions may make state changes in the AA model. It is true and there is nothing wrong. I don't see any specific impact here in the thread rather than infrastructure can show transaction data better. Regarding EIP-4337, since the entry point is a publicly available smart contract - anyone can make AA transaction that bypasses safety DoS restrictions, so even on Ethereum failed AA transaction may actually change the state.
This way an arbitrary "ghost" transaction could not have msg.sender = address(customAccount), but only msg.sender = address(paymaster) which is not the same anymore as executing the transaction through the account.
I don't see any benefits of having the check, since there are many other edge cases to the problem with no clear impact. Example of such edge case customAccount1
uses customAccount2
as a paymaster.
GalloDaSballo marked the issue as not selected for report
After a long discussion above as well as discussions with the sponsor I believe I can't justify a bump to Medium Severity
AA inherently alters some fundamental behaviour of how TXs are handled in the EVM
This finding is very insightful if we assume the invariant that state changes can only happen when the Bootloader returns a success for a transaction
Due to Native AA this is not the case
As AA effectively has a "pre-turn" in which it can be used to perform state changes, the simplest of which is nonce changes
While it is arguable as to what AA should be able to access to, we can agree that for validation purpose it may be necessary to retrieve values for nonces as well as funds
This lead the Sponsor to allow AA accounts to alter state in a way that could be considered unintended
From another perspective, the AA has the ability to alter any state, since it's paying for it's gas and its still abiding by all rules of the EVM, to that end the AA "turn" is not inherently dangerous but more so a gotcha
From an architectural POV txs with native AA end up behaving in the following way
[ AA Success, Bootloader Success | Failure ]
[ AA Failure ]
Meaning that an explorer can misflag a state-changing transaction as reverted if it was using the bootloader pointer as the only source of information
However, any tx in which the AA "turn" is succesful may have state-changes, and as such displaying the tx as reverted would be incorrect, however this wouldn't break any specific invariant but rather demonstrate an issue with how the system is represented to end users.
In conclusion, due to the information available, I am not convinced I can make the case to raise the finding to Medium Severity, and due to that, I'm keeping it as Low Severity
Lines of code
https://github.com/code-423n4/2023-10-zksync/blob/1fb4649b612fac7b4ee613df6f6b7d921ddd6b0d/code/system-contracts/bootloader/bootloader.yul#L1270-L1274
Vulnerability details
Impact
Let's begin with an overview about the cornerstones of this vulnerability. Not every point of the following list is a problem itself, but in combination they are:
DefaultAccount
does not perform state changes, but a custom account can. Failure at this stage will cause the bootloader to revert (DoS of whole transaction batch).executeTransaction(...)
method. When a transaction fails, the bootloader does not revert and records the success/failure for finalization. This makes perfect sense and is necessary, otherwise we could only have successful transactions on L2 and never prove failure or see failed transactions in some kind of L2 block explorer. Furthermore, a bootloader revert in case of execution failure would cause DoS for the whole transaction batch.DefaultAccount
this is harmless, but in case of a custom account all other state changes are preserved too. Furthermore, neither the bootloader nor the Era node can exactly know which transaction caused which state changes, because they are processed in batches.validateTransaction(...)
method while purposefully reverting in itsexecuteTransaction(...)
method.As a consequence, we can have "ghost transctions" on zkSync Era where a transaction is executed successfully (state changes and transfer of funds during validation step) with all changes being preserved without logs while still being finalized as failed.
Imagine seeing a failed transaction in some kind of block explorer, but it was actually secretly executed. This could be abused for some pretty elaborate scams.
All in all, this violates a fundamental property of blockchains, since a failed transaction should never lead to any state changes (except for the deduction of transaction fees) and therefore the current vulnerability causes severe inconsistencies.
Proof of Concept
This PoC demonstrates that state changes in the
CustomAccount
and theCallMe
target contract as well as transferred funds are preserved, although the L2 transaction is finalized/confirmed as failed.Please place the following 2 contracts (linked secret gists) under
./code/system-contracts/contracts/test-contracts
:@audit-info
; performs actual execution during validation; reverts on execution; increments state variable for PoCApply the diff to the existing
DefaultAccount
test suite and runbash quick-setup.sh
from./code/system-contracts/scripts
to execute the PoC test case:Tools Used
Manual review
Recommended Mitigation Steps
staticcall
from the bootloader to enforce that no state changes can be made at this point. TheDefaultAccount
needs to be adapted accordingly.staticcall
is not feasible here. I suggest, that the payment and execution steps are placed into one near call frame in the bootloader and usenearCallPanic()
to revert those 2 steps on failure. Afterwards, the bootloader could try to deduct the transaction fee directly from the account/paymaster whithout calling one of the custom account's methods again.Assessed type
Other