code-423n4 / 2023-09-maia-findings

25 stars 17 forks source link

Message channels can be blocked resulting in DoS #399

Open c4-submissions opened 1 year ago

c4-submissions commented 1 year ago

Lines of code

https://github.com/code-423n4/2023-09-maia/blob/f5ba4de628836b2a29f9b5fff59499690008c463/src/CoreBranchRouter.sol#L62-L78

Vulnerability details

Impact

The communication channel between a branch chain and a destination chain can be blocked by exploiting the ability to specify arbitrary gas parameters for the destination chain function calls. As Ulysses implements a non-blocking pattern, forcing the destination chain calls to revert creates an undesirable "in blocking" state.

In addition, as the requisite functions to clear the blocked channel were not implemented, the channel will remain blocked until a successful manual call to the blocked endpoint's retryPayload (with appropriate gas parameters).

As this vulnerability is created by the confluence of multiple underlying issues these have been split into three Root Causes for clarity.

Root causes: Issue A. User can specify any gas params against best-practice guidelines
Issue B. The protocol implements a non-blocking implementation pattern for lzReceive and doesn't handle the "in-blocking scenario" Issue C. The protocol doesn't implement the ILayerZeroUserApplicationConfig functions as recommended

It should also be noted that this can be exploited to cause economic damage to the protocol. In it's current state the manual call to Endpoint::retryPayload() is the only way to clear the channel. If the griefer were to initiate a high payload size call on a low-cost branch like Polygon zkEVM or Optimism, then the team will need to pay the gas fees to process the griefing-payload at the higher gas cost on Arbitrum.

Due to the ability to consistently (and at a low cost) block communication between the root chain and branch chain (which can only be unblocked through a manual intervention) the issue has been evaluated to high.

Proof of Concept

Bob deploys an ERC20 to the local chain, for example, Polygon. Bob now adds the token via the CoreBranchRouter by calling addLocalToken. Importantly, Bob sets the gasParams as gasLimit: 1 (Issue A). This should not be allowed, as Layer Zero explicitly advises:

And the UA should enforce the gas estimate on-chain at the source chain to prevent users from inputting too low the value for gas.

The branch hToken is created and these "poisoned" gas parameters are now encoded and sent to the local BridgeAgent for a system callout.

        //Send Cross-Chain request (System Response/Request)
        IBridgeAgent(localBridgeAgentAddress).callOutSystem{value: msg.value}(payable(msg.sender), payload, _gParams);

In the local BaseBridgeAgent the callOutSystem function is called. This passes on the poisoned _gParams to _performCall.

In _performCall the poisoned params are passed through to the local chain LayerZeroEndpoint via send:

        ILayerZeroEndpoint(lzEndpointAddress).send{value: msg.value}(
            rootChainId,
            rootBridgeAgentPath,
            _payload,
            payable(_refundee),
            address(0),
            abi.encodePacked(uint16(2), _gParams.gasLimit, _gParams.remoteBranchExecutionGas, rootBridgeAgentAddress)
        );

In the Endpoint contract the outboundNonce is incremented, and the _getSendLibrary(uaConfig).send() is called.

The call now passes through to the nodes and relayers and this finally passes the call through to the try/catch block in the receivePayload() function of the Endpoint contract on Arbitrum.

Here the Endpoint contract must make a call to the destination address via alzReceve call. Such calls to lzReceive is should not fail in the Ulysses protocol (i.e. it is supposed to be non-blocking). We can see this in the lzReceive code implemented in the RootBridgeAgent.sol contract here. It should never revert a call which originates from the Endpoint contract (Issue B).

Crucially, the _gasLimit used here for the lzReceive call is the _gasLimit that was provided by Bob. But because Bob has specified a gasLimit of 1 he forces the call from the Endpoint contract to RootBridgeAgent::lzReceive() to revert due to an out-of-gas error. This causes the Endpoint contract to store the payload. This blocks any subsequent cross-chain messages for that chain which will revert with LayerZero: in message blocking.

The message channel is now blocked. The channel can only be unblocked through a manual call to Endpoint::retryPayload(); crucially, the Maia protocol team will need to bear the costs of this transaction.

Layer Zero provides a "get-out-of-jail-card" for these cases through its forceResumeReceive functionality. Unfortunately, because of Issue C, the protocol doesn't implement forceResumeReceive and thus has no other way to clear the channel without bearing the execution cost. This results in blocking the channel and communication loss.

Coded PoC

To accurately demonstrate the proof of concept we will use the below code, testProofOfConcept, and paste it into the file RootForkTest.t.sol in the contest repo. This provides us with a fork test environment which uses the actual Endpoint contract from Layer Zero.

The test uses the example of using CoreBranchRouter::addLocalToken() to demonstate Issue A, where a user-supplied gasParams can lead to an "OutOfGas" revert on the destination chain call to RootBridgeAgent:lzReceive. It then demonstrates Issue B by showing how subsequent messages to the same chain fail (even when done with valid gas parameters) due to the blocking nature of Endpoint. This, combined with issue C, where there is no implementation of forceResumeReceive(), creates a situation where a channel between the source chain and the root chain can be blocked permanently.

Instructions:

  1. Setup the contest repo as described in the contest readme (note that an Infura key and setting up the .env file is required).
  2. Copy the below code block into the RootForktest.t.sol in the test/ulysses-omnichain directory
  3. Run the test with: forge test --match-test testProofOfConcept -vvv (-vvv is necessary to show the errors for OutOfGas and LayerZero: in message blocking)
    function testProofOfConcept() public {

        //Switch Chain and Execute Incoming Packets
        switchToLzChain(avaxChainId);

        vm.deal(address(this), 10 ether);

        avaxCoreRouter.addLocalToken{value: 10 ether}(address(avaxMockAssetToken), GasParams(1, 0));

        //Switch Chain and Execute Incoming Packets

        /*  We expect the call to `RootBridgeAgent` to fail with an out of gas error,
            but this won't be caught here due to the `switchToLzChain` abstraction.
        */
        switchToLzChain(rootChainId);

        console2.log("RootBridgeAgent: ", address(arbitrumCoreBranchBridgeAgent).balance);
        avaxMockAssethToken = rootPort.getLocalTokenFromUnderlying(address(avaxMockAssetToken), avaxChainId);

        newAvaxAssetGlobalAddress = rootPort.getGlobalTokenFromLocal(avaxMockAssethToken, avaxChainId);

        // We can now assert that it failed in the root chain
        require(
            rootPort.getGlobalTokenFromLocal(avaxMockAssethToken, avaxChainId) == address(0),
            "Token should be added"
        );
        require(
            rootPort.getLocalTokenFromGlobal(newAvaxAssetGlobalAddress, avaxChainId) == address(0),
            "Token should be added"
        );
        require(
            rootPort.getUnderlyingTokenFromLocal(avaxMockAssethToken, avaxChainId) == address(0),
            "Token should be added"
        );
        switchToLzChain(avaxChainId);

        vm.deal(address(this), 10 ether);

        MockERC20 validToken = new MockERC20("Valid Token", "VALID", 18);

        avaxCoreRouter.addLocalToken{value: 10 ether}(address(validToken), GasParams(2_000_000, 0));

        /*  Here we confirm that we the channel is blocked
            We can see in the stack trace in the console that the call on the Root chain
            was reverted with "LayerZero: in message blocking"
        */
        switchToLzChain(rootChainId);

    }

Tools Used

Manual code review. Foundry.

Recommended Mitigation Steps

As this submission demonstrates a high severity impact stemming from multiple root causes, the recommendations will be provided for each.

Issue A

Layer Zero acknowledges that a cross-chain call can use more or less gas than the standard 200k. For this reason it allows the passing of custom gas parameters. This overrides the default amount of gas used. By allowing users to directly set these custom gas parameters (without validation) it opens the Ulysses implementation up to vulnerabilities related to cross-chain gas inequalities.

Consider adding input validation within the BridgeAgents before a cross-chain call is commenced that ensures the gasLimit supplied is sufficient for the lzReceive call on the root chain. This can be expanded by calculating sufficient minimums for the various functions which are implemented (e.g. adding a token, bridging). An alternative would be to deny the user the ability to modify these params downward. The BranchBridgeAgent::getFeeEstimate() is already implemented, but never used in the contracts - this would be perfect for input validation.

Issue B The current implementation is designed to never allow failures from the Layer Zero Endpoint as it implements a non-blocking pattern. Due to Issue A the lzReceive call from Endpoint can be forced to fail. This blocks the message channel, violating the intended non-blocking pattern (and giving rise to this issue).

Consider inheriting from the Layer Zero non-blocking app example.

Issue C

It is highly recommended to implement the ILayerZeroApplicationConfig as it provides access to forceResumeReceive in the case of a channel blockage and allows the protocol to resume communication between these two chains. Most importantly, it will allow the team to resume messaging at a fraction of what it might cost to call retryPayload.

Assessed type

DoS

c4-pre-sort commented 1 year ago

0xA5DF marked the issue as duplicate of #785

c4-pre-sort commented 1 year ago

0xA5DF marked the issue as sufficient quality report

0xA5DF commented 1 year ago

TODO: contains a dupe of #875 too

c4-judge commented 1 year ago

alcueca marked the issue as satisfactory

c4-judge commented 1 year ago

alcueca marked the issue as selected for report

c4-judge commented 1 year ago

alcueca marked the issue as not a duplicate

c4-judge commented 1 year ago

alcueca changed the severity to 2 (Med Risk)

alcueca commented 1 year ago

This is DoS attack. No funds are lost except gas, and as soon as the attacker stops, the application can resume operations.

lokithe5th commented 1 year ago

Dear judge,

I am in agreement that this is a DOS type attack, but I would respectfully raise the following as aggravating factors in support of my submission's original severity level:

  1. The issue describes an attack that can be executed cheaply from a low gas-cost L2.
  2. The effect of the attack is that the entire ecosystem's cross-chain accounting (and communication) system is brought to a halt (this source chain cannot communicate with the Root chain)
  3. The DoS can be reversed through a call to retryPayload - but this call will have to be done manually, and the caller will have to pay the appropriate fees at their own cost. Cross-chain communication in the interim will silently fail.

The twist here is that an attacker can simply wait until the channel has been unblocked through the call to retryPayload and then initiate another DoS attack call at a very low cost.

As a consequence an attacker effectively has the ability to interrupt src->root cross-chain communication for as long as they like, whenever they like. Effectively making Ulysses unusable from that source-chain.

In further support, there is precedent for a High severity classification of this effect, as established in this case which also involved this class of vulnerability.

Thank you for your time in reconsidering this issue. I can provide more detail should it be required and will accept your ruling on this issue.

c4-sponsor commented 1 year ago

0xLightt (sponsor) confirmed

alcueca commented 1 year ago

As a consequence an attacker effectively has the ability to interrupt src->root cross-chain communication for as long as they like, whenever they like. Effectively making Ulysses unusable from that source-chain.

You are describing a DoS attack. Funds have not changed wallets. If the FBI puts a bullet in the attacker's head, the protocol will return to normal operation. Note that the attacker needs to keep doing something for the situation to persist. In a High the situation happened, and that bullet wouldn't solve anything (quite the opposite, probably).

lokithe5th commented 1 year ago

Thank you for your comment @alcueca. I've made my case as best I could and accept your decision. Appreciate you reviewing it again.

0xBugsy commented 1 year ago

Addressed at https://github.com/Maia-DAO/2023-09-maia-remediations/commit/4d2825cd4e73a69ad33bccaf2ceb07443f127beb