code-423n4 / 2023-09-maia-findings

25 stars 17 forks source link

LayerZero endpoint can get blocked by a malicious user (or even a honest one) #883

Closed c4-submissions closed 1 year ago

c4-submissions commented 1 year ago

Lines of code

https://github.com/code-423n4/2023-09-maia/blob/main/src/RootBridgeAgent.sol#L423-L431

Vulnerability details

Description

Contract Endpoint, from LayerZero is the one responsible of sending/receiving messages to/from other chains. Specifically it has function receivePayload, which is called by contract UltraLightNodeV2 (the current default library of the protocol) after validating the transaction proof.

function receivePayload(uint16 _srcChainId, bytes calldata _srcAddress, address _dstAddress, uint64 _nonce, uint _gasLimit, bytes calldata _payload) external override receiveNonReentrant {
    // assert and increment the nonce. no message shuffling
    require(_nonce == ++inboundNonce[_srcChainId][_srcAddress], "LayerZero: wrong nonce");

    LibraryConfig storage uaConfig = uaConfigLookup[_dstAddress];

    // authentication to prevent cross-version message validation
    // protects against a malicious library from passing arbitrary data
    if (uaConfig.receiveVersion == DEFAULT_VERSION) {
        require(defaultReceiveLibraryAddress == msg.sender, "LayerZero: invalid default library");
    } else {
        require(uaConfig.receiveLibraryAddress == msg.sender, "LayerZero: invalid library");
    }

    // block if any message blocking
    StoredPayload storage sp = storedPayload[_srcChainId][_srcAddress];
    require(sp.payloadHash == bytes32(0), "LayerZero: in message blocking");

    try ILayerZeroReceiver(_dstAddress).lzReceive{gas: _gasLimit}(_srcChainId, _srcAddress, _nonce, _payload) {
        // success, do nothing, end of the message delivery
    } catch (bytes memory reason) {
        // revert nonce if any uncaught errors/exceptions if the ua chooses the blocking mode
        storedPayload[_srcChainId][_srcAddress] = StoredPayload(uint64(_payload.length), _dstAddress, keccak256(_payload));
        emit PayloadStored(_srcChainId, _srcAddress, _dstAddress, _nonce, _payload, reason);
    }
}

This function is the one that is responsible of executing the lzReceive function in the destination user application (UA in LayerZero terminology).

The last lines of the function is where we have to put our eyes to understand the issue reported here.

First, it looks for a StoredPayload value in a mapping, for the source chain and source address of the message:

// block if any message blocking
StoredPayload storage sp = storedPayload[_srcChainId][_srcAddress];

Then, it checks that the payload hash stored is empty, otherwise it reverts and the message cannot be delivered.

require(sp.payloadHash == bytes32(0), "LayerZero: in message blocking");

Lastly, it tries to execute the function lzReceive in the destination address.

try ILayerZeroReceiver(_dstAddress).lzReceive{gas: _gasLimit}(_srcChainId, _srcAddress, _nonce, _payload) {
    // success, do nothing, end of the message delivery
} catch (bytes memory reason) {
    // revert nonce if any uncaught errors/exceptions if the ua chooses the blocking mode
    storedPayload[_srcChainId][_srcAddress] = StoredPayload(uint64(_payload.length), _dstAddress, keccak256(_payload));
    emit PayloadStored(_srcChainId, _srcAddress, _dstAddress, _nonce, _payload, reason);
}

The execution happens in a try-catch block.

If the execution succeeds, the message is considered delivered by LayerZero.

But if the execution fails, the keccac of the payload, along with its length and the destination address are stored in the storedPayload. This means that now we have a blocking message that will not allow the endpoint to process any more messages that come from _srcChainId and _srcAddress. Because of these two lines:

// block if any message blocking
StoredPayload storage sp = storedPayload[_srcChainId][_srcAddress];
require(sp.payloadHash == bytes32(0), "LayerZero: in message blocking");

So what now? How can we unblock the system? Well, LayerZero's Endpoint has a couple of functions that can help us out with this.

Fisrt, we have retryPayload:

function retryPayload(uint16 _srcChainId, bytes calldata _srcAddress, bytes calldata _payload) external override receiveNonReentrant {
    StoredPayload storage sp = storedPayload[_srcChainId][_srcAddress];
    require(sp.payloadHash != bytes32(0), "LayerZero: no stored payload");
    require(_payload.length == sp.payloadLength && keccak256(_payload) == sp.payloadHash, "LayerZero: invalid payload");

    address dstAddress = sp.dstAddress;
    // empty the storedPayload
    sp.payloadLength = 0;
    sp.dstAddress = address(0);
    sp.payloadHash = bytes32(0);

    uint64 nonce = inboundNonce[_srcChainId][_srcAddress];

    ILayerZeroReceiver(dstAddress).lzReceive(_srcChainId, _srcAddress, nonce, _payload);
    emit PayloadCleared(_srcChainId, _srcAddress, nonce, dstAddress);
}

This function can be called by anyone and it would retry the execution that failed the first time, but now without the gas limit that came in the message from the source chain.

And if the call continually fails, does it mean we have nothing else to do? No, Endpoint provides a last resort function:

function forceResumeReceive(uint16 _srcChainId, bytes calldata _srcAddress) external override {
    StoredPayload storage sp = storedPayload[_srcChainId][_srcAddress];
    // revert if no messages are cached. safeguard malicious UA behaviour
    require(sp.payloadHash != bytes32(0), "LayerZero: no stored payload");
    require(sp.dstAddress == msg.sender, "LayerZero: invalid caller");

    // empty the storedPayload
    sp.payloadLength = 0;
    sp.dstAddress = address(0);
    sp.payloadHash = bytes32(0);

    // emit the event with the new nonce
    emit UaForceResumeReceive(_srcChainId, _srcAddress);
}

This function deletes the content of the storedPayload mapping for the _srcChainId and _srcAddress, thus unblocking our UA, that now will be able to receive messages again. But this function can only be called from our UA, that is Maia's RootBridgeAgent or BranchBridgeAgent contracts.

The thing is none of these contracts is prepared to execute a call to function forceResumeReceive. This means if anytime a message is sent that cannot be executed by whatever reason, the Endpoint would get stuck and will not be able to process any more messages from the source address and the source chain.

LayerZero documentation states explicitly that:

It is highly recommended User Applications implement the ILayerZeroApplicationConfig. Implementing this interface will provide you with forceResumeReceive which, in the worse case can allow the owner/multisig to unblock the queue of messages if something unexpected happens.

(Source: https://layerzero.gitbook.io/docs/evm-guides/best-practice)

So not having this function that allows to unblock the Endpoint is a serious vulnerability.

And that is because calls to lzReceive in the Maia code, in both RootBridgeAgent and BranchBridgeAgent contracts, ultimately excute function(s) _execute. There are two of them, in both contracts, one just tries to execute and reverts on failure, and the other tries to execute and if it fails, it sends a message back to the source chain (provided _hasFallbackToggled is set to true). To make it simple, we can check function _execute in RootBridgeAgent:

function _execute(uint256 _depositNonce, bytes memory _calldata, uint16 _srcChainId) private {
    //Update tx state as executed
    executionState[_srcChainId][_depositNonce] = STATUS_DONE;

    //Try to execute the remote request
    (bool success,) = bridgeAgentExecutorAddress.call{value: address(this).balance}(_calldata);

    // No fallback is requested revert allowing for retry.
    if (!success) revert ExecutionFailure();
}

We can clearly see that if the call to the bridge agent executor with the payload provided fails, the excution reverts, and this would make the Endpoint store the payload for future execution, potentially blocking it indefinitely.

Impact

Impact of this issue is Critical, as the LayerZero messaging between chains could get permanently blocked.

Proof of Concept

The fact that this is an issue that involves sending messages via LayerZero from one chain to another, along with the lack of such testing in the project's own tests has made it very difficult for me to code a PoC.

I hope the above description illustrates the issue with enough detail for the judge and sponsor to value its severity.

Now I will list some steps that could make one chain unusable for Maia:

  1. A User calls callOut in a BranchRouter:
  2. The BranchRouter then calls callOut in the BranchBridgeAgent, which calls internal call _performCall which calls send in the LayerZero endpoint to send a message to the Root chain's RootBridgeAgent. The payload sent is abi.encodePacked(bytes1(0x01), depositNonce++, _params), with _params being user supplied.
  3. The message arrives to the Root chain and the UltraLightNodeV2 validates the proof and calls Endpoint's receivePayload, which in turn calls RootBridgeAgent's lzReceive.
  4. The flag sent is 0x01 so // DEPOSIT FLAG: 1 (Call without Deposit). Function _execute is called with the following values:
    _execute(
    nonce,
    abi.encodeWithSelector(
        RootBridgeAgentExecutor.executeNoDeposit.selector, localRouterAddress, _payload, srcChainId
    ),
    srcChainId
    );
  5. So function executeNoDeposit is called in the RootBridgeAgentExecutor with, among others, the user supplied _payload.
  6. The executor calls function execute in contract localRouterAddress passing in the user supplied payload (note below that the payload passed in is stripping the first 5 bytes, PARAMS_TKN_START, which correspond to the flag and the nonce) and the source chain id.
    IRouter(_router).execute{value: msg.value}(_payload[PARAMS_TKN_START:], _srcChainId);
  7. The router execute function, wether it being MulticallRootRouter or CoreRootRouter, take the first byte and check the function id, if it is not 1 ( CoreRootRouter) or 1 to 3 (MulticallRootRouter) the call reverts.
  8. So if we assume the user supplied this first byte with, say 0xff, the execution would fail and the Endpoint would have added it to the stored payloads mapping. And if anyone tried to retry the execution via retryPayload, execution would fail again for the same reason.
  9. Not having implemented a way to call forceResumeReceive would mean the source branch would be unable to send messages ever again to the Root branch.

Tools Used

Manual review

Recommended Mitigation Steps

Implement ILayerZeroApplicationConfig interface in both RootBridgeAgent and BranchBridgeAgent and implement a function forceResumeReceive that in turn calls forceResumeReceive in the LayerZero Endpoint.

Assessed type

Other

c4-pre-sort commented 1 year ago

0xA5DF marked the issue as duplicate of #785

c4-pre-sort commented 1 year ago

0xA5DF marked the issue as sufficient quality report

c4-judge commented 1 year ago

alcueca changed the severity to 2 (Med Risk)

c4-judge commented 1 year ago

alcueca marked the issue as satisfactory

alcueca commented 1 year ago

I would give best report with a working PoC. Without it, I have to check all other duplicates to make sure that the attack vector (using a reverting call, instead of out of gas) works.

alcueca commented 1 year ago

From the sponsor:

the warden seems to have missed the fact that our execution is encapsulated by an excessivelySafeCall and as such the revert being described would not bubble up until the LayerZeroEndpoint and would be caught by the bridge agent.

c4-judge commented 1 year ago

alcueca marked the issue as partial-50

alcueca commented 1 year ago

50% for lack of PoC (when others have delivered one for the same vulnerability) and for not correctly identifying a revert (out of gas).

neumoxx commented 1 year ago

The sponsor is right, and I was wrong :(