howlbot-integration[bot] commented 4 months ago

Lines of code

https://github.com/code-423n4/2024-04-renzo/blob/519e518f2d8dec9acf6482b84a181e403070d22d/contracts/Bridge/L2/PriceFeed/CCIPReceiver.sol#L66

Vulnerability details

Description

Renzo uses CCIP to send price updates from L1 -> L2 using:

    function sendPrice(
        CCIPDestinationParam[] calldata _destinationParam,
        ConnextDestinationParam[] calldata _connextDestinationParam
    ) external payable onlyPriceFeedSender nonReentrant {
    // ommitted code
        for (uint256 i = 0; i < _destinationParam.length; ) {
            Client.EVM2AnyMessage memory evm2AnyMessage = Client.EVM2AnyMessage({
                receiver: abi.encode(_destinationParam[i]._renzoReceiver), // ABI-encoded xRenzoDepsot contract address
                data: _callData, // ABI-encoded ezETH exchange rate with Timestamp
                tokenAmounts: new Client.EVMTokenAmount[](0), // Empty array indicating no tokens are being sent
                extraArgs: Client._argsToBytes(
                    // Additional arguments, setting gas limit
                    Client.EVMExtraArgsV1({ gasLimit: 200_000 })
                ),
                // Set the feeToken  address, indicating LINK will be used for fees
                feeToken: address(linkToken)
            });

            // Get the fee required to send the message
            uint256 fees = linkRouterClient.getFee(
                _destinationParam[i].destinationChainSelector,
                evm2AnyMessage
            );

            if (fees > linkToken.balanceOf(address(this)))
                revert NotEnoughBalance(linkToken.balanceOf(address(this)), fees);

            // approve the Router to transfer LINK tokens on contract's behalf. It will spend the fees in LINK
            linkToken.approve(address(linkRouterClient), fees);

            // Send the message through the router and store the returned message ID
->          bytes32 messageId = linkRouterClient.ccipSend(
                _destinationParam[i].destinationChainSelector,
                evm2AnyMessage
            );
    // ommitted code
    }

The CCIP architecture allows for simultanous CCIP calls to be made from one address. However, there is a caveat. As per Chainlink Docs, under the second header:

If a user sends multiple messages and the first message isn't successfully delivered and goes into a manual execution mode, 
does that mean all subsequent messages from the user will also be stuck?
It depends. If a message goes into manual execution mode due to receiver errors (unhandled exceptions or gas limit issues), 
subsequent messages don't get automatically blocked, unless they would encounter the same error. 
However, suppose a message goes into manual execution mode after the Smart Execution time window expires (currently 8 hours). 
In that case, subsequent messages must wait for the first message to be processed to maintain the default sequence.

If a message fails, it can be manually executed, but after the Smart Execution time window expires, which is 8 hours at the moment, all subsequent messages will fail until the failing messsage will succeed. The problem is that on the L2 side of things, in CCIPReceiver.sol, the reverts are not handled gracefully. This can easily lead to a CCIP call that can never be executed, thus resulting in a Denial of Service. These are the failure points during the _ccipReceive() call, that can lead to a CCIP call not being able to be delivered, called by a honest party, marked by an arrow inside _updatePrice:


    function _ccipReceive(
        Client.Any2EVMMessage memory any2EvmMessage
    ) internal override whenNotPaused {
        address _ccipSender = abi.decode(any2EvmMessage.sender, (address));
        uint64 _ccipSourceChainSelector = any2EvmMessage.sourceChainSelector;
        // Verify origin on the price feed
        if (_ccipSender != xRenzoBridgeL1) revert InvalidSender(xRenzoBridgeL1, _ccipSender);
        // Verify Source chain of the message
        if (_ccipSourceChainSelector != ccipEthChainSelector)
            revert InvalidSourceChain(ccipEthChainSelector, _ccipSourceChainSelector);
        (uint256 _price, uint256 _timestamp) = abi.decode(any2EvmMessage.data, (uint256, uint256));
        xRenzoDeposit.updatePrice(_price, _timestamp);
        emit MessageReceived(
            any2EvmMessage.messageId,
            _ccipSourceChainSelector,
            _ccipSender,
            _price,
            _timestamp
        );
    }

    function updatePrice(uint256 _price, uint256 _timestamp) external override {
        if (msg.sender != receiver) revert InvalidSender(receiver, msg.sender);
        _updatePrice(_price, _timestamp);
    }

    function _updatePrice(uint256 _price, uint256 _timestamp) internal {
        // Check for 0
        if (_price == 0) {
->            revert InvalidZeroInput();
        }

        // Check for price divergence - more than 10%
        if (
            (_price > lastPrice && (_price - lastPrice) > (lastPrice / 10)) ||
            (_price < lastPrice && (lastPrice - _price) > (lastPrice / 10))
        ) {
->            revert InvalidOraclePrice();
        }

        // Do not allow older price timestamps
        if (_timestamp <= lastPriceTimestamp) {
->            revert InvalidTimestamp(_timestamp);
        }

        // Do not allow future timestamps
        if (_timestamp > block.timestamp) {
->            revert InvalidTimestamp(_timestamp);
        }

        // Update values and emit event
        lastPrice = _price;
        lastPriceTimestamp = _timestamp;

        emit PriceUpdated(_price, _timestamp);
    }

For example, if the price deviates more than 10%, the delivery will fail. If this delivery keeps failing, even after the 8 hours Smart Execution window has elapsed, the CCIP pathway will be DoS'ed until this CCIP call succesfully delivers.

The impact is broad - price data will be corrupted, minting prices will not be calculated correctly and updates from L1 will be DoS'ed.

Tools Used

Manual Review

Recommeded Mitigation Steps

As per the Chainlink docs linked in the report:

Test thoroughly to ensure logical conditions for all paths are gracefully handled in your receiver contract.

Handle the reverts gracefully.

Assessed type

Timing

C4-Staff commented 4 months ago

CloudEllie marked the issue as duplicate of #522

c4-judge commented 4 months ago

alcueca marked the issue as not a duplicate

c4-judge commented 4 months ago

alcueca marked the issue as unsatisfactory: Invalid

bronzepickaxe commented 4 months ago

Hi Judge,

This has been invalidated due to the fact that it was duplicated to an invalid finding:

https://github.com/code-423n4/2024-04-renzo-findings/issues/522

However, our issue is not a duplicate of that finding.

Our issue describes multiple ways in which the CCIP path can be DoS'ed, without the need of a malicious entity. Just regular usage can lead to DoS because failures are not handled gracefully.

The Sponsor has confirmed that one of the two path ways of updating a price will be used, either CCIP or Oracle:

https://github.com/code-423n4/2024-04-renzo-findings/issues/117#issuecomment-2122617996

which means that the CCIP pathway being DoS'ed would break the L2 side of things + new data will not be able to go through.

To reiterate, there is no malicious entity needed for this pathway to be DoS'ed, the reverts marked with an -> in our issue's are all point of failures.

The recommended steps are still the same - the Sponsor should handle the failures gracefully.

Thanks!

c4-judge commented 4 months ago

alcueca removed the grade

alcueca commented 4 months ago

@jatinj615, please review

0xEVom commented 4 months ago

Hi @alcueca,

I believe this finding is also invalid.

Its interpretation of the Chainlink docs:

If a user sends multiple messages and the first message isn't successfully delivered and goes into a manual execution mode, does that mean all subsequent messages from the user will also be stuck?

It depends. If a message goes into manual execution mode due to receiver errors (unhandled exceptions or gas limit issues), subsequent messages don't get automatically blocked, unless they would encounter the same error. However, suppose a message goes into manual execution mode after the Smart Execution time window expires (currently 8 hours). In that case, subsequent messages must wait for the first message to be processed to maintain the default sequence.

is incorrect. The quoted excerpt specifies that if a message goes into manual execution mode due to unhandled exceptions, subsequent messages don't get automatically blocked. The only case in which a message blocks subsequent messages is if it "goes into manual execution mode after the Smart Execution time window expires", which can only happen:

If the issue was due to extreme gas spikes or network conditions, and CCIP was not able to successfully transfer the message despite gas bumping for the entire duration of the Smart Execution time window

This is:

extremely unlikely to happen
unrelated to the unhandled reverts
an inherent risk of CCIP which will affect all protocols integrating it equally, and
avoidable by the admin bumping the gas fee during such period

bronzepickaxe commented 4 months ago

You skipped the following bullet point when copying the Chainlink docs:

manual-execution

Unhandled exception (logical error) in the receiver contract: If the receiver contract is upgradeable, developers must correct the logic, re-deploy the logic contract, and then manually execute the same transaction. If the receiver contract is not upgradeable, developers must deploy a new receiver contract, and then users can send a new CCIP message

Aka, if the CCIP path is stuck due to unhandled exceptions, the project would need to re-deploy said contracts.

For example, if the price would currently deviate more than 10%, every CCIP update would fail for length = Smart Execution Window due to this check:

        // Check for price divergence - more than 10%
        if (
            (_price > lastPrice && (_price - lastPrice) > (lastPrice / 10)) ||
            (_price < lastPrice && (lastPrice - _price) > (lastPrice / 10))
        ) {
->            revert InvalidOraclePrice();
        }

Bumping the gas here makes no sense.

Moreover, this statement made by the Warden above:

The only case in which a message blocks subsequent messages is if it "goes into manual execution mode after the Smart Execution time window expires", which can only happen:
- If the issue was due to extreme gas spikes or network conditions, and CCIP was not able to successfully transfer the message despite gas bumping for the entire duration of the Smart Execution time window

is factually incorrect and misleading.

In Chainlink docs, under CCIP-Execution #6.2:

If the message does not involve token transfers, only arbitrary messaging, and the receiver execution fails due to gas limits or unhandled exceptions, the transaction becomes eligible for manual execution.

This means that a CCIP call can become eligible for the manual execution -> smart contract window expiry flow by either having a too low gas limit, which can be fixed by upping the gas, or by unhandled exceptions.

Lastly, what you are quoting is when a call becomes eligible for manual execution. In manual execution mode, you are allowed to bump the gas. Bumping the gas here will not make the transaction succeed, which will result in a DoS and to re-iterate, if it's the case of an unhandled exception, like described in the report:

manual-execution

Unhandled exception (logical error) in the receiver contract: If the receiver contract is upgradeable, developers must correct the logic, re-deploy the logic contract, and then manually execute the same transaction. If the receiver contract is not upgradeable, developers must deploy a new receiver contract, and then users can send a new CCIP message

New deployment, not bumping the gas fees.

To summarize:

A ccip call is made at timestamp k
This ccip call will fail due to one of the many unhandled exceptions, lets go with price deviation.
Subsequent calls will still fail due to the price deviation
Timestamp is now k + eligble_for_manual_execution
Unlike what the Warden above suggests, bumping the gas price would not make it succeed.
Timestamp is now k + smart_execution_window_expired
DoS

jatinj615 commented 4 months ago

from the above argument, If there is any issue with the checks. For example highlighted 10% deviation. As, protocol updates the price feeds on L2 every 6 hours if the price deviation recorded on L2 is more than that which means the feed is corrupted. Then, protocol would want to halt the deposit services on L2 until the new receiver is deployed or manually handled by owner through updatePriceByOwner.

bronzepickaxe commented 4 months ago

Thanks for the input of everyone and thanks for the acknowledgement of the Sponsor. This aligns with our report - price deviation is one of the failing paths that could lead to DoS -> redeployment.

We ran some tests yesterday using CCIP when sending messages from Avalanche Fuji <-> Sepolia. We tested some CCIP calls, waited for 8 hours and see what would happen.

Hats off to the CL team because integrating CCIP on the test-net was very straight forward.

This is what we have learned:

Subsequent messages will be blocked if a message can not be delivered during the Smart Execution window, as per Frequently Asked Questions Nr. 2. This happens during:
- CCIP Manual Execution
- The message could not be executed on the destination chain within CCIP’s Smart Execution time window, which is currently set to 8 hours. This could happen, for example, during extreme network congestion and resulting gas spikes.
  - Note that this is just one of the examples that could lead to a CCIP message not being executed during the Smart Execution Window

The following claim by the Warden: an inherent risk of CCIP which will affect all protocols integrating it equally is wrong. Projects integrating CCIP are urged to handle reverts gracefully as it could lead to a redeployment of the contracts, as per Manual Execution.

The following claim by the Warden: unrelated to the unhandled reverts is wrong. The whole report is based upon the fact that a message call, that is not executed during the Smart Execution Window, could never be executed, even after the Smart Execution Window, because it will always revert. For example, this is the case when the price deviates more than 10%. Big price deviations have happened before.

With these newly tests ran and with the extra information gathered, this path could lead to a DoS, which automatically leads to losses for the users of the project. Note that any path that can lead to a reverting call on the L2 will result in the same DoS, this is just one of the many examples:

Current price is 1
Price deviates to 0.8
CCIP is made to L2 to update the price, which will always fail due to the deviating being more than 10%.
1. Note that this is just one of the many failing paths, refer back to our original report for the many paths that can lead to an always failing CCIP call. All these failing paths can lead to stale data.
This message call can not be delivered during the Smart Execution Window, due to, quote, for example:
1. CCIP Manual Execution
  1. This could happen, for example, during extreme network congestion and resulting gas spikes.
    1. The usage of for example indicates that this is not the only way that this can happen.
Now, to unblock the queue, the message needs to be manually delivered after the Smart Execution Window. Note that it is not possible to bump the admin gas, as per the Warden, because the queue will only be blocked after a message goes into manual execution mode after the Smart Execution Window expires, as per:
1. Frequently Asked Questions Nr. 2
  1. However, suppose a message goes into manual execution mode after the Smart Execution time window expires (currently 8 hours).
The path is DoS'ed. This can easily be prevented if the fails are handled gracefully, since manually executing after the Smart Execution Window would result in unblocking the queue.

To summarise:

The message queue can be blocked, which is normally not an issue if failures are handled gracefully since the projects can just manually execute the message if the message.
This project does not handle failures gracefully - which can lead to a DoS.
This is not likely to occur but the impact is High.

With all this in mind, the many failing paths that can happen and with the funds that can be lost by the users, we feel like Medium is appropriate here, especially since this only needs to happen once to have a high impact on the project.

To respect everyone's and our time, we will leave the rest to the Judge, unless asked for more clarification.

Thanks!

0xEVom commented 4 months ago

We agree, but I think you're missing two important points, which are:

as per the docs I quoted above, if a message results in an unhandled exception, it will go into manual execution, but it will not block subsequent message, neither then nor after the Smart Execution time window
if a message cannot be delivered (delivered does not imply success) within the Smart Execution time window, it will block the queue. It is true that the gas spikes/network congestion case is not the only way this can happen (it could be just because the gas limit provided was insufficient and the admin never bumped the gas fee, or, I suppose, an infrastructure error); I was quoting from the 3 situations leading to Manual Execution provided in the docs. However, again, that will not be due to the unhandled reverts as they will lead to the case above. Furthermore, even in that case, quoting the Chainlink devs:

unblocking the queue requires the processing (not necessarily the success) of the blocking message.

Redeployment is only provided in the docs as a path to successful message execution, but it is not a requirement for a functioning bridge. You can see the language employed in the docs is rather lax in this regard:

As a best practice, implement fallback mechanisms in the CCIP receiver contracts to manage unhandled exceptions gracefully.

For reference:

I haven't been able to get a more complete answer than this, and their answer seems to imply that manual execution automatically blocks the queue, which I believe is not the case as per the docs, but this is irrelevant since we know the message does not need to succeed in order to unblock the path.

So redeployment is never necessary, and the finding's claim that this can lead to DoS is incorrect.

bronzepickaxe commented 4 months ago

Hi,

You omitted these:

As explained in our latest comment, if a:

message that would revert because of one of the many failing paths will not be executed due to for example, network congestion or another reason(not necessarily the failure of execution as initially stated in our first comment) , as per the docs that we have linked over and over again, then
subsequent messages will be blocked.

The docs state this, the comments in one of the helper libraries in the integration state this.

We do agree however that the likelihood is lower, due to the fact that we tested that an unhandled exception does not automatically block the queue, but as the docs/CL team/code describes, it still can be blocked and due to the many failure paths it would lead to at worse stale data, where a couple minutes will be detrimental during price volatility, even if a call would try to be manually made by an admin any adversary can just front-run it. In practice Projects will not be prepared for this scenario since they rely on the CCIP to be working.

You're arguing that the likelihood is low and we accept that since it does not change anything for the report. You are also arguing that the CL team/docs/code comments are all incorrect and that message queue being blocked is not possible, which we don't accept, nor does the CL team/docs/code comments. We are happy to be proven wrong though.

Thanks!

Ref:

https://github.com/smartcontractkit/ccip/blob/d43decd2e49b2d5f08ea8f6fdd7694a86e42ad7b/integration-tests/ccip-tests/actions/ccip_helpers.go#L2833-L2837

0xEVom commented 4 months ago

unblocking the queue requires the processing (not necessarily the success) of the blocking message.

c4-judge commented 3 months ago

alcueca marked the issue as satisfactory

alcueca commented 3 months ago

I'm going to accept this as Medium, quite borderline. Burden of proof falls on the reporting warden, which has done a considerable effort taking into account the lack of facilities provided with the code. While not conclusively proven, the risk of malfunctioning is high enough with the evidence presented. The sponsor is recommended to apply a fix.

c4-judge commented 3 months ago

alcueca marked the issue as selected for report

code-423n4 / 2024-04-renzo-findings

Price updating mechanism can break #519

Lines of code

Vulnerability details

Description

Tools Used

Recommeded Mitigation Steps

Assessed type