Open howlbot-integration[bot] opened 4 months ago
CloudEllie marked the issue as duplicate of #522
alcueca marked the issue as not a duplicate
alcueca marked the issue as unsatisfactory: Invalid
Hi Judge,
This was invalidated because it was marked as a duplicate of an invalid finding:
However, our issue is not a duplicate of that finding.
Our issue describes multiple ways in which the CCIP path can be DoS'ed, without the need for a malicious entity. Just regular usage can lead to DoS because failures are not handled gracefully.
The Sponsor has confirmed that one of the two pathways for updating a price will be used, either CCIP or Oracle:
which means that the CCIP pathway being DoS'ed would break the L2 side of things + new data will not be able to go through.
To reiterate, there is no malicious entity needed for this pathway to be DoS'ed; the reverts marked with an -> in our issue are all points of failure.
The recommended steps are still the same - the Sponsor should handle the failures gracefully.
Thanks!
alcueca removed the grade
@jatinj615, please review
Hi @alcueca,
I believe this finding is also invalid.
Its interpretation of the Chainlink docs:
If a user sends multiple messages and the first message isn't successfully delivered and goes into a manual execution mode, does that mean all subsequent messages from the user will also be stuck?
It depends. If a message goes into manual execution mode due to receiver errors (unhandled exceptions or gas limit issues), subsequent messages don't get automatically blocked, unless they would encounter the same error. However, suppose a message goes into manual execution mode after the Smart Execution time window expires (currently 8 hours). In that case, subsequent messages must wait for the first message to be processed to maintain the default sequence.
is incorrect. The quoted excerpt specifies that if a message goes into manual execution mode due to unhandled exceptions, subsequent messages don't get automatically blocked. The only case in which a message blocks subsequent messages is if it "goes into manual execution mode after the Smart Execution time window expires", which can only happen:
If the issue was due to extreme gas spikes or network conditions, and CCIP was not able to successfully transfer the message despite gas bumping for the entire duration of the Smart Execution time window
This is:
You skipped the following bullet point when copying the Chainlink docs:
Aka, if the CCIP path is stuck due to unhandled exceptions, the project would need to re-deploy said contracts.
For example, if the price currently deviated by more than 10%, every CCIP update would fail for the whole length of the Smart Execution Window due to this check:
// Check for price divergence - more than 10%
if (
    (_price > lastPrice && (_price - lastPrice) > (lastPrice / 10)) ||
    (_price < lastPrice && (lastPrice - _price) > (lastPrice / 10))
) {
->  revert InvalidOraclePrice();
}
Bumping the gas here makes no sense.
Moreover, this statement made by the Warden above:
The only case in which a message blocks subsequent messages is if it "goes into manual execution mode after the Smart Execution time window expires", which can only happen:
- If the issue was due to extreme gas spikes or network conditions, and CCIP was not able to successfully transfer the message despite gas bumping for the entire duration of the Smart Execution time window
is factually incorrect and misleading.
In Chainlink docs, under CCIP-Execution #6.2:
This means that a CCIP call can become eligible for the manual execution -> smart contract window expiry flow by either having a too low gas limit, which can be fixed by upping the gas, or by unhandled exceptions.
Lastly, what you are quoting is when a call becomes eligible for manual execution. In manual execution mode, you are allowed to bump the gas. Bumping the gas here will not make the transaction succeed, which will result in a DoS
and to re-iterate, if it's the case of an unhandled exception, like described in the report:
New deployment, not bumping the gas fees.
To summarize:
From the above argument: suppose one of the checks fails, for example the highlighted 10% deviation. Since the protocol updates the price feeds on L2 every 6 hours, a price deviation recorded on L2 larger than that means the feed is corrupted. The protocol would then want to halt the deposit services on L2 until a new receiver is deployed or the price is handled manually by the owner through updatePriceByOwner.
Thanks for the input of everyone and thanks for the acknowledgement of the Sponsor. This aligns with our report - price deviation is one of the failing paths that could lead to DoS -> redeployment.
We ran some tests yesterday using CCIP, sending messages from Avalanche Fuji <-> Sepolia. We tested some CCIP calls, waited for 8 hours and observed what happened.
Hats off to the CL team because integrating CCIP on the testnet was very straightforward.
This is what we have learned:
The message could not be executed on the destination chain within CCIP’s Smart Execution time window, which is currently set to 8 hours. This could happen, for example, during extreme network congestion and resulting gas spikes.
The following claim by the Warden:
an inherent risk of CCIP which will affect all protocols integrating it equally
is wrong. Projects integrating CCIP are urged to handle reverts gracefully as it could lead to a redeployment of the contracts, as per Manual Execution.
The following claim by the Warden:
unrelated to the unhandled reverts
is wrong. The whole report is based upon the fact that a message call that is not executed during the Smart Execution Window could never be executed, even after the Smart Execution Window, because it will always revert. For example, this is the case when the price deviates by more than 10%. Big price deviations have happened before.
With these new tests run and the extra information gathered, this path could lead to a DoS, which automatically leads to losses for the users of the project. Note that any path that can lead to a reverting call on the L2 will result in the same DoS; this is just one of the many examples:
1
0.8
This could happen, for example, during extreme network congestion and resulting gas spikes.
However, suppose a message goes into manual execution mode after the Smart Execution time window expires (currently 8 hours).
To summarise:
With all this in mind, the many failing paths that can happen and with the funds that can be lost by the users, we feel like Medium is appropriate here, especially since this only needs to happen once to have a high impact on the project.
To respect everyone's and our time, we will leave the rest to the Judge, unless asked for more clarification.
Thanks!
We agree, but I think you're missing two important points, which are:
unblocking the queue requires the processing (not necessarily the success) of the blocking message.
Redeployment is only provided in the docs as a path to successful message execution, but it is not a requirement for a functioning bridge. You can see the language employed in the docs is rather lax in this regard:
As a best practice, implement fallback mechanisms in the CCIP receiver contracts to manage unhandled exceptions gracefully.
For reference:
I haven't been able to get a more complete answer than this, and their answer seems to imply that manual execution automatically blocks the queue, which I believe is not the case as per the docs, but this is irrelevant since we know the message does not need to succeed in order to unblock the path.
So redeployment is never necessary, and the finding's claim that this can lead to DoS is incorrect.
Hi,
You omitted these:
As explained in our latest comment, if a:
The docs state this, and so do the comments in one of the helper libraries in the integration.
We do agree, however, that the likelihood is lower, since our tests showed that an unhandled exception does not automatically block the queue. But as the docs, the CL team and the code describe, the queue can still be blocked, and given the many failure paths this would lead at worst to stale data, where a couple of minutes can be detrimental during price volatility. Even if an admin tried to make a call manually, any adversary could just front-run it. In practice, projects will not be prepared for this scenario since they rely on CCIP working.
You're arguing that the likelihood is low, and we accept that since it does not change anything for the report. You are also arguing that the CL team, docs and code comments are all incorrect and that the message queue being blocked is not possible, which we don't accept, and neither do the CL team, docs or code comments. We are happy to be proven wrong though.
Thanks!
Ref:
unblocking the queue requires the processing (not necessarily the success) of the blocking message.
alcueca marked the issue as satisfactory
I'm going to accept this as Medium, quite borderline. Burden of proof falls on the reporting warden, who has made a considerable effort taking into account the lack of facilities provided with the code. While not conclusively proven, the risk of malfunctioning is high enough with the evidence presented. The sponsor is recommended to apply a fix.
alcueca marked the issue as selected for report
Lines of code
https://github.com/code-423n4/2024-04-renzo/blob/519e518f2d8dec9acf6482b84a181e403070d22d/contracts/Bridge/L2/PriceFeed/CCIPReceiver.sol#L66
Vulnerability details
Description
Renzo uses CCIP to send price updates from L1 -> L2 using:
The CCIP architecture allows for simultaneous CCIP calls to be made from one address. However, there is a caveat. As per Chainlink Docs, under the second header:
If a message fails, it can be manually executed, but after the Smart Execution time window expires, which is 8 hours at the moment, all subsequent messages will fail until the failing message succeeds. The problem is that on the L2 side of things, in CCIPReceiver.sol, the reverts are not handled gracefully. This can easily lead to a CCIP call that can never be executed, thus resulting in a Denial of Service. These are the failure points during the _ccipReceive() call, made by an honest party, that can lead to a CCIP call not being deliverable; they are marked by an arrow inside _updatePrice:
For example, if the price deviates by more than 10%, the delivery will fail. If this delivery keeps failing, even after the 8 hour Smart Execution window has elapsed, the CCIP pathway will be DoS'ed until this CCIP call is successfully delivered.
The impact is broad - price data will be corrupted, minting prices will not be calculated correctly and updates from L1 will be DoS'ed.
Tools Used
Manual Review
Recommended Mitigation Steps
As per the Chainlink docs linked in the report:
Handle the reverts gracefully.
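One possible shape for this, as a minimal sketch only: the receive entry point wraps the price update in a try/catch on a self-call, so a failing check is recorded instead of bubbling up as a revert. The contract name, receiveMessage, processMessage, failedMessages and the events are all hypothetical and not taken from the Renzo codebase; only the InvalidOraclePrice error and the 10% check mirror what is quoted in this thread. Router and sender authentication, which a real CCIP receiver needs, is deliberately left out.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.19;

/// Hypothetical sketch: every name here is an assumption made for
/// illustration, except InvalidOraclePrice and the 10% divergence check.
contract GracefulPriceReceiverSketch {
    error InvalidOraclePrice();
    error OnlySelf();

    uint256 public lastPrice;

    // Payloads that failed processing, kept so they can be inspected later.
    mapping(bytes32 => bytes) public failedMessages;

    event PriceUpdated(uint256 price);
    event MessageFailed(bytes32 indexed messageId, bytes reason);

    constructor(uint256 initialPrice) {
        lastPrice = initialPrice;
    }

    /// Stand-in for _ccipReceive(). A real receiver would also check the
    /// router, source chain and sender here (omitted in this sketch).
    function receiveMessage(bytes32 messageId, bytes calldata data) external {
        try this.processMessage(data) {
            // Price accepted, nothing else to do.
        } catch (bytes memory reason) {
            // Record the failure instead of reverting the whole delivery.
            failedMessages[messageId] = data;
            emit MessageFailed(messageId, reason);
        }
    }

    /// External only so it can be wrapped in try/catch; callable only by
    /// this contract itself.
    function processMessage(bytes calldata data) external {
        if (msg.sender != address(this)) revert OnlySelf();
        uint256 price = abi.decode(data, (uint256));

        // Same 10% divergence check as quoted in the discussion.
        if (
            (price > lastPrice && (price - lastPrice) > (lastPrice / 10)) ||
            (price < lastPrice && (lastPrice - price) > (lastPrice / 10))
        ) {
            revert InvalidOraclePrice();
        }

        lastPrice = price;
        emit PriceUpdated(price);
    }
}
```

With this shape, a rejected price leaves lastPrice unchanged and emits MessageFailed, so the delivery itself completes. The trade-off is that the stale price then has to be dealt with explicitly, for example by pausing deposits or by an owner update, which remains a decision for the project.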
Assessed type
Timing