sherlock-audit / 2024-08-winnables-raffles-judging

0x73696d616f - `CCIPClient` `whenHealthy` modifier will lead to stuck `ETH` due to DoSing claim and cancel #236

Closed sherlock-admin3 closed 3 weeks ago

sherlock-admin3 commented 2 months ago

0x73696d616f

Medium

CCIPClient whenHealthy modifier will lead to stuck ETH due to DoSing claim and cancel

Summary

CCIPClient has a whenHealthy modifier in the ccipSend() function, which means it can DoS _sendCCIPMessage() calls in WinnablesTicketManager. This would be particularly harmful in several scenarios:

  1. In case a raffle does not meet the minimum tickets threshold, it must be canceled. However, cancelling sets the status to CANCELED and allows users to claim refunds, but also sends a message to WinnablesPrizeManager to allow the admins to get their funds back. If the router is not healthy, it will revert. This procedure should be performed in two steps so that users can get their refunds right away, as they don't need to wait for the CCIP router to work.
  2. Users buy tickets but the router is DoSed and WinnablesTicketManager::propagateRaffleWinner() reverts when calling _sendCCIPMessage(). This means that the protocol can never claim its ETH although the cross chain message was not required to be successful. A two step procedure would also fix this.

Scenario 1 breaks the specification in the readme

Participants in a raffle that got cancelled can always get refunded

Root Cause

The Chainlink Router has the whenHealthy modifier on ccipSend(), which is called in _sendCCIPMessage(). When the router is not healthy, the modifier makes the call revert, as can be seen in the code linked above in lines 293-296.

WinnablesTicketManager does not handle the case where the whenHealthy modifier reverts.
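
The coupling described above can be illustrated with a minimal, self-contained sketch. Everything here is an assumption made for illustration: MockRouter, its healthy flag, the simplified ccipSend signature, and TicketManagerSketch are hypothetical stand-ins and not the Chainlink or Winnables code. The sketch only shows that when the status write and the cross-chain send share one transaction, a revert in the router also undoes the status write.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Illustrative stand-in for the CCIP Router. Not Chainlink's actual code.
contract MockRouter {
    error RouterNotHealthy();
    event MessageSent(uint64 chainSelector, bytes data);

    bool public healthy = true;

    modifier whenHealthy() {
        if (!healthy) revert RouterNotHealthy();
        _;
    }

    /// Helper for experiments: simulate the router being paused.
    function setHealthy(bool value) external {
        healthy = value;
    }

    /// Hypothetical simplified signature; the real ccipSend takes a structured message.
    function ccipSend(uint64 chainSelector, bytes calldata data)
        external
        whenHealthy
        returns (bytes32)
    {
        emit MessageSent(chainSelector, data);
        return keccak256(data);
    }
}

/// Illustrative stand-in for the ticket manager's cancel path.
contract TicketManagerSketch {
    enum RaffleStatus { NONE, IDLE, CANCELED }

    MockRouter public immutable router;
    mapping(uint256 => RaffleStatus) public status;

    constructor(MockRouter router_) {
        router = router_;
    }

    function createRaffle(uint256 raffleId) external {
        status[raffleId] = RaffleStatus.IDLE;
    }

    /// Status change and cross-chain send happen in the same transaction.
    function cancelRaffle(uint64 chainSelector, uint256 raffleId) external {
        status[raffleId] = RaffleStatus.CANCELED;
        // If the router is unhealthy, this call reverts. The revert bubbles up
        // and rolls back the status write above, so the raffle stays IDLE.
        router.ccipSend(chainSelector, abi.encodePacked(raffleId));
    }

    /// Refunds are only available once the raffle is CANCELED.
    function canClaimRefund(uint256 raffleId) external view returns (bool) {
        return status[raffleId] == RaffleStatus.CANCELED;
    }
}
```

In this sketch, calling setHealthy(false) on the router and then cancelRaffle() reverts, and canClaimRefund() keeps returning false until the router is healthy again and cancelRaffle() succeeds.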

Internal pre-conditions

None.

External pre-conditions

Chainlink pauses the Router.

Attack Path

Two examples are given:

A

  1. Users participate by calling WinnablesTicketManager::buyTickets().
  2. Not enough tickets were bought so the raffle should be canceled, but Chainlink DoSes the router.
  3. WinnablesTicketManager::cancelRaffle() calls the Chainlink router to send a message, but it reverts due to the modifier. Users can not get their refunds back until the Chainlink router is back up.

B

  1. Users participate by calling WinnablesTicketManager::buyTickets().
  2. Chainlink DoSes the router after the raffle ends, DoSing WinnablesTicketManager::propagateRaffleWinner().
  3. The protocol can not claim the locked ETH due to point 2 even though the cross chain message was not required.

Impact

In scenario A, users can not claim their refunds until the router is back up. In B, the protocol can not claim the ETH back even though it could be safely retrieved.

PoC

Check the mentioned Chainlink router links and the fact that the code never checks if the router is not healthy before calling _sendCCIPMessage().

Mitigation

The WinnablesTicketManager::cancelRaffle() and WinnablesTicketManager::propagateRaffleWinner() functions should be split into 2 separate steps, to always make sure users or the protocol can get their funds.
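
For the propagateRaffleWinner() side, one possible shape of such a split is sketched below. This is only a sketch under stated assumptions: IRouterSketch, TwoStepWinnerSketch, settleRaffleWinner(), the WINNER_SETTLED status, and the storage layout are hypothetical names introduced for illustration and are not taken from the audited contracts. Raffle creation and winner selection are omitted.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical minimal router interface, for this sketch only.
interface IRouterSketch {
    function ccipSend(uint64 chainSelector, bytes calldata data) external returns (bytes32);
}

contract TwoStepWinnerSketch {
    error InvalidRaffleStatus();

    // WINNER_SETTLED is an assumed intermediate status, not part of the audited code.
    enum RaffleStatus { NONE, FULFILLED, WINNER_SETTLED, PROPAGATED }

    struct Raffle {
        RaffleStatus status;
        uint256 totalRaised;
        address winner;
    }

    IRouterSketch public immutable router;
    uint256 public lockedETH;
    mapping(uint256 => Raffle) internal _raffles;

    constructor(IRouterSketch router_) {
        router = router_;
    }

    /// Step 1: local bookkeeping only. No router call, so it cannot be
    /// blocked by an unhealthy router.
    function settleRaffleWinner(uint256 raffleId) external {
        Raffle storage raffle = _raffles[raffleId];
        if (raffle.status != RaffleStatus.FULFILLED) revert InvalidRaffleStatus();
        raffle.status = RaffleStatus.WINNER_SETTLED;
        lockedETH -= raffle.totalRaised;
    }

    /// Step 2: cross-chain notification. If the router reverts, the whole call
    /// reverts, the status stays WINNER_SETTLED, and the call can be retried.
    function propagateRaffleWinner(uint64 chainSelector, uint256 raffleId) external {
        Raffle storage raffle = _raffles[raffleId];
        if (raffle.status != RaffleStatus.WINNER_SETTLED) revert InvalidRaffleStatus();
        raffle.status = RaffleStatus.PROPAGATED;
        router.ccipSend(chainSelector, abi.encodePacked(raffleId, raffle.winner));
    }
}
```

Whether releasing the ETH before the winner message has been delivered is acceptable is a design decision for the protocol; the sketch only shows how the local step can be decoupled from the router call.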

imp0wd3r commented 1 month ago

Escalate

https://docs.chain.link/ccip/concepts#offchain-risk-management-node CCIP will only be suspended in the following two situations:

There are two cases where Risk Management nodes pause CCIP:

Finality violation: A deep reorganization which violates the safety parameters set by the Risk Management configuration occurs on a CCIP chain.

Execution safety violation: A message is executed on the destination chain without any matching transaction being on the source chain. Double executions fall into this category since the executing DON can only execute a message once.

The likelihood of both situations occurring is very low, and if they do happen, it might be a chain issue, such as a reorg, making operations on the chain unsafe for users.

Additionally, there is no reason for Chainlink to DoS the Router, so I consider it at most to be Low.

As for another dup report https://github.com/sherlock-audit/2024-08-winnables-raffles-judging/issues/302 , it is also incorrect. CCIP can manually execute failed messages https://docs.chain.link/ccip/tutorials/manual-execution . The probability of CCIP failure is very low, and even if it fails, the request can be sent manually. If the issue is due to the receiver contract, the reason should be pointed out, rather than assuming there will be a problem.

sherlock-admin3 commented 1 month ago

You've created a valid escalation!

To remove the escalation from consideration: Delete your comment.

You may delete or edit your escalation comment anytime before the 48-hour escalation window closes. After that, the escalation becomes final.

0xsimao commented 1 month ago

It breaks one of the main invariants of the protocol, so medium is appropriate given the likelihood.

Participants in a raffle that got cancelled can always get refunded

Brivan-26 commented 1 month ago

It breaks one of the main invariants of the protocol, so medium is appropriate given the likelihood.

Participants in a raffle that got cancelled can always get refunded

The likelihood is very low. @imp0wd3r provided good reasons for that.

I want to add that even if the router is DoSed, all transactions will revert on the source chain, so when the router goes live again, those transactions can be re-initiated to finalize the process.

Oblivionis214 commented 1 month ago

There are two cases where Risk Management nodes pause CCIP:

Finality violation: A deep reorganization which violates the safety parameters set by the Risk Management configuration occurs on a CCIP chain.

Execution safety violation: A message is executed on the destination chain without any matching transaction being on the source chain. Double executions fall into this category since the executing DON can only execute a message once.

Scenario 1: Chain reorg -> invalid per Sherlock rules: Chain re-org and network liveness related issues are not considered valid.

Scenario 2: CCIP gets cracked -> there is nothing we can do about it; this is not a contract issue in the current scope.

Waydou21 commented 1 month ago

There are two cases where Risk Management nodes pause CCIP: Finality violation: A deep reorganization which violates the safety parameters set by the Risk Management configuration occurs on a CCIP chain. Execution safety violation: A message is executed on the destination chain without any matching transaction being on the source chain. Double executions fall into this category since the executing DON can only execute a message once.

Scenario 1: Chain reorg -> invalid per sherlock rules: Chain re-org and network liveness related issues are not considered valid. Scenario 2: CCIP get cracked -> there is nothing we can do with it, this is not a contract issue in current scope.

If chain congestion, for example, prevents people from getting their funds, it's a valid issue: the protocol should have a failsafe for such a case, as the impact is high.

DemoreXTess commented 1 month ago

@Brivan-26 @0xsimao @Oblivionis214 @Waydou21

First of all, we can't identify issues based on their likelihood using the Sherlock rules. The only thing we need to focus on is the impact of the issue.

In terms of impact, if we encounter a situation where the whenHealthy modifier causes the contract to get stuck, both functions will revert. As a result, there will be no changes to storage variables, meaning there is no risk of fund loss or locked ETH.

We could categorize this as a Denial of Service (DoS) issue, but according to the Sherlock rules, DoS issues are only acknowledged in the following two situations:

1) The issue causes locking of funds for users for more than a week.

2) The issue impacts the availability of time-sensitive functions (cutoff functions are not considered time-sensitive).

If at least one of these are describing the case, the issue can be a Medium. If both apply, the issue can be considered of High severity. Additional constraints related to the issue may decrease its severity accordingly. Griefing for gas (frontrunning a transaction to fail, even if can be done perpetually) is considered a DoS of a single block, hence only if the function is clearly time-sensitive, it can be a Medium severity issue.

For the first criterion, there is no guarantee that users' funds would be locked for more than a week. Historically, the health state has only changed to "cursed" twice, and in both cases, the issue was resolved in less than a week. You can check this from here.

For the second criterion, neither of the functions is time-sensitive since there is no deadline for canceling or propagating the winner.

Additionally, it does not violate the main invariant:

Participants in a raffle that got cancelled can always get refunded

There are no locked tokens or ETH, and in the event this health scenario occurs, all users will eventually be able to claim their tokens.

In conclusion, I agree with the escalation and believe that this is not a valid issue based on the Sherlock rules.

0xsimao commented 1 month ago

It does violate the invariant temporarily (in fact, possibly forever, depending on Chainlink's resolution), hence valid. Can users always get their refund? No, in case Chainlink is cursed.

kuprumxyz commented 1 month ago

Was reading this out of curiosity...

It does violate the invariant temporarily (in fact, possibly forever, depending on Chainlink's resolution), hence valid. Can users always get their refund? No, in case Chainlink is cursed.

There is an exact quantifier on "temporarily" in the rules:

The issue causes locking of funds for users for more than a week.

As demonstrated by the comment above:

Historically, the health state has only changed to "cursed" twice, and in both cases, the issue was resolved in less than a week

0xsimao commented 1 month ago

The readme is above the rules so it is valid.

DemoreXTess commented 1 month ago

@0xsimao

We can't validate the submissions based on assumptions related to external protocols, such as:

It does violate the invariant temporarily (in fact, possibly forever, depending on Chainlink's resolution)

If we proceed this way, we must first consider the protocol itself. For example, if Chainlink CCIP causes the ARM state to remain in a cursed state indefinitely, the entire protocol would become unusable at that point.

This approach would mean that any protocol relying on external protocols has one definite, valid, and easy submission: "If the X protocol stops working, then this protocol stops working."

Additionally, the README does not validate the submission; it states:

Participants in a raffle that got cancelled can always get refunded

Technically, the raffle is not canceled because the cancelRaffle() function would revert in this situation. However, if it does not revert (meaning the ARM is not cursed), then users can always get refunded.

In conclusion, there is no contradiction.

kuprumxyz commented 1 month ago

The readme is above the rules so it is valid.

Ah, now I understand what you are talking about: this quote from the README:

Participants in a raffle that got cancelled can always get refunded

I believe the intended interpretation of "always" from the README is not temporal, i.e. "always" as "any time". I believe the intended interpretation of "always" is "100%", i.e. "If a raffle got cancelled, the participants will for sure get refunded". The temporal interpretation of "always" is not feasible anyway due to the reliance on too many external systems working always.

0xsimao commented 1 month ago

It is not an assumption, the modifier is there and it halts refunds, breaking the readme.

Technically, the raffle is not canceled because the cancelRaffle() function would revert in this situation.

The readme is broken because the statement assumes rounds can be canceled, which is not true due to this bug.

0xsimao commented 1 month ago

I believe the intended interpretation of "always" from the README is not temporal, i.e. "always" as "any time". I believe the intended interpretation of "always" is "100%", i.e. "If a raffle got cancelled, the participants will for sure get refunded". The temporal interpretation of "always" is not feasible anyway due to the reliance on too many external systems working always.

It goes without saying that refunds should always be available in terms of certainty. By including the statement in the readme the temporal nature of the word "always" is reinforced, namely that refunds should be available at any time. As such, the intention of the sponsor with the statement is finding issues that stop refunds from always being available, such as this one.

DemoreXTess commented 1 month ago

We can't claim the README is broken when validating a submission using the exact same statement found in the README. We acknowledged that the README takes precedence over Sherlock's rules, which is correct. Therefore, we should validate the submission based on the README since it doesn't conform to Sherlock's rules, and the README is the only source we can rely on. Additionally, there is no statement in the README saying, "Raffles can always be canceled by the admin if they haven't started.".

Mouradif commented 1 month ago

I mostly have a problem with

Mitigation

The WinnablesTicketManager::cancelRaffle() and WinnablesTicketManager::propagateRaffleWinner() functions should be split into 2 separate steps, to always make sure users or the protocol can get their funds.

How does that fix the problem? If CCIP is broken, Winnables is broken; there's nothing to do here.

DemoreXTess commented 1 month ago

@Mouradif

It's not a problem. It states to cancel the raffle and refund the tokens back to users, but not to send the CCIP message. Instead, you should call another function and this time send the CCIP message to the other chain.

DemoreXTess commented 1 month ago

However, if the CCIP stops working forever, the admin won't be able to withdraw the raffle prize from the Prize Manager contract. This introduces another issue. To ensure security, a trusted emergency admin might be needed for the Prize Manager, but this also leads to a high centralization problem, as the admin could potentially manipulate the raffles.

Mouradif commented 1 month ago

@DemoreXTess Honestly I'd rather be vulnerable to CCIP stopping to work completely than to a malicious admin.

0xsimao commented 1 month ago

I can explain the mitigation in more detail. The CCIP message on cancelRaffle() can be sent later, so users can always get their refunds, as per the readme. So something like the following (might not be exactly like it but you get the point):

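// Step 1: mark the raffle as canceled so refunds open immediately. No CCIP call here.
// prizeManager and chainSelector are unused in this step.
function cancelRaffle(address prizeManager, uint64 chainSelector, uint256 raffleId) external {
    _checkShouldCancel(raffleId);
    _raffles[raffleId].status = RaffleStatus.CANCELED;
    IWinnablesTicket(TICKETS_CONTRACT).refreshMetadata(raffleId);
}

// Step 2: notify the prize manager separately. If the router reverts, this can be retried later.
// CANCELED_FORWARDED is a new status introduced by this sketch.
function propagateCancelledRaffle(address prizeManager, uint64 chainSelector, uint256 raffleId) external {
    if (_raffles[raffleId].status != RaffleStatus.CANCELED) revert InvalidRaffleStatus();
    _raffles[raffleId].status = RaffleStatus.CANCELED_FORWARDED;
    _sendCCIPMessage(
        prizeManager,
        chainSelector,
        abi.encodePacked(uint8(CCIPMessageType.RAFFLE_CANCELED), raffleId)
    );
}

Mouradif commented 1 month ago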

Got it. So if a problem happens with the CCIP Router and a raffle would be canceled, then it sucks for the prize remaining locked forever in the prize manager but at least we can refund the players. Yeah that works I guess. If the raffle is not canceled however, what would be the fix?

Brivan-26 commented 1 month ago

I'm having issues with this @Mouradif

Got it. So if a problem happens with the CCIP Router and a raffle would be canceled, then it sucks for the prize remaining locked forever in the prize manager

Even if the CCIP router is DoSed, can't we just cancel the raffle after it is live again? How is it locked forever?

Mouradif commented 1 month ago

I'm having issues with this @Mouradif

Got it. So if a problem happens with the CCIP Router and a raffle would be canceled, then it sucks for the prize remaining locked forever in the prize manager

Even if the CCIP router is DoSed, can't we just cancel the raffle after it is live again? How is it locked forever?

What @DemoreXTess means is if CCIP stopped working for good. If it's just temporarily down then we have no problem

Brivan-26 commented 1 month ago

The "if CCIP stopped working for good" pre-condition is an invalid assumption per Sherlock rules. Otherwise, we could report similar issues against any protocol integrating CCIP.

I won't add any comments on this issue. The escalation comment said it all.

DemoreXTess commented 1 month ago

@0xsimao @Mouradif

It looks like in the end, we need to determine whether this issue is valid based on the word "always" in the README:

Participants in a raffle that got cancelled can always get refunded

But is it truly refundable for everyone, including the admin, as the term "always" implies?

Mouradif commented 1 month ago

@DemoreXTess if we want to stick with wording then here's a wild take: if CCIP is down, the raffle can't be canceled. Only participants of raffles that actually get canceled need to get refunded for the README statement to hold.

DemoreXTess commented 1 month ago

@Mouradif

Exactly. By the way, I don’t want to escalate this issue based solely on the word "always," but in the end, this submission relies entirely on that word for its validity. In the end, the user will eventually receive their refunds.

CarlosAlegreUr commented 1 month ago

Escalate

https://docs.chain.link/ccip/concepts#offchain-risk-management-node CCIP will only be suspended in the following two situations:

There are two cases where Risk Management nodes pause CCIP: Finality violation: A deep reorganization which violates the safety parameters set by the Risk Management configuration occurs on a CCIP chain. Execution safety violation: A message is executed on the destination chain without any matching transaction being on the source chain. Double executions fall into this category since the executing DON can only execute a message once.

The likelihood of both situations occurring is very low, and if they do happen, it might be a chain issue, such as a reorg, making operations on the chain unsafe for users.

Additionally, there is no reason for Chainlink to DoS the Router, so I consider it at most to be Low.

As for another dup report #302 , it is also incorrect. CCIP can manually execute failed messages https://docs.chain.link/ccip/tutorials/manual-execution . The probability of CCIP failure is very low, and even if it fails, the request can be sent manually. If the issue is due to the receiver contract, the reason should be pointed out, rather than assuming there will be a problem.

I agree that this is not a valid finding, especially because of the manual execution feature CCIP has in case a chain is cursed, which is what is checked in the whenHealthy modifier.

If a chain is cursed and your message has been reverted and not sent within 8 hours, manual execution can be carried out by the sender of the cross-chain tx to actually execute the txs. So at most the negative impact on the protocol can last 8h, which is clearly less than the minimum 7-day DoS required by Sherlock and not overridden in this contest: see here.

This issue is invalid for all the reasons mentioned.

However I do think the protocol does not handle CCIP properly for more reasons, mentioned in my invalidated issue #222 . I would encourage people to read it.

Oblivionis214 commented 1 month ago

Participants in a raffle that got cancelled can always get refunded

This statement should not override sherlock rules about invalid issues.

This issue mentions CCIP getting paused. What if Ethereum gets paused? What if Avalanche gets paused? What if an NFT/ERC20 owner griefs their users? There is nothing we can do about these. I believe we should not play word games here. Please stick to the facts. A chain reorg that pauses CCIP is not a concern per Sherlock rules.

mystery0x commented 1 month ago

Should be a valid medium as checking CCIP Router's whenHealthy before sending messages is essential.

DemoreXTess commented 1 month ago

@mystery0x

Could we get additional information on why it is considered a valid medium? I also don't understand what impact checking the whenHealthy state has on security. If we do not check it, it does not appear to change anything in the contract. Nothing is locked, and nothing becomes unreachable.

Brivan-26 commented 1 month ago

@mystery0x Excuse me but I was expecting a more thorough explanation regarding why this issue may be considered Medium severity, not one general statement.

Ultimately, you can do nothing about the Chainlink router experiencing a DoS, which has a very low probability of occurring. The only potential impact is the loss of a small amount of LINK tokens during the initiation of a cross-chain message when the router is DoSed. However, the financial loss in terms of LINK tokens is negligible, given the low likelihood of this situation.

@WangSecurity We need your input here

WangSecurity commented 1 month ago

I have one question about the ability to manually send CCIP messages. As I understand from the escalation comment and this comment, it allows you to call cancelRaffle and propagateRaffleWinner, so in the first case users are able to get the refunds and in the second the winner is able to get the prize, correct?

Note: I see there are other concerns regarding the README statement, I will answer them, once we confirm the above statement

Brivan-26 commented 1 month ago

I have one question about the ability to manually send CCIP messages. As I understand from the escalation comment and this comment, it allows you to call cancelRaffle and propagateRaffleWinner, so in the first case users are able to get the refunds and in the second the winner is able to get the prize, correct?

Note: I see there are other concerns regarding the README statement, I will answer them, once we confirm the above statement

Yes, @WangSecurity Check chainlink docs. Also, curious to hear your thoughts about this comment

0xsimao commented 1 month ago

@WangSecurity when the chain is cursed, it's never possible to cancel rounds and send refunds, you can check the links in this issue's body. The modifier makes the cancel function revert, which never changes the status of the round to canceled, which means refunds can not be issued.

If you take a look at the suggested mitigation here, it completely eliminates the dependence on Chainlink to cancel rounds, that is, even if Chainlink reverts, refunds can always be issued, just like the readme says.

Brivan-26 commented 1 month ago

@0xsimao @WangSecurity If any of you can reply to this comment, I will stop escalating this issue. I said it all there

0xsimao commented 1 month ago

@Brivan-26 I did reply to your comment, it can be fixed by splitting the function into 2 parts, as mentioned here.

0xsimao commented 1 month ago

Also, manual execution has nothing to do with the chain being cursed. When the chain is cursed, the message is not even relayed in the first place; it just reverts, so there is nothing to retry yet.

Brivan-26 commented 1 month ago

@Brivan-26 I did reply to your comment, it can be fixed by splitting the function into 2 parts, as mentioned here.

I did check your reply there, it suggests a fix to make the codebase perfect (6th point mentioned in the comment); being able to cancel raffles even if Chainlink router is DoSed. However it does not address the first 5 points mentioned here

0xsimao commented 1 month ago

The likelihood of this scenario occurring is extremely low, and if it were to happen, it would likely be due to a significant chain issue such as a reorg. In such cases, operations across the chain would become unsafe for users, which reasonably justifies why cross-chain messages should be restricted.

It is a constraint in Sherlock's judging system and that's all. It means the issue may be downgraded from high to medium. The second statement is false because refunds should always be available and can be achieved with the mentioned fix.

You mentioned:

checking CCIP Router's whenHealthy before sending messages is essential.

Even if the whenHealthy status is checked, and assuming the router is under a DoS attack, the optimal approach would still be to refrain from sending cross-chain messages to the destination chain. This is effectively the same outcome as not checking the whenHealthy status, as the messages would revert on the source chain regardless.

Again this is not true because refunds should always be available and it is possible to achieve this.

What is the actual likelihood of the Chainlink router being affected by a DoS attack? Could you provide any historical examples of such incidents?

It is documented as a possibility in the Chainlink documentation and should be taken into account as a constraint.

As per Sherlock's rules, a DoS event must persist for 7 days to be considered valid. Is there any precedent for the Chainlink router being DoS-ed for that duration?

It breaks the readme, which is more important than the 7 days ruling. And the 7 days may still be achieved anyway, unless anyone is able to predict the future. Nowhere does it say in the documentation that a cursed chain would be fixed in less than 1 week.

Even if the Chainlink router does experience a DoS attack, the transactions can be re-executed once the router is operational again, allowing the raffles to be finalized. In this case, what would the actual loss be? The winner would still be able to claim their prize on the destination chain.

Again, it breaks the readme or even the 7 days rule.

Brivan-26 commented 1 month ago

It breaks the readme, which is more important than the 7 days ruling. And the 7 days may still be achieved anyway, unless anyone is able to predict the future

Excuse me, but you can't prove that Chainlink router will be DoSed for 7 days, it never happened and the chance for this to happen is extremely low.

It breaks the readme

Aight, I feel we are playing a lot with the contest words, let's see what the README states:

Participants in a raffle that got cancelled can always get refunded

The readme is clearly saying that for a CANCELED raffle, participants should get refunded. If the Chainlink router is DoSed, cancelling the raffle is not even possible (it will revert on the source chain), so the raffle is not cancelled and thus there is no need for participants to get refunded yet. I see no need to use that statement from the README here.

0xsimao commented 1 month ago

Excuse me, but you can't prove that Chainlink router will be DoSed for 7 days, it never happened and the chance for this to happen is extremely low.

The proof is that the Chainlink documentation does not say that the cursed state is fixed in less than 1 week. The fact that it has never happened does not decrease the relevance of the issue. For example, I have never been robbed at home, but I always still close the door.

The readme is clearly saying that for a CANCELED raffle, participants should get refunded. If the Chainlink router is DoSed, cancelling the raffle is not even possible (it will revert on the source chain), so the raffle is not cancelled and thus the participants can't get refunded. I see no need to use that statement from the README here.

The readme assumes canceling can happen, which is not possible due to this issue. So it is written under an assumption that would be broken, which means the statement itself is also broken. Not enough tickets were bought so the raffle must be canceled, but it is not possible due to the bug.

Brivan-26 commented 1 month ago

@0xsimao You are reformulating the contest README now:

So it is written under an assumption that would be broken, which means the statement itself is also broken

The readme does not state that, it is saying for canceled raffle, funds should be refunded.

What I'm seeing here is a design improvement. There is no loss of funds because all actions (whether canceling the raffle or propagating the prize) can still happen after the router is live again (and we are even supposing here that the Chainlink router will be DoSed, and for 7 days). So, we apply the following Sherlock rule:

User experience and design improvement issues: Issues that cause minor inconvenience to users where there is no material loss of funds are not considered valid.

0xsimao commented 1 month ago

@0xsimao You are reformulating the contest README now:

So it is written under an assumption that would be broken, which means the statement itself is also broken

The readme does not state that, it is saying for canceled raffle, funds should be refunded.

No I am not. The readme says

Participants in a raffle that got cancelled can always get refunded

It implies that rounds can be canceled. If they can not even be canceled, it's even worse.

What I'm seeing here is a design improvement, there is no loss of funds because all actions (no matter whether canceling the raffle or propagating prize) can still happen after the router is live(we are supposing here the chainlink router will be even DoSed and also for 7 days). So, we apply the following Sherlock rule:

With the fix, the readme is no longer broken and the protocol is no longer DoSed, so this is not a design improvement.

Brivan-26 commented 1 month ago

We can go on forever on this one, but repeating the same thing is pointless for me. I understand this issue has only one duplicate and you are doing your best to validate it. I'm wondering how this is the first report of such an issue for a very well-known and widely used protocol like CCIP and an often-used function like _ccipSend.

I said it all in the previous comments, the escalation comments also said it all. I will not waste my energy here anymore. @WangSecurity I hope you will consider all the comments from this chat

0xsimao commented 1 month ago

To clarify the "always" interpretation of the readme: it's obvious the temporal aspect is relevant here. Can anyone here read the readme and say with a straight face that, if the refunds aren't issued within 1 month, the protocol would not consider it an issue? What about 1 week? Or 1 day? The exact time was not specified, but we know the protocol very much cares about it. As such, they consider it an issue if the refund is not immediately available. The longer it is not available, the bigger the problem is for them. As the temporal question puts the issue in scope, and the amount of time the Chainlink router may be DoSed is unlimited, the issue is valid and addresses the protocol's concerns.

Oblivionis214 commented 1 month ago

This issue is talking about something like "Here is a crosschain protocol, if the bridge is broken, the whole system is DoSed and users' funds will be locked temporarily". Please note the root cause is not in the current code; it is caused by Chainlink RMN. There is no fix if the protocol wants to use CCIP.

the amount of time the Chainlink router may be DoSed is unlimited

The main concern is, it is proven above that CCIP pause == a long chain reorg. Which chain can reorg for a month?

And the statement:

Participants in a raffle that got cancelled can always get refunded

  1. Actually, the current protocol does not violate the statement. This statement does not say users must immediately get refunded. Users can always get refunded after the pause. Chainlink will return the tokens after some delay.
  2. I don't know why LSW is trying to assume Chainlink will not refund properly -- it has nothing to do with this contest.

0xsimao commented 1 month ago

@Oblivionis214 the fix was mentioned more than once, feel free to take a look above.

As per the readme, you are limiting the meaning of always to certainty, when there is no reason to do so. It has temporal value.

And users can never get refunded before the curse ends.

neko-nyaa commented 1 month ago

The following rule should apply:

Chain re-org and network liveness related issues are not considered valid. Exception: If an issue concerns any kind of a network admin (e.g. a sequencer), can be remedied by a smart contract modification, the protocol team considers external admins restricted and the considered network was explicitly mentioned in the contest README, it may be a valid medium. It should be assumed that any such network issues will be resolved within 7 days, if that may be possible.

This issue is exactly about network liveness. Because it was not clear whether the Chainlink admin is considered trusted, we consider whether the DoS impact is enough for a medium severity.

Assuming an emergency curse actually does happen, per the rules, any network issues have to be assumed to be resolved within 7 days, i.e. < 7 days, whereas per the README, the protocol functionality denial has to be at least 7 days, i.e. >= 7 days.

Then the following rule on DoS applies:

Could Denial-of-Service (DOS), griefing, or locking of contracts count as a Medium (or High) issue? DoS has two separate scores on which it can become an issue:

  1. The issue causes locking of funds for users for more than a week.

  2. The issue impacts the availability of time-sensitive functions (cutoff functions are not considered time-sensitive).

If at least one of these are describing the case, the issue can be a Medium. If both apply, the issue can be considered of High severity. Additional constraints related to the issue may decrease its severity accordingly.

The first point does not apply because the outage is assumed to last less than a week, and the protocol chose not to override that value. The second point does not apply, as withdrawing tokens cannot be considered time-sensitive, as mentioned in this comment.
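
To make the reading above concrete, here is a minimal Python sketch of the two-criteria DoS rule as quoted. The names `dos_severity`, `Severity` and `ONE_WEEK_DAYS` are hypothetical, and the rule only says an issue "can be" Medium or High, so this is the most the rule allows under this reading, not a verdict.

```python
from enum import Enum


class Severity(Enum):
    INVALID = "invalid"
    MEDIUM = "medium"
    HIGH = "high"


# "More than a week" threshold from criterion 1 of the quoted rule.
ONE_WEEK_DAYS = 7


def dos_severity(locked_days: float, time_sensitive: bool) -> Severity:
    """Upper bound on severity under the quoted DoS rule.

    locked_days: how long user funds are assumed to stay locked.
    time_sensitive: whether the blocked function is time-sensitive.
    """
    locks_over_a_week = locked_days > ONE_WEEK_DAYS
    criteria_met = int(locks_over_a_week) + int(time_sensitive)
    if criteria_met == 2:
        return Severity.HIGH
    if criteria_met == 1:
        return Severity.MEDIUM
    return Severity.INVALID


# The reading in this comment: outage assumed resolved in under 7 days,
# and refunds treated as not time-sensitive.
print(dos_severity(locked_days=6.9, time_sensitive=False))
```

The disagreement in this thread is about the inputs (how long the router can stay cursed, and whether the readme statement overrides the rule), not about the rule's structure.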

0xsimao commented 1 month ago

Chainlink may also be cursed due to an execution safety violation. Also, the comment you linked has nothing to do with this issue (not even the same function). And finally, the admin expects refunds to be always possible, which is not the case due to Chainlink being cursed. A summary of the issue is:

  1. Chainlink may become cursed for 2 reasons: either network issues or incorrect execution. It's impossible to predict how long it will be halted, but it is a real possibility. If Chainlink thought this would not happen, they would not have built this mechanism in the first place. So it is silly to discard this possibility.
  2. The admins wrote that they want refunds to always be available. Due to the curse, refunds are not always available. This stands above admin trust rules or even network liveness rules or similar because the protocol intentionally placed it in the readme. As such, the issue is in scope.
  3. There is a fix solving this issue completely. This is not a matter of saying 'even if it is cursed, there is nothing we can do'. No, with this fix, the readme is never broken and users always get their refunds. Due to 1) being a possibility, 2) would be broken and 3) could fix it.
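
The two-step fix referred to in point 3 can be illustrated with a small Python model. This is a sketch under assumptions, not the contract code: `Router`, `OneStepManager`, `TwoStepManager`, `flush_pending` and the message string are hypothetical names, and the Solidity revert is modelled by raising an exception before any state is written.

```python
class RouterUnhealthy(Exception):
    """Stands in for the revert caused by the router's whenHealthy check."""


class Router:
    def __init__(self):
        self.healthy = True
        self.sent = []

    def ccip_send(self, message):
        if not self.healthy:
            raise RouterUnhealthy("router is cursed")
        self.sent.append(message)


class OneStepManager:
    """Cancel and cross-chain send happen in the same call."""

    def __init__(self, router):
        self.router = router
        self.status = "OPEN"

    def cancel_raffle(self):
        # A revert rolls back everything, so send first and write state after.
        self.router.ccip_send("RAFFLE_CANCELED")
        self.status = "CANCELED"

    def refund(self):
        if self.status != "CANCELED":
            raise RuntimeError("raffle not canceled")
        return "refunded"


class TwoStepManager(OneStepManager):
    """Cancel only records state; the send is a separate, retryable call."""

    def __init__(self, router):
        super().__init__(router)
        self.pending = []

    def cancel_raffle(self):
        self.status = "CANCELED"
        self.pending.append("RAFFLE_CANCELED")

    def flush_pending(self):
        # Fails while the router is unhealthy, but refunds are unaffected.
        while self.pending:
            self.router.ccip_send(self.pending[0])
            self.pending.pop(0)
```

With the router marked unhealthy, `OneStepManager.cancel_raffle()` raises and the raffle stays `OPEN`, so `refund()` is unreachable. `TwoStepManager.cancel_raffle()` succeeds and `refund()` works immediately, while the message to the prize manager waits in `pending` until `flush_pending()` is called on a healthy router.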