code-423n4 / 2023-10-zksync-findings

3 stars 0 forks source link

QA Report #548

Open c4-submissions opened 11 months ago

c4-submissions commented 11 months ago

See the markdown file with the details of this report here.

bytes032 commented 11 months ago

10 l 7 r 22 nc

[D-01] Using msg.timestamp as an expiration timestamp invites MEV manipulations r

[D-02] Using debugLog in ∆gas calculations leads to an artificially increased value l

[L-01] Dangerous execution path l

[L-02] Code is unreachable nc

[L-03] Changing addresses sizes from 20 to 32 bytes could break some parts of the system l

[L-04] Unused variables (missing logic?) nv

[L-05] Code and comment mismatch nc

[L-06] basefee may be higher than maxPriorityFeePerGas l

[L-07] getZkSyncMeta does not populate fully the ZkSyncMeta struct r

[L-08] The front-end may return a wrong value for the L2 transaction base cost nc

[L-09] rawCall does permit calls to MsgValueSimulator with the _isSystem flag set to false (it will revert) l

[L-10] L1 ⇒ L2 transactions are not free, but bootloader behaves like they are l

[L-11] It is possible to refund a single transaction more than 2³² - 1 gas, whilst the bootloader only has that amount of gas avaiable for a given batch l

[L-12] Useless max pick l

[L-13] Misleading comment (?) nc

[L-14] Lower-bound the minimum execution delay in ValidatorTimelock r

[L-15] Wrong logic (?) r

[L-16] If _enumerationIndexSize is 0, compressedEnumIndex equals metadata r

[L-17] Tautology in publishPubdataAndClearState dup of https://github.com/code-423n4/2023-10-zksync-findings/issues/853

[L-18] Chaining hashes with the wrong initial value (?) r

[L-19] Wrong initialisation of the zero point (?) l

[L-20] Rogue validators can spam revertBlocks and make the chain unusable l

[NC-01] WTF nc

[NC-02] Missing comments nc

[NC-03] Typos nc

[NC-04] Empty error string nc

[NC-05] Wrong comments in L1 bridges nc

[NC-06] Wrong comment nc

[NC-07] Wrong comment in ecrecover/secp256k1/mod.rs nc

[NC-08] SystemContext::baseFee is not a constant, although the comment says it is nc

[NC-09] Wrong comment in Diamond nc

[NC-10] Add getters to some variables of AppStorage nc

[NC-11] validateUpgradeTransaction does return as valid non-upgrade transactions nc

[NC-12] You are not using your own cached virtualBlockInfo in SystemContext nc

[NC-13] Wrong debugLog call nc

[NC-14] Do not let users put as refundRecipient addresses in kernel space nc

[NC-15] Comment and code mismatch nc

[NC-16] Wrong comment (again) nc

[NC-17] PointProjective::mul_assign optimization nc

[NC-18] PointProjective::add_assign_mixed missing code nc

[I-01] Missing support for certain precompiles r

nethoxa commented 10 months ago

About L-19, the following test highlights the issue (it shall be a much higher severity but at the time of the contest I wasn't sure about whether I was right or not, so I put it as a Low). Here is the mathematical proof I wrote in my initial submission too (for completeness):

$$a \star O = O \star a = a, \forall a \in F$$

$$\therefore O \ is \ unique \ under \ F$$

// Put this test inside ecrecover/secp256k1/mod.rs and run from the 
// era-zkevm_circuits folder the next command
//
//              cargo test POC -- --nocapture
//
#[test]
fn POC() {
    let false_zero = PointAffine::zero();
    let real_zero = PointAffine::from_xy_checked(Fq::zero(), Fq::zero()).unwrap();

    println!("Is (0, 1) the null elememt (should not) -> {}", false_zero.is_zero());
    println!("Is (0, 0) the null elememt -> {}\n", real_zero.is_zero());
    println!("Is (0, 1) on curve (should NOT) -> {}", false_zero.is_on_curve());
    println!("Is (0, 0) on curve -> {}", real_zero.is_on_curve());
}

The point at infinity is defined as $(0, 0)$, not both $(0, 0)$ and $(0, 1)$. As it is right now, you could craft valid signatures with an invalid point or corrupt the result of other operations like PointProjective::add_assign_mixed due to:

$$\therefore (0, 1) \not \in F$$

On top of that, we have that all operations defined under $F$ apply ONLY to the elements of $F$ (by definition). Therefore, you cannot use elements outside of $F$ with functions and elements defined under $F$, as it is the case right now (it's mathematically wrong and bug prone).

// Put this test inside ecrecover/secp256k1/mod.rs and run from the 
// era-zkevm_circuits folder the next command
//
//              cargo test POC_MIXED -- --nocapture
//
#[test]
fn POC_MIXED() {
    let false_zero = PointAffine::zero();
    let real_zero = PointAffine::from_xy_checked(Fq::zero(), Fq::zero()).unwrap();
    let projective = PointProjective::one();
    println!("Point projective -> {}", &projective);
    println!("False zero -> {:?}", &false_zero);
    println!("Real zero -> {:?}\n", &real_zero);

    let mut projective1 = projective.clone() as PointProjective;
    projective1.add_assign_mixed(&false_zero);
    println!("Added (0, 1) -> {}", projective1);

    let mut projective2 = projective.clone() as PointProjective;
    projective2.add_assign_mixed(&real_zero);
    println!("Added (0, 0) -> {}\n", projective2);
}

Moreover, the implementation of this curve is widely used in many cryptographic projects, so if anyone forks this crate and uses it on his own project (for example, due to a recommendation from the own devs), they would be building something fundamentally flawed (and can damage zkSync Era's reputation if such a project gets rekt because of that).The fix is the next one:

ecrecover/secp256k1/mod.rs, line 150

    fn zero() -> Self {
        PointAffine {
            x: Fq::zero(),
-           y: Fq::one(),
+           y: Fq::zero(),
            infinity: true,
        }
    }

as seen in

ecrecover/secp256k1/mod.rs, line 199

    fn from_xy_checked(x: Self::Base, y: Self::Base) -> Result<Self, GroupDecodingError> {
        let infinity = x.is_zero() && y.is_zero();
        let affine = Self {
            x: x,
            y: y,
            infinity,
        };

        if !affine.is_on_curve() {
            Err(GroupDecodingError::NotOnCurve)
        } else {
            Ok(affine)
        }
    }
nethoxa commented 10 months ago

About NC-18, I have been testing and it is indeed a bug (shall not be a NC then as the output is wrong), as you would be returning a corrupted value that, although it would be defined as the point at infinity due to $Z = 0$, the other coordinates remain non-zero/one leading to corrupted results if it is used again (very similar to my previous message $\Uparrow$ so DRY). Here is the POC:

// Put this test inside ecrecover/secp256k1/mod.rs and run from the 
// era-zkevm_circuits folder the next command
//
//              cargo test POC -- --nocapture
//
#[test]
fn POC() {
    let mut projective = PointProjective::one();
    let mut projective_into_affine_negated = projective.clone().into_affine();
    projective_into_affine_negated.negate();
    println!("Point projective -> {}", &projective);
    println!("Point projective into affine negated -> {}\n", &projective_into_affine_negated);

    projective.add_assign_mixed(&projective_into_affine_negated);
    println!("Added A + (-A) (should be zero by definition) -> {:?}", &projective);
    println!("Zero is defined as -> {:?}", PointProjective::zero());
}

The fix is the one I have written in the QA here. However, it had a typo so the next one is the correct one:

secp256k1/mod.rs, line 486

            ...

        } else {
            // If we're adding -a and a together, self.z becomes zero as H becomes zero.

+           if self.x == u2 {
+               (*self) = Self::zero();
+               return;
+           }

            // H = U2-X1
            let mut h = u2;

            ...

as seen here (even the comments are the same)

secp256k1/mod.rs, function add_assign

            ...

        } else {
            // If we're adding -a and a together, self.z becomes zero as H becomes zero.

            if u1 == u2 {
                // The two points are equal, so we double. // @audit wrong comment, the one I'm referring is the previous one
                (*self) = Self::zero();
                return;
            }

            // H = U2-U1
            let mut h = u2;

            ...
nethoxa commented 10 months ago

About L-12 , after re-reading the docs provided in the contest repo, it seems I was wrong and the correct implementation is the one in line 1456. The flawed one is in line 918.

When bootloader calls askOperatorForRefund it will be provided with the leftover gas PLUS the reserved gas calculated by the operator off-chain (not the one in getGasLimitForTx). That is

bootloader, line 918

                    refundGas := max(getOperatorRefundForTx(transactionIndex), potentialRefund)

will always pick the one provided by the operator and adding such a value with reservedGas in

bootloader, line 921

                refundGas := add(refundGas, reservedGas)

will output refundGas + reservedGasByOperator + reservedGasByBootloader, giving users a higher gas refund than the correct one. It seems for high reservedGas values, such a transaction would always revert due to the check in line 923, so it can be seen as a DOS in those situations and a theft of gas in the others. Neverthless, consider changing the code to:

                    ...

                    // In case the operator provided smaller refund than the one calculated
                    // by the bootloader, we return the refund calculated by the bootloader.
-                   refundGas := max(getOperatorRefundForTx(transactionIndex), potentialRefund)
+                   refundGas := max(getOperatorRefundForTx(transactionIndex), safeAdd(potentialRefund, reservedGas, "whatever"))
                }

-              refundGas := add(refundGas, reservedGas)

                if gt(refundGas, gasLimit) {
                    assertionError("L1: refundGas > gasLimit")
                }

                ...

Taking into account my submission Operator can steal all gas provided by ANY user for L1 → L2 transactions and mixing the fixes.

nethoxa commented 10 months ago

UPDATE $\Rightarrow$ no need to read this comment, agree with the sponsor that this is QA even though I was misunderstood. See the previous version of this message if you want to know the idea I had in mind whilst writing this comment. Thanks.


Although D-02 can be seen as a little increase of gas (doing the maths with the Ethereum ones, note that the gas cost of the same opcodes in zkSync Era is far more due to ZK processing) and you can argue that this is a QA refactoring, take into account that the goal of zkSync Era is to increase its TPS much more than what it is right now, which is currently $7.13$ according to l2beat.

For the shake of the example, let's say $3.565$ are from L1 and $3.565$ are from L2. That means, the increased gas cost per second right now is:

$$3.565 160 + 3.565 240 = 1426 \ gas/second$$

In a whole year, the gas taken from users would be $1426 60 \ seconds 60 \ minutes 24 \ hours 365 \ days = 44970336000 \ gas$ (around $44$ US billion gas). That's a lot of gas going straight into the operators pockets "by free" as users are paying for the execution of their transaction, not for debugging functionalities. Moreover, in many other ∆gas calculations blocks like here, here, here and here they are totally excluded, which shows the developers intentions to avoid overcharging users.

Either you decide to maintain this as a QA or lift it up to Medium, as it is a theft of gas, please, move those debugLogs out of the ∆gas calculations blocks as users must pay for what they use, nothing more, nothing less. The fixes are in my QA report

nethoxa commented 10 months ago

About D-01, note that the underlying issue has been awarded medium severities in several places, either in code4rena in recent contests (see here and here) or in private audits (see here). This situation is worst as you "support" any type of transaction. Moreover, there has been other reports highlighting something similar (like this one) but has been rejected due to a lack of impact. Although I show two possible and different scenarios, users MUST be able to put a limit on when their transactions can be executed, regardless of any theoretical threat or attack.

Anyway, to show a different example from the classic swap issue, imagine a protocol that is deployed in both zkSync Era and Ethereum, is going to initiate a multi-chain airdrop from Ethereum to both its zkSync and Ethereum users. A malicious or a set of malicious operators on Etehreum can delay the transaction to initiate the airdrop on zkSync Era and sell their tokens to a higher price, as the total supply between chains would be fewer. Once the price spikes, they pass the airdrop transaction to zkSync to dilute the price and harm users who were holding those tokens, as the supply increases, so the price decreases instantly like a market crash. The deadline would stop such an attack, as the protocol could set the airdrop to last a few minutes, otherwise the zkSync airdrop wouldn't take place. This attack can be seen as a real world inflation attack, in which one country holds foreign currency for a long time and dumps all the money to the target country to flood its market and skyrocket its local monetary inflation (making an analogy between layers and countries)

The fix is the one in my QA.

miladpiri commented 10 months ago

About L-19, the following test highlights the issue (it shall be a much higher severity but at the time of the contest I wasn't sure about whether I was right or not, so I put it as a Low). Here is the mathematical proof I wrote in my initial submission too:

  • The O element is defined as the element in F that meets:

a⋆O=O⋆a=a,∀a∈F

  • Say there are two "zero-elements", namely O1 and O2, in F with O1≠O2.
  • Then, for example, we have that O1⋆O2=O2⋆O1=O1 but, as both are the "zero-element", the next O1⋆O2=O2⋆O1=O2 holds too
  • That implies O1=O2 but our initial thesis was that O1≠O2, so we have a contradiction

∴O is unique under F

// Put this test inside ecrecover/secp256k1/mod.rs and run from the 
// era-zkevm_circuits folder the next command
//
//              cargo test POC -- --nocapture
//
#[test]
fn POC() {
    let false_zero = PointAffine::zero();
    let real_zero = PointAffine::from_xy_checked(Fq::zero(), Fq::zero()).unwrap();

    println!("Is (0, 1) the null elememt (should not) -> {}", false_zero.is_zero());
    println!("Is (0, 0) the null elememt -> {}\n", real_zero.is_zero());
    println!("Is (0, 1) on curve (should NOT) -> {}", false_zero.is_on_curve());
    println!("Is (0, 0) on curve -> {}", real_zero.is_on_curve());
}

The point at infinity is defined as (0,0), not both (0,0) and (0,1). As it is right now, you could craft valid signatures with an invalid point or corrupt the result of other operations like PointProjective::add_assign_mixed due to:

  • (0,1) not being the null element (from the mathematical proof above) AND
  • (0,1) not being on the curve as it does not comply with ²³²³y²=x³+b⇒1²≠0³+7

∴(0,1)∉F

On top of that, we have that all operations defined under F apply ONLY to the elements of F (by definition). Therefore, you cannot use elements outside of F with functions and elements defined under F, as it is the case right now (it's mathematically wrong and bug prone).

// Put this test inside ecrecover/secp256k1/mod.rs and run from the 
// era-zkevm_circuits folder the next command
//
//              cargo test POC_MIXED -- --nocapture
//
#[test]
fn POC_MIXED() {
    let false_zero = PointAffine::zero();
    let real_zero = PointAffine::from_xy_checked(Fq::zero(), Fq::zero()).unwrap();
    let projective = PointProjective::one();
    println!("Point projective -> {}", &projective);
    println!("False zero -> {:?}", &false_zero);
    println!("Real zero -> {:?}\n", &real_zero);

    let mut projective1 = projective.clone() as PointProjective;
    projective1.add_assign_mixed(&false_zero);
    println!("Added (0, 1) -> {}", projective1);

    let mut projective2 = projective.clone() as PointProjective;
    projective2.add_assign_mixed(&real_zero);
    println!("Added (0, 0) -> {}\n", projective2);
}

Moreover, the implementation of this curve is widely used in many cryptographic projects, so if anyone forks this crate and uses it on his own project (for example, due to a recommendation from the own devs), they would be building something fundamentally flawed (and can damage zkSync Era's reputation if such a project gets rekt because of that).The fix is the next one:

ecrecover/secp256k1/mod.rs, line 150

    fn zero() -> Self {
        PointAffine {
            x: Fq::zero(),
-           y: Fq::one(),
+           y: Fq::zero(),
            infinity: true,
        }
    }

as seen in

ecrecover/secp256k1/mod.rs, line 199

    fn from_xy_checked(x: Self::Base, y: Self::Base) -> Result<Self, GroupDecodingError> {
        let infinity = x.is_zero() && y.is_zero();
        let affine = Self {
            x: x,
            y: y,
            infinity,
        };

        if !affine.is_on_curve() {
            Err(GroupDecodingError::NotOnCurve)
        } else {
            Ok(affine)
        }
    }

This is a valid code quality issue, but there is no impact. Summary of the issue: In affine coordinates, we use both (0,1, infinity=true) and (0,0, infinity=true) to store the point at infinity. As the warden explained, mathematically there should only be one point at infinity. However, throughout the codebase, we check if a point is infinity only by checking the boolean is true (ignoring the x,y coordinates). The warden's own test cases confirm that the code handles both points correctly, so there is no impact

miladpiri commented 10 months ago

About NC-18, I have been testing and it seems it is indeed a bug (shall not be a NC then as the output is wrong), as you would be returning a corrupted value that, although it would be defined as the point at infinity due to Z=0, the other coordinates remain non-zero/one leading to corrupted results if it is used again (very similar to my previous message ⇑ so DRY). Here is the POC:

// Put this test inside ecrecover/secp256k1/mod.rs and run from the 
// era-zkevm_circuits folder the next command
//
//              cargo test POC -- --nocapture
//
#[test]
fn POC() {
    let mut projective = PointProjective::one();
    let mut projective_into_affine_negated = projective.clone().into_affine();
    projective_into_affine_negated.negate();
    println!("Point projective -> {}", &projective);
    println!("Point projective into affine negated -> {}\n", &projective_into_affine_negated);

    projective.add_assign_mixed(&projective_into_affine_negated);
    println!("Added A + (-A) (should be zero by definition) -> {:?}", &projective);
    println!("Zero is defined as -> {:?}", PointProjective::zero());
}

The fix is the one I have written in the QA here. However, it had a typo so the next one is the correct one:

secp256k1/mod.rs, line 486

            ...

        } else {
            // If we're adding -a and a together, self.z becomes zero as H becomes zero.

+           if self.x == u2 {
+               (*self) = Self::zero();
+               return;
+           }

            // H = U2-X1
            let mut h = u2;

            ...

as seen here (even the comments are the same)

secp256k1/mod.rs, function add_assign

            ...

        } else {
            // If we're adding -a and a together, self.z becomes zero as H becomes zero.

            if u1 == u2 {
                // The two points are equal, so we double. // @audit wrong comment, the one I'm referring is the previous one
                (*self) = Self::zero();
                return;
            }

            // H = U2-U1
            let mut h = u2;

            ...

This is basically the exact same "bug" as L-19. Valid code quality but no impact. As long as the z-coordinate of a Projective Point is zero, it will be treated as the point at infinity, regardless of the x and y coordinates.

miladpiri commented 10 months ago

About L-12 , after re-reading the docs provided in the contest repo, it seems I was wrong and the correct implementation is the one in line 1456. The flawed one is in line 918.

When bootloader calls askOperatorForRefund it will be provided with the leftover gas PLUS the reserved gas calculated by the operator off-chain (not the one in getGasLimitForTx). That is

bootloader, line 918

                    refundGas := max(getOperatorRefundForTx(transactionIndex), potentialRefund)

will always pick the one provided by the operator and adding such a value with reservedGas in

bootloader, line 921

                refundGas := add(refundGas, reservedGas)

will output refundGas + reservedGasByOperator + reservedGasByBootloader, giving users a higher gas refund than the correct one. It seems for high reservedGas values, such a transaction would always revert due to the check in line 923, so it can be seen as a DOS in those situations and a theft of gas in the others. Neverthless, consider changing the code to:

                    ...

                    // In case the operator provided smaller refund than the one calculated
                    // by the bootloader, we return the refund calculated by the bootloader.
-                   refundGas := max(getOperatorRefundForTx(transactionIndex), potentialRefund)
+                   refundGas := max(getOperatorRefundForTx(transactionIndex), safeAdd(potentialRefund, reservedGas, "whatever"))
                }

-              refundGas := add(refundGas, reservedGas)

                if gt(refundGas, gasLimit) {
                    assertionError("L1: refundGas > gasLimit")
                }

                ...

Taking into account my submission Operator can steal all gas provided by ANY user for L1 → L2 transactions and mixing the fixes.

The original report was wrong, but this comment is correct. However, it does not have any impact, so it should be considered as QA. The reason why it lacks any impact is:

reservedGas = totalGasLimit - max(MAX_GAS_PER_TRANSACTION(), getOperatorTrustedGasLimitForTx(transactionIndex)) https://github.com/code-423n4/2023-10-zksync/blob/main/code/system-contracts/bootloader/bootloader.yul#L1145

and

refundGas = max(getOperatorRefundForTx(transactionIndex), potentialRefund) + reservedGas https://github.com/code-423n4/2023-10-zksync/blob/main/code/system-contracts/bootloader/bootloader.yul#L918

in order to have refundGas > gasLimit (to have DoS as mentioned by the warden) the following condition should be met:

max(getOperatorRefundForTx(transactionIndex), potentialRefund) + totalGasLimit - max(MAX_GAS_PER_TRANSACTION(), getOperatorTrustedGasLimitForTx(transactionIndex)) > gasLimit

since totalGasLimit is the same as gasLimit, we will have:

max(getOperatorRefundForTx(transactionIndex), potentialRefund) > max(MAX_GAS_PER_TRANSACTION(), getOperatorTrustedGasLimitForTx(transactionIndex))

This is possible if the operator provides refund higher than max(MAX_GAS_PER_TRANSACTION(), getOperatorTrustedGasLimitForTx(transactionIndex)). Moreover, if the operator provides such large refund amount, the bootloader will be reverted at line 924, so nothing happens, it is kind of self-rekt for the operator. It is as if the operator provides very low gas at the beginning to run the bootloader, so the bootloader will revert, and he should run it again.

miladpiri commented 10 months ago

Although D-02 can be seen as a little increase of gas (doing the maths with the Ethereum ones, note that the gas cost of the same opcodes in zkSync Era is far more due to ZK processing) and you can argue that this is a QA refactoring, take into account that the goal of zkSync Era is to increase its TPS much more than what it is right now, which is currently 7.13 according to l2beat.

For the shake of the example, let's say 3.565 are from L1 and 3.565 are from L2. That means, the increased gas cost per second right now is:

3.565∗160+3.565∗240=1426 gas/second

In a whole year, the gas taken from users would be 1426∗60 seconds∗60 minutes∗24 hours∗365 days=44970336000 gas (around 44 US billion gas). That's a lot of gas going straight into the operators pockets "by free" as users are paying for the execution of their transaction, not for debugging functionalities. Moreover, in many other ∆gas calculations blocks like here, here, here and here they are totally excluded, which shows the developers intentions to avoid overcharging users.

Either you decide to maintain this as a QA or lift it up to Medium, as it is a theft of gas, please, move those debugLogs out of the ∆gas calculations blocks as users must pay for what they use, nothing more, nothing less. The fixes are in my QA report

This is QA at most. Yes, it is an additional cost for users. So, it is the fact that we have written DefaultAccount in Solidity & not in Yul. This is the price everyone pays for better support.

miladpiri commented 10 months ago

About D-01, note that it has been awarded medium severities in several places, either in code4rena in recent contests (see here and here) or in private audits (see here). This situation is worst as you "support" any type of transaction. Moreover, there has been other reports highlighting something similar (like this one) but has been rejected due to a lack of impact. Although I show two possible and different scenarios, users MUST be able to put a limit on when their transactions can be executed, regardless of any theoretical threat or attack.

Anyway, to show a different example from the classic swap issue, imagine a protocol that is deployed in both zkSync Era and Ethereum, is going to initiate a multi-chain airdrop from Ethereum to both its zkSync and Ethereum users. A malicious or a set of malicious operators on Etehreum can delay the transaction to initiate the airdrop on zkSync Era and sell their tokens to a higher price, as the total supply between chains would be fewer. Once the price spikes, they pass the airdrop transaction to zkSync to dilute the price and harm users who were holding those tokens, as the supply increases, so the price decreases instantly like a market crash. The deadline would stop such an attack, as the protocol could set the airdrop to last a few minutes, otherwise the zkSync airdrop wouldn't take place. This attack can be seen as a real world inflation attack, in which one country holds foreign currency for a long time and dumps all the money to the target country to flood its market and skyrocket its local monetary inflation (making an analogy between layers and countries)

The fix is the one in my QA.

expirationTimestamp is not used anywhere. It is set, but it is not responsible for any logic currently. It will be used in the future implementations. This report can be used in Analysis report.

nethoxa commented 10 months ago

This is a valid code quality issue, but there is no impact. Summary of the issue: In affine coordinates, we use both (0,1, infinity=true) and (0,0, infinity=true) to store the point at infinity. As the warden explained, mathematically there should only be one point at infinity. However, throughout the codebase, we check if a point is infinity only by checking the boolean is true (ignoring the x,y coordinates). The warden's own test cases confirm that the code handles both points correctly, so there is no impact


But it is wrong. You can just check the boolean, that's true, but it will return true for both of them, as the tests and the mathematical proof highlight. All signatures and some ZK things rely on the correctness of this curve and the UNIQUENESS of the points in there. If you can craft two different points that are gonna be treated as the same one, even if one of them is not on the curve, then you can replay the original one with the wrong one. When you say

The warden's own test cases confirm that the code handles both points correctly, so there is no impact

the code must not accept $(0, 1)$. The result is wrong. $(0, 1)$ is not on the curve and it is not the zero element. They must revert like in from_xy_checked. Moreover, it is the purpose of is_on_curve function to return false for those points and prevent the parent function from going on with an invalid point. Math structures like this curve and all the operations defined under it must be correctly implemented for a reason. Right now, from a formal POV, this is not the real secp256k1 curve. It mimics how it works, but it is not the same. Furthermore, it is not even a field, which is the basis of everything you are trying to do with this crate.

I would like to put this comment here, as it is a very similar situation and that issue has been accepted as a medium even with null impact.

nethoxa commented 10 months ago

This is basically the exact same "bug" as L-19. Valid code quality but no impact. As long as the z-coordinate of a Projective Point is zero, it will be treated as the point at infinity, regardless of the x and y coordinates.


In fact, it is worst as every point added with its negated will be treated as the zero element. That means you can have infinite zero elements (up to the prime field) who all are DIFFERENT from each other and from $(0, 1, 0)$ floating around.

nethoxa commented 10 months ago

The original report was wrong, but this comment is correct. However, it does not have any impact, so it should be considered as QA. The reason why it lacks any impact is:

reservedGas = totalGasLimit - max(MAX_GAS_PER_TRANSACTION(), getOperatorTrustedGasLimitForTx(transactionIndex)) https://github.com/code-423n4/2023-10-zksync/blob/main/code/system-contracts/bootloader/bootloader.yul#L1145

and

refundGas = max(getOperatorRefundForTx(transactionIndex), potentialRefund) + reservedGas https://github.com/code-423n4/2023-10-zksync/blob/main/code/system-contracts/bootloader/bootloader.yul#L918

in order to have refundGas > gasLimit (to have DoS as mentioned by the warden) the following condition should be met:

max(getOperatorRefundForTx(transactionIndex), potentialRefund) + totalGasLimit - max(MAX_GAS_PER_TRANSACTION(), getOperatorTrustedGasLimitForTx(transactionIndex)) > gasLimit

since totalGasLimit is the same as gasLimit, we will have:

max(getOperatorRefundForTx(transactionIndex), potentialRefund) > max(MAX_GAS_PER_TRANSACTION(), getOperatorTrustedGasLimitForTx(transactionIndex))

This is possible if the operator provides refund higher than max(MAX_GAS_PER_TRANSACTION(), getOperatorTrustedGasLimitForTx(transactionIndex)). Moreover, if the operator provides such large refund amount, the bootloader will be reverted at line 924, so nothing happens, it is kind of self-rekt for the operator. It is as if the operator provides very low gas at the beginning to run the bootloader, so the bootloader will revert, and he should run it again.


You say

This is possible if the operator provides refund higher than max(MAX_GAS_PER_TRANSACTION(), getOperatorTrustedGasLimitForTx(transactionIndex)).

which is correct and I did not take into account, so the DOS part is invalid. However, the theft of gas from the operator by users (which you did not answer in your response) is correct as the bootloader is checking what is higher, the calculated refund for the execution of the transaction or the value provided by the operator, which is returning HIS reserved gas plus the leftover gas he has found for such a transaction. That means the operator's value will always be picked and being added with reserved gas will output more gas than intended. The code responsible for that in the zkSync Era node is in

refunds.rs, line 253

            ...

            let tx_gas_limit = state
                .memory
                .read_slot(
                    BOOTLOADER_HEAP_PAGE as usize,
                    tx_description_offset + TX_GAS_LIMIT_OFFSET,
                )
                .value
                .as_u32();

            ...

            let tx_body_refund = self.tx_body_refund(
                bootloader_refund,
                gas_spent_on_pubdata,
                tx_gas_limit,
                current_ergs_per_pubdata_byte,
                pubdata_published,
            );

            ...

If we go to tx_body_refund we see that the refund provided by the operator is, roughly, $gasLimitFromUser - gasSpentInComputation - gasSpentInPubdata$, that is, the excess of gas provided by the user, or, in other terms, the refund of gas after the execution step plus the reserved gas:

refunds.rs, function tx_body_refund

    pub(crate) fn tx_body_refund(
        &self,
        bootloader_refund: u32,
        gas_spent_on_pubdata: u32,
        tx_gas_limit: u32,
        current_ergs_per_pubdata_byte: u32,
        pubdata_published: u32,
    ) -> u32 {
        let total_gas_spent = tx_gas_limit - bootloader_refund;

        let gas_spent_on_computation = total_gas_spent
            .checked_sub(gas_spent_on_pubdata)
            .unwrap_or_else(|| {
                tracing::error!(
                    "Gas spent on pubdata is greater than total gas spent. On pubdata: {}, total: {}",
                    gas_spent_on_pubdata,
                    total_gas_spent
                );
                0
            });

        // For now, bootloader charges only for base fee.
        let effective_gas_price = self.l1_batch.base_fee();

        let bootloader_eth_price_per_pubdata_byte =
            U256::from(effective_gas_price) * U256::from(current_ergs_per_pubdata_byte);

        let fair_eth_price_per_pubdata_byte =
            U256::from(eth_price_per_pubdata_byte(self.l1_batch.l1_gas_price));

        // For now, L1 originated transactions are allowed to pay less than fair fee per pubdata,
        // so we should take it into account.
        let eth_price_per_pubdata_byte_for_calculation = std::cmp::min(
            bootloader_eth_price_per_pubdata_byte,
            fair_eth_price_per_pubdata_byte,
        );

        let fair_fee_eth = U256::from(gas_spent_on_computation)
            * U256::from(self.l1_batch.fair_l2_gas_price)
            + U256::from(pubdata_published) * eth_price_per_pubdata_byte_for_calculation;
        let pre_paid_eth = U256::from(tx_gas_limit) * U256::from(effective_gas_price);
        let refund_eth = pre_paid_eth.checked_sub(fair_fee_eth).unwrap_or_else(|| {
            tracing::error!(
                "Fair fee is greater than pre paid. Fair fee: {} wei, pre paid: {} wei",
                fair_fee_eth,
                pre_paid_eth
            );
            U256::zero()
        });

        ceil_div_u256(refund_eth, effective_gas_price.into()).as_u32()
    }

So, my statements were correct and this is a theft of gas from the operator's payout by users.

nethoxa commented 10 months ago

Regarding this answer from the sponsor, I was misunderstood and gave a proper answer in the previous version of this message, but as it does not harm users too much, so QA is fine, I have "removed" it to not overload this discussion. If you are interested, see the previous version. Thanks.

nethoxa commented 10 months ago

expirationTimestamp is not used anywhere. It is set, but it is not responsible for any logic currently. It will be used in the future implementations. This report can be used in Analysis report.


I know what you mean by that and I would like to say that I'm not talking about deposits staying idle or not being executed in zkSync Era due to high congestion in Ethereum for days, I am talking about time-sensitive L1 $\Rightarrow$ L2 transactions via requestL2Transaction (there are projects using them like the ones I mentioned in my report).

The issue is that it's fine to put that "hard-coded" deadline for deposits, as the "environment changes" they suffer from are minimal, but there is no way for a contract in Ethereum to set a deadline of, say, a few minutes, for the execution of his transaction on zkSync Era, leading to the impacts I mentioned above. Neither it is possible now nor when you actually implement the logic regarding the withdrawals without operator's help.

I focused on expirationTimestamp as it is wrong to assume that msg.timestamp equals the time the user submitted the transaction to the network and when the transaction is actually picked and executed on-chain, which affects ALL L1 $\Rightarrow$ L2 transactions, whether they are simple deposits or not, and the fact it will be used for all types of transactions, whether the caller wants that deadline or not.

As the intention as well as the "Ethereum counterpart" is implemented in the code in-scope for this contest, plus the real threat rogue nodes in Ethereum are for the network as stated in the real-world examples I mentioned in my report (not hand-wavy hypotheticals) and the precedence of other recent issues judged differently than what you state, please, reconsider it, as users must be able to put the deadline they want/need for their transactions (otherwise transactions going to zkSync Era from Ethereum will become prime target of MEV bots and rogue nodes, due to the "sequential determinism" of your L1 $\Rightarrow$ L2 communication channel)

lsaudit commented 10 months ago

Hey, I'm not a judge, but out of curiosity, I took a look at your QA. It's awesome. I've skimmed through a few issues.

D-01 - imho, this is definitely more than QA, other code4rena reports treat most MEV scenarios as Medium. This issue basically means that there are multiple scenarios in zksync environment which may make MEVs attack possible. As your recommendation easily fixes that problem, imho - it deserves Medium :)

D-02 - I don't agree with the assumption that user "must NOT pay for debugging". I think it should be evaluated rather as Refactoring, instead of Low.

L-01 - imho it should be classified as Medium. According to C4 severity categorization: funds at risk with a hypothetical attack path with stated assumptions

L-03 - while I agree with the reasoning, this does not pose any security threat at all in the current implementation. As you mentioned in the issue's description, your issue is related to the comment, that devs plan to introduce 32-bytes address space in the future. It's not introduced yet, thus we cannot evaluate the current code as vulnerability basing on the predictions and future plans. While this is a valid observation, I'd stick with classifying this issue as Refactoring, and not Low (because it does not pose any risk right now).

L-06 - I agree, that there's a missing check, however, this check is missing, because maxPriorityFeePerGas is hardcoded to 0. It's also mentioned in your report. While this value is hardcoded to 0, lack of this check does not pose any security threat at all. When this value will be changed - devs will need to add additional (missing) check. Since, again, this issue is related to some predictions, this should be classified as Refactoring, and not Low, imho

L-09 - isn't it just a matter of incorrect comment? In that case it should be NC, instead of Low?

L-11 - hmm, in both cases there's already assertion: gt(refundGas, gasLimit) and gt(refundInGas, getGasLimit(innerTxDataOffset)), having (or not having) additional check which verifies if it's uint32 does not change anything, does it? In that case, it has no security impact and should be classified rather as Refactoring, than Low.

L-14 - is there any reason why you want to limit the timelock, especially if it's onlyOwner? Open Zeppelin's TimelockController.sol also does not implement any lower-bound for the execution delay, thus imho this is not valid

L-19 - i also don't know anything about affine spaces :) - but if this true, the impact is much higher than QA. It'd be nice if someone took another look on it, since this might be more than QA. Imho, it should be either invalid or Medium at least!

L-05, L-13, NC-02, NC-05, NC-06, NC-07, NC-08, NC-09, NC-15, NC-16 - all are related to incorrect comment, shouldn't it be treated as one issue, instead of 10 separate ones?

NC-12 - this is gas-saving-related issue, it should be moved to the Gas report and not be evaluated in QA

I-01 - this is a known issue, docs already mentions which precompiles are available in zksync: https://era.zksync.io/docs/reference/architecture/differences-with-ethereum.html#precompiles

nethoxa commented 10 months ago

Thanks @lsaudit for your comment. Please, read the discussion with the sponsor regarding some of the issues you have mentioned as I have provided more info and my initial QA was a shit. About some of your comment:

GalloDaSballo commented 9 months ago

Using msg.timestamp as an expiration timestamp invites MEV manipulations L Using debugLog in ∆gas calculations leads to an artificially increased value NC, it's acceptable as long as paid by usr Dangerous execution path TODO MED? Code is unreachable R Changing addresses sizes from to bytes could break some parts of the system I, insufficient proof Unused variables (missing logic?) R Code and comment mismatch L basefee may be higher than maxPriorityFeePerGas TODO getZkSyncMeta does not populate fully the ZkSyncMeta struct NC The front-end may return a wrong value for the L2 transaction base cost I rawCall does permit calls to MsgValueSimulator with the _isSystem flag set to false (it will revert) NC L1 ⇒ L2 transactions are not free, but bootloader behaves like they are TODO It is possible to refund a single transaction more than 2³² gas, whilst the bootloader only has that amount of gas avaiable for a given batch L Useless max pick R Misleading comment (?) NC Lower-bound the minimum execution delay in ValidatorTimelock I Wrong logic (?) L If _enumerationIndexSize is 0, compressedEnumIndex equals metadata L Tautology in publishPubdataAndClearState TODO Chaining hashes with the wrong initial value (?) L Wrong initialisation of the zero point (?) I Rogue validators can spam revertBlocks and make the chain unusable L WTF R Missing comments NC Typos NC Empty error string NC Wrong comments in L1 bridges L Wrong comment NC Wrong comment in ecrecover/secp256k1/mod.rs NC SystemContext::baseFee is not a constant, although the comment says it is L Wrong comment in Diamond NC Add getters to some variables of AppStorage R validateUpgradeTransaction does return as valid non-upgrade transactions L You are not using your own cached virtualBlockInfo in SystemContext R Wrong debugLog call R Do not let users put as refundRecipient addresses in kernel space L Comment and code mismatch L Wrong comment (again) I PointProjective::mul_assign optimization NC PointProjective::add_assign_mixed missing code NC Missing support for certain precompiles I Some ways to exploit this, from a hacker's POV, is via rogue validators (see (https://furucombo.app/combo) combo worth million dollars or any other type of highly-sensitive inter-chain operation. TODO In (https://github.com/matter-labs/era-contracts/blob/f06a58360a2b8e7129f64413998767ac169d1efd/ethereum/contracts/zksync/ValidatorTimelock.solL109C1-L109C111), which is supposed to be non-zero. As the validator is currently controled by Matter Labs, I'm putting it as a Low but consider implementing a delay or a slashing mechanism to validators that try to abuse their privileges by spamming revertBlocks for batches that do not benefit them, or even try to DOS the network. TODO (https://github.com/code-423n4/2023-10-zksync/blob/72f5f16ed4ba94c7689fe38fcb0b7d27d2a3f135/code/era-zkevm_circuits/src/keccak256_round_function/mod.rsL93) ⇒ MEMORY_QUERIES_PER_CYCLE, not MEMORY_EQURIES_PER_CYCLE. TODO

L - 12 NC, it's acceptable as long as paid by usr - 1 TODO MED? - 1 R - 7 I, insufficient proof - 1 TODO - 6 NC - 11 I - 5

12 L 7R 11 NC - TODO Finalize

c4-judge commented 9 months ago

GalloDaSballo marked the issue as grade-a

c4-judge commented 9 months ago

GalloDaSballo marked the issue as selected for report

c4-judge commented 9 months ago

GalloDaSballo marked the issue as grade-c

c4-judge commented 9 months ago

GalloDaSballo marked the issue as grade-a

c4-judge commented 9 months ago

GalloDaSballo marked the issue as selected for report