ethereum / solidity

Solidity, the Smart Contract Programming Language
https://soliditylang.org
GNU General Public License v3.0
23.11k stars 5.73k forks source link

do-while results in redundant branching instructions when compiling via IR #15442

Open Philogy opened 1 week ago

Philogy commented 1 week ago

Description

When using a do-while control flow statement such as in the following code:

contract Sum {
    function sum(uint256[] calldata nums) public pure returns (uint256 total) {
        if (nums.length == 0) return 0;
        uint256 i = 0;
        do {
            assembly ("memory-safe") {
                total := add(calldataload(add(nums.offset, shl(5, i))), total)
                i := add(i, 1)
            }
        } while (i < nums.length);
    }
}

I expect solidity to generate the relatively straight-forward & maximally efficient assembly:

tag_doWhileBodyEntry:
    <doWhileBody>
    <whileCondition>
    tag_doWhileBodyEntry
    jumpi    

This is the case when compiling with the legacy pipeline but using the IR pipeline gives you roughly the following structure:

    0x01
tag_doWhileEntry:
    iszero
    tag_whileCondition
    jumpi
tag_doWhileBody:
    <doWhileBody>
    0x00
    tag_doWhileEntry
    jump
tag_whileCondition:
    <whileCondition>
    tag_doWhileBody
    jumpi

This structure, while correct is not efficient. Digging deeper the origin of this structure becomes more apparent when looking at the generated IR:

function fun_sum(var_nums_offset, var_nums_length) -> var_total {
    var_total := 0
    if iszero(var_nums_length) {
        var_total := 0
        leave
    }
    let var_i := 0
    let _1 := 1
    for { } 1 { } {
        if iszero(_1) {
            if iszero(lt(var_i, var_nums_length)) { break }
        }
        _1 :=  0
        var_total := add(calldataload(add(var_nums_offset, shl(5, var_i))), var_total)
        var_i := add(var_i, 1)
    }
}

Environment

Steps to Reproduce

Compile the following contract with viaIR

contract Sum {
    function sum(uint256[] calldata nums) public pure returns (uint256 total) {
        if (nums.length == 0) return 0;
        uint256 i = 0;
        do {
            assembly ("memory-safe") {
                total := add(calldataload(add(nums.offset, shl(5, i))), total)
                i := add(i, 1)
            }
        } while (i < nums.length);
    }
}
cameel commented 1 week ago

Yes, I can confirm that this is how it works, but unfortunately I don't think we can do much about it, at least in the short term. This a case where Yul not having jumps makes it impossible to implement it more efficiently in a general case.

Since Yul has no native do...while loop, we simulate it with a for loop. Naively one could do it like this:

for {} true {} {
    <body>
    if iszero(<condition>) { break }
}

This should give you the assembly you want, however, it will break if you put continue in the body. The codegen translates it to Yul's continue, which will not jump to the right condition here. The legacy codegen does not have this problem, since it can use arbitrary jumps and just inserts a tag before the right condition, making continue jump there.

To deal with this problem, we generate this instead, which is the reason for the less efficient assembly you get via IR:

let firstRun := 1
for {} true {} {
    if iszero(firstRun) {
        if iszero(<condition>) { break }
    }
    firstRun := 0
    <body>
}

Since this affects only code that contains continue, one option would be to detect its presence and generate the simpler code when it's not there. But I'm not convinced that the extra special-casing is worth it as it goes hard against the general philosophy of keeping the IR codegen simple. It is technically an option, as we did it allow one such case in the past (Optimizer > Unchecked Loop Increment), but this one does not seem to be on the same level in terms of how common it is and what impact it has.

We'd rather do things like this in the optimizer, and we may eventually do it, but we have many potential optimizations in the pipeline and this one does not seem like it would be high on the priority list.

Philogy commented 1 day ago

but unfortunately I don't think we can do much about it

Why couldn't Yul be expanded/improved?

If the goal of Yul is to be simple to compile then forcing the addition of such specialized passes in the backend seems like it'd be going against that motto? Maybe wrong venue but I'd love to learn more about how the solc team thinks about language & compiler design.