ethereum / solidity

Solidity, the Smart Contract Programming Language
https://soliditylang.org
GNU General Public License v3.0
23.47k stars 5.8k forks source link

Stop the war on for loops #13308

Open fulldecent opened 2 years ago

fulldecent commented 2 years ago

Abstract

Add a specific overflow check optimization that will make many for-loop optimizations unnecessary.

Motivation

Too many people are saying this is bad:

for (uint256 mintCounter = 0; mintCounter < quantity; mintCounter++) {
    _mint(msg.sender, _assembleTokenID(dropID, mintCounter));
}

compared to this:

for (uint256 mintCounter = 0; mintCounter < quantity;) {
    _mint(msg.sender, _assembleTokenID(dropID, mintCounter));
    unchecked{
        mintCounter++
    }
}       

Specification

Note

For reference, a for-loop like this:

for (A; B; C) {D}

is implemented like this:

10 A

100 if !B GOTO 999
110 D
120 C
130 GOTO 100

999 EXIT

New optimization

If inside a single code unit, if:

  1. a variable has bit width X,
  2. the variable is compared to be less than some quantity with bit width <=X,
  3. the variable is not set until step 4 here, then
  4. the variable is incremented;

then that incrementation in step 4 need not be safety checked. Or in $\LaTeX$:

$$x < a <= w \implies x+1 <= w$$

Documentation

People like being fancy, so even if this is implemented, they will still use the unchecked "optimization" until it can be clearly explained that it is unnecessary.

So the documentation should be updated to specify that the gas cost of this code:

contract C {
    event E;
    constructor(uint256 count) {
        for (uint256 i = 0; i < count; i++) {
            emit E(i);
        }
    }
}

shall not exceed that of this code:

contract CUnchecked {
    event E;
    constructor(uint256 count) {
        for (uint256 i = 0; i < count;) {
            unchecked {
                emit E(i);
                i++;
            }
        }
    }
}

And this can be checked with a test case.

There is precedent in this with JavaScript where the specification requires that implementations implement tail-end recursion optimization.

Backwards Compatibility

n/a


Inspired by the discussion with @FrankNFT-labs at https://github.com/LightArtists/light-smart-contracts/issues/2

fulldecent commented 2 years ago

The second optimization is:

If inside a single code unit, if:

  1. an unsigned variable is compared to be greater than some value,
  2. the variable is not set until step 3 here, then
  3. the variable is decremented,

then that decrementation in step 3 need not be safety checked. Or in $\LaTeX$:

$$x > a >= 0 \implies x-1 > = 0$$

migoldfinger commented 2 years ago

I see where this comes from, my first thought however is an fancy endles loop thats been created like that if for loops are optimized with your idea.

uint256 quantity = 100000;
for (uint8 mintCounter = 0; mintCounter < quantity; mintCounter++)
{
    _mint(msg.sender, _assembleTokenID(dropID, mintCounter));
}

This would result in an infinite loop if I am not mistaken. Of cource the unchecked code example will too but there it is the devs fault entirely.

fulldecent commented 2 years ago

@migoldfinger That code example would NOT be optimized.

The rule is (emphasis added):

  1. a variable has bit width X,
  2. the variable is compared to be less than some quantity with bit width <=X,
  3. ...

The rule would not be applied because the "some quantity's" bit width (i.e. uint256) is not less than or equal to the variable's bit width (i.e. uint8).


Any counterexample, regardless of how farfetched, is welcome if it would cause this optimization to be behavior-changing.

ekpyron commented 2 years ago

Yeah, this is a known issue - we have experimental research towards general reasoning- and SMT-based optimizations that are intended to be powerful enough to remove the overflow checks in these (and more general) cases without special-case handling - but since it's unlikely that this will be production-ready in the immediate future, we could indeed consider a dedicated and well-defined special-case optimization for this.

The alternative we usually suggest is to define an iterator type as user-defined value type, which has only an increment operation and thereby cannot overflow by design and implement its increment using unchecked arithmetic - especially since we'll probably have user-defined operators on user-defined value types soon (I'd estimate 0.8.17), this may not affect readability (too) negatively (unfortunately we won't have ++ at least in the first version, though, so it'd have to be i = i + 1).

Even without user-defined value types, you can already now write:

function unsafe_inc(uint i) pure returns (uint) { unchecked { return i + 1; } }
contract C {
    event E;
    constructor(uint256 count) {
        for (uint256 i = 0; i < count; i = unsafe_inc(i)) {
            emit E(i);
        }
    }
}

which should have equal cost to the version with the increment moved into the loop body and which is arguably at least a bit less horrible.

All that as a rough description of the status quo. Given that none of the available solutions are at the same time overly nice and expected to be production ready in the short term, it does make sense to consider a dedicated optimization step with well-defined guarantees as suggested, though.

If not that, we should at least prominently document the currently available best-practice alternatives. I thought we had that in the docs, but on a quick search right now, I didn't actually see it.

ekpyron commented 2 years ago

We had an initial discussion about this today and decided that we want to further explore a few options.

github-actions[bot] commented 1 year ago

This issue has been marked as stale due to inactivity for the last 90 days. It will be automatically closed in 7 days.

fulldecent commented 1 year ago

Unstale.

This one is important.