ethereum / solidity

Solidity, the Smart Contract Programming Language
https://soliditylang.org
GNU General Public License v3.0
23.06k stars 5.72k forks source link

Unchecked block as an expression. #10698

Closed MicahZoltu closed 1 year ago

MicahZoltu commented 3 years ago

Abstract

Make it so unchecked { ... } blocks are an expression that return the the result of the last expression if the last thing in the block is an expression.

Motivation

Unchecked blocks are a bit unwieldy at the moment because you have to fully isolate the code inside the unchecked block from the code around it which makes it awkward to deal with. It would be nice to be able to use unchecked blocks inline, rather than having to fully isolate the unchecked code from the rest of your code.

Here are some examples of behavior that would be nice, and their counterpart in the current version of Solidity:

// proposed
for (uint8 i; i < 10; unchecked { ++i }) { ... }
uint256 apple = unchecked { banana + cherry };
uint256 durian = apple + unchecked { banana + cherry } - eggplant;

// current
for (uint8 i; i < 10;) { unchecked { ++i } ... }
uint256 apple;
unchecked { apple = banana + cherry; }
uint256 temp;
unchecked { temp = banana + cherry; }
uint256 durian = apple + temp - eggplant;

Specification

When the last statement in an unchecked block is an expression, the unchecked block becomes an expression and the return value of the last statement in the block is returned from the unchecked block.

Consider: Maybe this behavior only occurs when the last statement doesn't have a trailing ; as a way to make this a bit more explicit?

Backwards Compatibility

I believe this is a feature that is purely widening, so no backward compatibility issues should exist.

Other

In general, it would be nice if all blocks were expressions, like for loops if blocks, etc. If it is just as easy to make all blocks expressions then we can adjust this issue, but I wanted to start narrow and widen if there is desire from those with a better understanding of the internals.

leonardoalt commented 3 years ago

I was writing tests for https://github.com/ethereum/solidity/pull/10661 today and really missed this feature.

cameel commented 3 years ago

I think this would make the language nicer to use. When I was discussing the match statement for #909 with @christianparpart some time ago, we both thought that it it would be best off as an expression (like in functional languages or Rust) and, extending that even further, it would be a good thing if every block actually worked that way.

ekpyron commented 3 years ago

While I'm not necessarily saying that unchecked expressions are a bad idea in general, I'd very much emphasize that I think that for(uint i = 0; i < 100; unchecked { i++ }) is not a valid argument for this. As far as I'm concerned this should never be written. The idea behind unchecked was clearly that it's only valid use is in cases like bitwise arithmetic in cryptographic subroutines, nowhere else.

That being said, in retrospective, it is unfortunate that we enabled checked arithmetic by default before the optimizer will be able to remove all redundant checks as the one in such a loop (it will at some point) - that's of course frustrating and I can understand being dissatisfied by that. But it still (at least on its own) doesn't warrant a language change like this.

There is two zero-cost workarounds for the loop post expressions that are better than general unchecked expression:

function unsafe_inc(uint x) returns (uint) {
    unchecked { return x + 1; }
}

contract C {
    function f() public {
        for(uint x = 0; x < 10; x = unsafe_inc(x))
        {
        }
    }
}

library L {
    function unsafe_inc(uint x) internal returns (uint) {
        unchecked { return x + 1; }
    }
}

contract D {
    using L for uint;

    function f() public {
        for(uint x = 0; x < 10; x = x.unsafe_inc())
        {
        }
    }
}

Which could of course be generalized to a full UnsafeMath library.

So if we don't have good independent reasons for allowing blocks as expressions or cases that would not be solved by such an UnsafeMath library, I vote against this.

MicahZoltu commented 3 years ago

Will the compiler (currently) inline unsafe_inc in both cases? I thought it would include a jump and some stack manipulation?

ekpyron commented 3 years ago

Will the compiler (currently) inline unsafe_inc in both cases? I thought it would include a jump and some stack manipulation?

Hm, yeah, the old optimizer actually probably won't, that's a good point...

ekpyron commented 3 years ago

Will the compiler (currently) inline unsafe_inc in both cases? I thought it would include a jump and some stack manipulation?

Hm, yeah, the old optimizer actually probably won't, that's a good point...

Maybe in 0.8.1 it will, though, we're just considering it ;-).

ekpyron commented 3 years ago

Will the compiler (currently) inline unsafe_inc in both cases? I thought it would include a jump and some stack manipulation?

Hm, yeah, the old optimizer actually probably won't, that's a good point...

Maybe in 0.8.1 it will, though, we're just considering it ;-).

Well, 0.8.1 apparently was overly optimistic, especially since we need to collaborate with tooling in order for it not to break debugging... but it looks like it'll come eventually and maybe we'll get it sorted just in time for 0.8.2 at least.

jacob-eliosoff commented 3 years ago

I was writing tests for #10661 today and really missed this feature.

Not a huge deal, but I missed this feature today too (@MicahZoltu's apple/durian usage above). Making it possible to modify expressions makes it easier to write functional aka pure (or at least expression-based rather than stateful) code, which I think should generally be encouraged...

jacob-eliosoff commented 3 years ago

Doing some gas golfing today and I really missed this feature again... Too many cases where doing the gas-optimal thing turns one line of code into three.

hrkrshnn commented 2 years ago

I think we should think about this again. I would bring this up in the next design call.

frangio commented 2 years ago

As a data point, here is some recent code where lack of unchecked expressions harms readability:

function pushFront(Bytes32Deque storage deque, bytes32 value) internal {
    int128 frontIndex;
    unchecked {
        frontIndex = deque._begin - 1;
    }
    deque._data[frontIndex] = value;
    deque._begin = frontIndex;
}

(source)

Assuming we don't want to wrap everything in the unchecked block, which is good practice IMO,

  1. Because of scope we need to declare the variable outside the block.
  2. The extra indentation arguably makes it harder to read the code.

Here is what it could look like with an unchecked expression:

function pushFront(Bytes32Deque storage deque, bytes32 value) internal {
    int128 frontIndex = unchecked { deque._begin - 1 };
    deque._data[frontIndex] = value;
    deque._begin = frontIndex;
}
hrkrshnn commented 2 years ago

@frangio I saw that code yesterday! Also have been seeing similar uses in other codebases. I think we badly need this feature in the language.

chriseth commented 2 years ago

I see the point, but I think we should not have curly braces at expression level.

Your example is very interesting: You are using int128 values that always start out with 0 and can only be incremented and decremented. Because of these access restrictions to the type, it can never overflow with finite gas.

There is a comment along these lines in the documentation of the library, but isn't this an insight that should rather be coupled to the type instead of the library?

If you have a type that only defines the functions inc(), dec(), reset() and distance(x, y) and does not make use of wrap and unwrap, you don't need any unchecked except inside the functions defined on the type. You can even use the type as a key in the mapping.

The only function in the library that is not totally trivial after that is the at() function, which needs a function on the type that is a bit weird: distanceFromBegin(uint256 index, CustomType begin, CustomType end) pure returns (CustomType)

This function is essentially require(begin + index < end); return begin + index;. It returns a safe value as long as begin <= end . Your current implementation has some SafeCasts, so maybe you can add require(begin <= end) as well. Because of these checks, the function already does the bounds checking for you that you need anyway.

Yeah, maybe I'm taking this a bit too far here...

edit: distanceFromBegin returns CustomType

frangio commented 2 years ago

I see the point, but I think we should not have curly braces at expression level.

Yeah I thought about this as well. Other syntax would be fine too.


It's a good observation that we don't need unchecked if we define a new datatype with those guarantees. It's actually the reason why we have a Counter type, although that was introduced before user defined value types so it doesn't have the stronger guarantees.

Some of our tooling doesn't support user defined value types just yet so I don't think we want to implement it that way for now.

jacob-eliosoff commented 2 years ago

I'm sure there are tradeoffs I don't fully appreciate here but from a simple Solidity code monkey's pov, a lot of unchecked code I've written (egs in our WadMath lib - like @frangio's example) would be a lot nicer with unchecked expressions, even using curly braces. To the point where I'm sometimes reluctant to save the gas because it gratuitously adds several lines to my code, which I hate.

Even better would be unchecked versions of the individual operators: something like x +u y as the unchecked version of x + y. (I realize +u isn't actually a doable syntax.) It's really the individual operation I want to make a decision about, not a block.

So my 2c is, if a nicer solution comes to mind (eg along the lines of +u), great, but if you have to choose between allowing curly-braced expressions and the status quo, I'd vote for the curly braces.

jacob-eliosoff commented 2 years ago

Just to make my "It's the operation I want to make a decision about, not the block" comment more concrete: I can envision cases where the status quo code is:

uint a;
unchecked { a = b + c; }
a *= d;
unchecked { a -= e; }

And with unchecked expressions it would be:

uint a = unchecked { b + c; }
a *= d;
unchecked { a -= e; }

But with the +u-type operators I described it could be:

uint a = (b +u c) * d -u e;

This might require scratching our heads for a workable syntax but I mean shit, that last version is so much cleaner/more compact than the other two.

chriseth commented 2 years ago

Yeah, I would very much prefer the different operator than the curly braces at expression level. I think we actually discussed that at some point.

chriseth commented 2 years ago

Just for fun, I implemented the idea suggested in https://github.com/ethereum/solidity/issues/10698#issuecomment-1043221780 here: https://github.com/solidity-external-tests/openzeppelin-contracts/commit/497cdc908960f2b0c0b72a56c1eb88cee3daa565

MicahZoltu commented 2 years ago

For the "alternative operator" we could use some extended characters. That would make it very difficult to accidentally screw up.

‡⁑¬⁜※– for some ideas (note that last one is not -.

saurik commented 2 years ago

‡⁑¬⁜※– for some ideas (note that last one is not -.

@MicahZoltu OK, you've finally goaded me into the conversation: do not under any circumstance implement this feature using a character that looks almost identical to - (including ¬) but somehow isn't actually - as I can essentially guarantee to you that this will be used to purposefully hide bugs in contracts :/.

MicahZoltu commented 2 years ago

That is reasonable. Another option would be to go full emoji:

➗✖️➕➖

Unfortunately (something that makes me want to flip tables), Microsoft has a bug that makes ➕ look identical to ✖️ when using their default font (https://emojipedia.org/multiply/). /tableflip

jacob-eliosoff commented 2 years ago

Yeah what @saurik said, doppelganger characters are absolutely not the solution, we need something that's legible and unambiguous in ASCII. @alcueca suggested "!" for unchecked , etc?

jacob-eliosoff commented 2 years ago

...Or I guess to me +!, -!, *!, /! etc are the most intuitive? Would those break existing code? Eyeballing the operators currently affected by unchecked I can't see any that are already valid with a "!" appended (or prepended) - I may have missed one...

The following operators will cause a failing assertion on overflow or underflow and will wrap without an error if used inside an unchecked block:

++, --, +, binary -, unary -, *, /, %, **

+=, -=, *=, /=, %=

cameel commented 2 years ago

@jacob-eliosoff You mean something analogous to the non-null assertion operator in TypeScript? E.g. +! would be checked and revert (just like var! fails in TS when var is null) while + would be an unchecked addition?

jacob-eliosoff commented 2 years ago

Well, kind of, but I was thinking the reverse: + is checked, +! is unchecked (the exclamation mark says "You must do it!"). But I'm open to whatever syntax really, just want an expression rather than a statement.

cameel commented 2 years ago

Ah, ok. But then I'm not sure if that's really intuitive. TS people will likely expect the +! version to revert while others may expect + to revert as it does now so this syntax is actually pretty ambiguous.

jacob-eliosoff commented 2 years ago

I'll leave that debate to bigger language gurus than me, though whoever on the TypeScript side decided that adding an exclamation mark would make the operation less likely to succeed should feel very, very ashamed

frangio commented 2 years ago

@cameel I think you may be confused as to how ! works in TypeScript. The operator doesn't cause the compiler to emit any extra runtime checks, it rather tells the compiler to statically assume that the variable is not null. It may result in a runtime error down the road if this assumption is wrong, because the JavaScript runtime is typed.

+! could be seen analogously as a way to tell the compiler that it can assume the operation will not overflow, so it can omit the runtime check. But this is slightly different from the TypeScript/JavaScript case because you could say the EVM runtime is "untyped" in the sense that if the assumption is wrong and there is overflow, there will not necessarily be a runtime error.

I still think this is close enough though, +! feels like reasonable syntax for unchecked arithmetic to me.

MicahZoltu commented 2 years ago

I'm a fan of +!. To me, the ! operator means, "I know what I'm doing, don't try to protect me from myself." This is true in TypeScript, where the user is asserting to the compiler "I'm confident that this thing is not null, please don't try and be helpful by stopping me from using it in contexts where non-null is required" and in Solidity it would mean, "I'm confident this math operation will be fine at runtime, please don't add additional runtime checks to validate this for me".

cameel commented 2 years ago

Checked math operations in Solidity are assertions too. They result in a Panic just like a failed assert(), division by zero or out of bounds array access. If you do x + y, you're asserting that the result won't overflow. With x +! y you would simply not be asserting anything about the operation. And just like with ! if your assertion is false, all bets are off, you probably have a bug.

The problem is that most language users see checked math as proper input validation. Which is a whole different discussion.

That being said, I agree that the +! syntax is neat and my biggest concern is the potential for confusion. I wish we could just use +! for checked stuff and normal operators for unchecked. That would be breaking though.

jacob-eliosoff commented 2 years ago

I still definitely find it more intuitive for the ! to mean "Skip the safety check" rather than "Do the safety check". Just my 2c though. Agreed that making the !-free operators unchecked would be breaking - of course that wouldn't have been true before 0.8!

cameel commented 2 years ago

Let's wait for more opinions and see. I mean, the fact that many people here don't see a problem with +! as unchecked might mean that it's good enough after all and my concerns are overblown :)

jacob-eliosoff commented 2 years ago

That's the spirit! I'll really just be glad to have the operators, I'll get used to whatever syntax. Appreciating the discussion here.

MicahZoltu commented 2 years ago

I wish we could just use +! for checked stuff and normal operators for unchecked.

Coming from TypeScript, I would find that very confusing.

christianparpart commented 2 years ago

@cameel to me, having an operator +! looks really not so good, we'd also need that suffix to other operators as well. And that would just over-complify things. I'm highly advocating for introducing a unchecked { ... } . Its's way easier to read to me at least and less obfuscated as introducing special operators like +!.

cameel commented 2 years ago

I'm in the operator camp :) Being checked or unchecked is ultimately something that applies to a specific operation so making it granular made more sense to me from the beginning. I mean something like (a +! b) *! c *! d /! e**!f does not look great but it's good that it's a bit ugly. That's something that should stand out.

Which, incidentally, is also good argument for using ! or unchecked rather than checked operations despite what I said above...

cameel commented 2 years ago

Actually, one problem I see is that using ! makes things ambiguous because ! on its own is an operator too.

That's why my old idea was to use @. But a @+ b is a bit bulky...

jacob-eliosoff commented 2 years ago

Oh dear, I think you're right @cameel: +! is ambiguous because a+!b could mean either "unchecked a plus b" or "checked a plus not-b".

This seems like an argument for !+ over +!, since I believe ! is a prefix operator, not postfix (ie, a! on its own is not valid Solidity, unlike !a). Though I find !+ a lot less intuitive... Ah well...

jacob-eliosoff commented 2 years ago

And just to echo two other points @cameel made:

  1. Yeah I strongly favor the unchecked operator over the unchecked block. See examples above: it's usually more compact and precise to make the decision about the individual operation (though the option of larger unchecked blocks may also be worth keeping around?). It's true that it leads to distinct versions of many operators, but the decision should be made about each operation.
  2. I agree that the more obvious, unmodified operator (+) should be the checked version, with the less obvious one (eg !+) unchecked. People should have to think to do an unchecked operation, they should be checked by default.
frangio commented 2 years ago

Are there any concrete plans for this feature?

nventuro commented 1 year ago

Are there any concrete plans for this feature?

jason-gigastar commented 1 year ago

Variations of +! seems fine to me since the two operators are only valid in mutually exclusive contexts:

  1. ! boolean operation (NOT or inequality)
  2. + non-boolean operation

Related scenarios that yield errors in solidity 0.8.17:

bool a = true +! false; // Operator + not compatible with types bool and bool
bool b = 1 +!= 2; // Error: Expected primary expression
bool c = 1 +!== 2; // Error: Expected primary expression

I prefer the +! style to unchecked {...} as I find the former more concise and easy to use/understand, but maybe we can have both?

Given the amount of code that uses i++ when ++i is intended, surely those same people will probably have trouble with the +! operator but I don't think ignorance should be the reason to avoid the operator approach if we can only have 1 technique.

My background is primarily C++ but I've written in most major languages.

cameel commented 1 year ago

Are there any concrete plans for this feature?

Not in the very short term. We're being pulled in too many directions lately and currently we're trying to bring back focus to the big roadmap items. Things like this will unfortunately have to wait for their turn. Having said that, we have some things in the pipeline that might somewhat alleviate the pain here.

The first one is a hard-coded optimization for simple cases involving the loop counters. This already has a pull request: #13378.

For the general case, we'll soon be releasing custom operators and suffixes (#13718) so you'll be able to bypass those checks on user-defined value types, which should make frequent use of unchecked unnecessary in many cases as you'll be able to bake that into the type:

type UncheckedCounter is uint;

using {
    add as +,
    lt as <
} for UncheckedCounter global;

function add(UncheckedCounter x, UncheckedCounter y) pure returns (UncheckedCounter) {
    unchecked {
        return UncheckedCounter.wrap(UncheckedCounter.unwrap(x) + UncheckedCounter.unwrap(y));
    }
}

function lt(UncheckedCounter x, UncheckedCounter y) pure returns (bool) {
    return UncheckedCounter.unwrap(x) < UncheckedCounter.unwrap(y);
}

function cycles(uint c) pure suffix returns (UncheckedCounter) {
    return UncheckedCounter.wrap(c);
}

contract C {
    uint total = 0;

    function testCounter() public {
        for (UncheckedCounter i = 12 cycles; i < 20 cycles; i = i + 1 cycles)
            total += i.unwrap();
        }
    }
}
cameel commented 1 year ago

Variations of +! seems fine to me since the two operators are only valid in mutually exclusive contexts:

This seems like a bit of a hack to me. It requires making assumptions about types (which we only know later, during analysis) at parser level to disambiguate the syntax. And with custom operators (see https://github.com/ethereum/solidity/issues/13718#issuecomment-1341058649) this assumption will be broken very soon anyway. In the next release you'll be able to define both + and ! on user-defined value types.

Not saying it's impossible to resolve, but it may introduce more problems than it's worth.

nventuro commented 1 year ago

I'll admit, I had to do a double take because I wasn't sure if you were joking. Is that really the vision the solc team has for the future of the language? It seems insane to me to have so much stuff going on for a simple addition (!!!) with no extra checked arithmetic. The whole suffix and cycles bit seems particularly awful.

In a context where clarity and explicitness are paramount, surely the extremely obvious unchecked { ... } proposed solution is a better fit?

I understand wanting to increase expressiveness, and how such constructs could potentially be used to this effect. But this doesn't seem to be a great example of this being achieved.

jacob-eliosoff commented 1 year ago

@cameel's point about +! probably being too fragile makes sense to me. I think !+, though a tad uglier, is more future-proof and probably fine.

I don't agree that user-defined types are a good solution to this particular problem. A lot more people will use it, and more code will end up safely and efficiently checked or unchecked as appropriate, if these are just additional included operators.

But, I can totally understand the team has many possible changes to prioritize. I think "checked + by default, !+ if you want unchecked" is the best solution to the initial problem (which I encountered often), but none of this is crucial! Probably the only crucial thing is to avoid introducing something we'll regret. (And that the more obvious operator, eg +, remains the checked one.)

@nventuro, for limitations of the unchecked { ... } expression solution see eg my and @chriseth's comments. I am just a guy but @chriseth is not to be blown off so lightly.

cameel commented 1 year ago

I understand wanting to increase expressiveness, and how such constructs could potentially be used to this effect. But this doesn't seem to be a great example of this being achieved.

Sorry, I think I wasn't completely clear here. I didn't mean to say this is the way to handle loops with an unchecked counter now, just showing that this will now be possible. I guess I have to agree that this is not a great example. I wanted to show something along the lines of https://github.com/ethereum/solidity/issues/10698#issuecomment-1043221780 as @jacob-eliosoff mentioned. It still falls short because the new feature does not provide the full encapsulation @chriseth described that would actually guaratee that the counter cannot overflow in practice.

The operators/suffixes are really meant for implementing custom types. Being able to make the operations checked/unchecked is more of a side-effect.

frangio commented 1 year ago

The potential features discussed here are interesting but have perhaps unnecessarily turned a simpler problem into a much larger and more difficult one. Please see alternative proposal in https://github.com/ethereum/solidity/issues/13799.

ekpyron commented 1 year ago

To cut this discussion short, even though I'm aware that this won't make everyone happy: I'm closing this as not planned, since we're commited to solve this on the type-level instead, as hinted at by @cameel. I.e. we will be transitioning towards having both types on which arithmetic is checked, and types on which arithmetic is unchecked, both of which can be defined in principle in user-code, resp. be imported from our planned standard library (while the current checked-arithmetic types will remain available until then). In the long-term unchecked blocks will probably even vanish completely this way.

Note that independently, for for-loops in particular in which the check in the increment is provably by redundant by construction, we will in the mid-term not even generate the redundant checks in the first place (along the lines of https://github.com/ethereum/solidity/issues/13308), solving this concrete issue separately.

If you want to further discuss this, feel free to open a post in the forum.