Open bcmills opened 5 years ago
Note that part (1) would be trivial if we had operator overloading, and part (3) could really just be a vet
check.
The assignability rule could perhaps be eliminated from the proposal without causing too much damage. That would make the code much more verbose, but not fundamentally alter its structure.
Similarly, the conversion check might be implementable using generic constructor functions, and can be implemented (inefficiently and without type-safety) using reflection.
IMO, the interesting part of this proposal is part (2), the aggregation of overflow checks to the whole expression. That would otherwise require constructing an overflow-tracking object, using its methods in place of the existing arithmetic expressions, and remembering to check its result at the end — a lot of syntax, and much more error-prone than a regular old , ok
.
How come the unbounded
package only has a signed integer? An unsigned integer would be beneficial for some things I have in mind.
@ericlagergren An unbounded signed integer type can represent all of the unsigned values too, and is closed under subtraction.
In contrast, an unsigned type carries the risk of overflow: what would the behavior be for unbounded.Uint(0) - unbounded.Uint(1)
? The whole point of the unbounded
package is that its behavior on overflow is “not possible by definition”.
I’ll admit I haven’t read this too carefully, but I hope you’ll permit me some questions anyway:
wrapped
? How does it differ from what we have now?saturating
? (I think this was discussed elsewhere as well.)checked
as a namespace, but ISTM we could dispense with unbounded
in favor of a new predeclared type, like integer
or infinint
(joking).
- Why have
wrapped
? How does it differ from what we have now?
The wrapped
types have the same semantics as the builtin types today.
The wrapped
package exists so that we can remove the arithmetic operators from the built-in types.
Otherwise, it's much too easy to forget to convert a value to checked
and end up with (undiagnosed) wrapping behavior where you really meant for the operations to never overflow.
- In the same vein, what do you think about adding
saturating
? (I think this was discussed elsewhere as well.)
I think that would be fine. I don't know of many cases where it's actually useful, but it doesn't seem actively harmful either.
- I see the value in package
checked
as a namespace, but ISTM we could dispense withunbounded
in favor of a new predeclared type, likeinteger
orinfinint
(joking).
Indeed. I don't feel strongly about that either way — I'm happy to go with whatever makes the proposal more palatable to the folks making the final decision. 🙂
I'm sorry I haven't read everything carefully but how exactly is this backwards compatible?
Unchecked arithmetic operations on builtin integer types are a compile-time error.
var x int32 = 1<<31 - 1 y := x + 1 // compile-time error: `x + 1` is not checked for overflow
EDIT: I guess it's not intended to be? Is there any idea of how much existing code this would break?
Unchecked arithmetic operations on builtin integer types are a compile-time error.
If I'm reading this correctly essentially every existing Go program would become invalid, because they all have arithmetic operations on builtin integer types. That seems like a very heavy lift.
@ianlancetaylor do you consider backwards-incompatible features which are trivially accommodated by go fix to be problematic? (Not sure whether this particular concept is easily go-fixable...)
Running go fix
can certainly help, but it's not a panacea. We also have to consider all existing books, documentation, tutorials, etc.
I'd prefer to see plain int
become arbitrary-precision (#19623). Usually when I'm writing code that uses integers I don't want to choose whether I want it them to wrap or panic on overflow. I'd rather just get the (mathematically) correct result by default without having to type unbounded.Int
everywhere to get it, and use the fixed-size types - either the existing ones or the ones in this proposal - if I need to optimise.
Even in the latter case, I feel like this proposal would make bounded arithmetic code overly verbose.
As an aside, why disallow XOR and complement on unbounded ints? Assuming they're signed, aren't both operations well-defined and guaranteed to yield a finite result?
If I'm reading this correctly essentially every existing Go program would become invalid, because they all have arithmetic operations on builtin integer types.
Probably every program, but not every package: in particular, packages that use range
loops rather than indices for iteration would be mostly unaffected. (The point of the breaking change is to prompt code owners to make an explicit decision about the overflow behavior they intend.)
Also note that it would be possible to adopt parts (1) and (2) of this proposal — the new packages and checked assignment — without the breaking change of part (3). That would provide a much smaller benefit, but at a much smaller corresponding cost.
I'd prefer to see plain
int
become arbitrary-precision (#19623).
Note that that proposal does not comply with the migration strategy outlined in https://github.com/golang/proposal/blob/master/design/28221-go2-transitions.md#language-redefinitions.
As an aside, why disallow XOR and complement on unbounded ints? Assuming they're signed, aren't both operations well-defined and guaranteed to yield a finite result?
Interesting question. Mostly I think it's weird to apply two's-complement arithmetic when the sign bit is infinitely far away, but I suppose it is still well-defined. Perhaps they would be ok after all.
It’s basically a big.Int, and that has XOR.
I am aginst part 3 of the proposal, since that would take us too far. But I am in favor of part 1 and 2, particularly for checked types. I don't care too much about wrapping types. Now, if we combine part 1 and 2 with my ranged types proposal here: https://gist.github.com/beoran/83526ce0c1ff2971a9119d103822533a, then we could write:
// Package checked defines integer types whose arithmetic operations and conversions
// panic on overflow.
package checked
type Uint8 = range uint8[0:255]
type Uint16 = range uint16[0:65535]
type Int8 = range int8[-127:127]
type Int16 = range int16[-32767..32768]
So, with my proposal, that will include the result, ok :=
form, the checked part of this proposal becomes a package that is user-implementable. And, arguably, this approach is more useful because you can also have checked types with a narrower range of values.
@beoran I like the generalization to ranged types, but note that there is one subtle mismatch between this proposal and yours. Your proposal requires:
For all ranged types, whenever possible the compiler checks at compile time that any assignments to a variable of a ranged type respects either the bounds or is one of the enumerated values, and emits a compile error if the value of a variable or constant can be proven to be not in range.
This proposal explicitly does not: the overflow check in this proposal is always dynamic, because that's much easier to implement portably, and it is possible that the part of the code in which the overflow occurs may not actually be reachable (for example, because it is on a code path that is only reachable on an architecture with a different-sized uintptr
).
I see, ease of implementation is also an important element. I am willing to change my proposal to allow the compiler to make the check always dynamic for an assignment to a variable in a function.
For an assignment to a constant, the range check should always be performed at compile time.
I not sure how the range check should be done for package level variables. Sure, a panic is possible but might be confusing.
I amended the proposal. Now I propose the range check is on compile type for constants, and run time for variables, although the compiler may still do the range checking at compile time (implementation defined). I also detailed the spec of len(range), worked out other details more. I also disallowed complex underlying types for range types and for range boundaries since for complex64 and complex128, the meaning of "range" becomes quite complex, indeed.
I would argue that the comma-ok idea suggested here, part 2 of the proposal, should not be done initially. I argue this on two points. First, I think that few people will actually want that feature. Second, for those people who really do want it, they can already write it by using a deferred function that calls recover
. So I think it's appropriate to simply postpone that part of the proposal indefinitely, until we have a clearer understanding of how often would want to use it.
I would be ok with omitting the , ok
form for now.
I think its major use-cases are for down-casting checked
types (for which an explicit range check isn't a lot of extra code) and for omitting double-conversions when writing isolated arithmetic expressions in mostly-non-arithmetic (or mostly-checked) blocks of code, but the alternatives using explicit conversions for those use-cases don't seem terrible.
runtime
package:This is why I feel ranged types are an important feature of Ada and other Wirth-like languages: they help writing correct programs without overflow. As it stands now, in Go, like other C inspired languages, it is still too easy to accidentally cause overflow. Go would do well to adopt range types to prevent this.
My immediate impression is that having logical shifts not effect the ok
in checked assignments is... well I don't mean that personally, but I had a very visceral "this will negatively impact people" reaction that made me really want to call it "asinine".
It is inconsistent with the semantics/"feel" of every other operation on integers that can cause logical result bits outside the size of the integer type, in a non-obvious way that I guarantee people will accidentally forget.
(From all my experience carefully paying attention to how human cognition interacts with code, I predict we'd see non-negligible amounts of situations where people accidentally intuitively rely on qux, ok := foo << bar
to set ok
to false
if foo
has significant ones shifted out for unsigned integers too. I would bet a lot of money on it if I could be guaranteed that we would know about it whenever it happens.)
Also, if we wanted to do a logical shift while checking that we shifted bits out, we'd have to do our own before-doing-the-operation checks. But if instead logical shift did effect the ok
in checked assignment, and we wanted to ignore if we shifted bits out, the worst case is we'd just have to pull shift subexpressions out into their own assignments. So the worst case code impact seems way worse with logical shift not effecting the ok
of checked assignment.
Maybe that doesn't matter because most code just doesn't do the kind of bit twiddling that would make this matter, and it looks from the other comments that the checked assignment part of the proposal is maybe already on the chopping block anyway, but I wanted to voice that.
// Package unbounded defines an arbitrary-precision integer type with unbounded range.
// Unbounded types do not support bitwise XOR, complement, or clear operations.
XOR was discussed earlier (and since math/big.Int has, I think this should have it too).
Why no clear? Once again, math/big.Int has this: https://golang.org/pkg/math/big/#Int.SetBit
I would argue that the comma-ok idea suggested here, part 2 of the proposal, should not be done initially. I argue this on two points.
I’d like to push back on this. I think the comma-ok idea is very useful, even if just by itself.
First, I think that few people will actually want that feature.
I think more folks would use checked arithmetic were it more approachable. There isn’t any easy way to do it right now.
You can use math/bits, but it returns the results in the opposite order you’d expect for checked multiplication: (hi, lo) vs (x, ok). And it doesn’t make your intentions clear.
You can use a third-party package, but now you’re writing x, ok := checked.Add(a, b) which gets very tiresome if you have more than just two integers, like a + b c / d. (math/bits suffers from this as well.) Worse, it makes the code more difficult to read, and one of Go’s strengths is its readability. I’ve written a lot* of math/big code and even I frequently lose my place trying to follow code with more than a little arithmetic.
Second, for those people who really do want it, they can already write it by using a deferred function that calls
recover
.
I think using defer and recover to handle overflow is too clever and too subtle.
It also makes the code more difficult to follow since defer blocks are almost always far away from the code causing the panic. It’ll be difficult to handle some overflows in one way, and some in others.
Additionally, using defer and recover has a much more significant performance impact than comma-ok. Open-coded defers are pretty cheap, but they’re not free. However, comma-ok is free: the only cost is checking the overflow flag, which you have to do anyway with the defer recover strategy.
This looks interesting.
Would there be a way, with Go 1, to build code which makes integer overflows panic ? Or should I just try to parse AST and add checks before each arithmetic operations ?
Can this be implemented as a simple bound checks? It could be more versatile since it should allow users to specify their custom bound range.
It don't even have to be new types (though I like Beoran's suggestion), just few functions should do it, and no need to change how the language works.
Simplified idea:
// Define all supported integer types
type Integer interface { ~int | ~uint | ~int8... }
// Integer operations, will set `ok` to `false` if `result` is out of specified bound (via `min` and `max`)
func BoundedAdd[I Integer](a I, b I, min I, max I) (result I, ok bool)
func BoundedMinus[I Integer](a I, b I, min I, max I) (result I, ok bool)
func BoundedMultiply[I Integer](a I, b I, min I, max I) (result I, ok bool)
func BoundedDivide[I Integer](a I, b I, min I, max I) (result I, ok bool)
func...
// Handy function, panics if `ok` == false
func Must(result I, ok bool) (result I)
This might also allow us to mix different Integer types, and operate on them safely:
func Cast[I Integer, O Integer](in I) (out O, ok bool)
// BoundedAdd how takes two different integer types `I` and `O`, and output the result as a `O` type value
func BoundedAdd[I Integer, O Integer](base O, add I, min O, max O) (result O, ok bool)
EDIT: It's basically the same as the one proposed in 60858 and 24853. Should have checked before posting, sorry.
While this is an interesting idea, we can't realistically make such a large breaking change to the behavior of int. A similar argument is made regarding arbitrary precision ints in https://github.com/golang/go/issues/19623#issuecomment-2460950122. By contrast, issue #30613 proposes a potential path forward for checked ints that does not change the behavior of existing programs.
Therefore, this is a likely decline. Leaving open for four weeks for final comments. — @findleyr for the language proposal review group
This proposal is intended to accomplish the same goals as #19624, but in a way that complies with the migration strategy outlined in https://github.com/golang/proposal/blob/master/design/28221-go2-transitions.md#language-redefinitions.
See #19624 for background.
Summary This proposal would:
checked
,wrapped
, andunbounded
, containing integer types with distinguished semantics to be defined in the language spec., ok
form (as described in #19624) to explicitly check for overflow in expressions of any bounded numeric type.New packages
The three new packages each provide a set of integer types:
Defined types
Defined types that have
checked
,wrapped
, orunbounded
types as their underlying type have the same operations and behavior as the underlying type.Defined types that have builtin integer types as their underlying type have the same behavior as the underlying type, except that they are not mutually-assignable with
checked
orwrapped
types.Checked assignment
A checked assignment uses the form
x, ok = <expression>
orx, ok := <expression>
, where<expression>
can comprise any number of arithmetic, bitwise, logical, and/or conversion operations yielding achecked
,wrapped
,unbounded
, user-defined, or builtin integer type. The operations in a checked assignment do not panic even if they involvechecked
or builtin integer types. Theok
result, which may be assigned to any boolean type, indicates whether any arithmetic operation or conversion within the expression overflowed. Thex
result has the type of<expression>
and a value computed using two's-complement wrapping.Bitwise operations and logical shifts do not overflow, and therefore do not set the
ok
result. (Recall that signed shifts are defined to be arithmetic, while unsigned shifts are defined to be logical.)Unchecked arithmetic operations on builtin integer types are a compile-time error.
Conversion
Any integer type can be explicitly converted to a
checked
,wrapped
, orunbounded
integer type. Achecked
,wrapped
, orunbounded
integer type can be converted to a builtin type only if either the conversion is checked or the destination type can represent all possible values of the source type.An unchecked conversion to a
checked
type panics if the value cannot be represented in the destination type.A conversion to a
wrapped
type wraps if the value cannot be represented in the destination type, even if the operand is of a largerchecked
type.A conversion to
unbounded.Int
from any integer type always succeeds. The conversion is applied after the operand is fully evaluated.Assignability
Each sized type in the
checked
andwrapped
package is mutually assignable with the corresponding builtin sized type, but not with the type in the opposing package, nor with defined types of any underlying integer type.This allows functions to perform operations on
checked
orwrapped
types, but to expose and use the corresponding builtin types at API boundaries (with less syntactic overhead for all involved).Arithmetic operators
The type of an arithmetic expression depends on its operands. If both operands are of
checked
,wrapped
,unbounded
, or defined types, then they must have exactly the same type. If one operand is achecked
,wrapped
, orunbounded
integer or a defined type with one of those types as its underlying type, and the other is either a builtin integer or untyped constant assignable to the first, then the result of the operation is thechecked
,wrapped
,unbounded
, or defined type.If both operands are of a builtin integer type or a defined type with a builtin integer as its underlying type, the expression must be a checked assignment, bitwise operator, or logical shift, and its result is the same type as the operands.
Constants
An untyped integer constant can be assigned to any
checked
,wrapped
, orunbounded
integer type that can represent its value.Unfortunately, in order to comply with https://github.com/golang/proposal/blob/master/design/28221-go2-transitions.md#language-changes, the inferred type of a variable initialized from an untyped constant must remain the built-in
int
.