golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.15k stars 17.69k forks source link

cmd/vet: add math check for erroneous math expressions #30951

Open dr2chase opened 5 years ago

dr2chase commented 5 years ago

Go constants can be extended to support -0, NaN, +Inf, and -Inf, as constant values. There is no special name for any of these constants; this is support for values, not new tokens. The model for this is mythical "IEEE Reals", which is the combination of Real numbers (not floating point) with the rules for IEEE -0, NaN, +Inf, and -Inf added where they fit, in accordance with the rules for IEEE floating point.

Why should we do this?

  1. It brings us closer to no-surprises support for IEEE floating point. I would say "zero surprises" except that there are still cases where large-precision constant arithmetic provides different answers; I think we handle that using the "IEEE real numbers" explanation.
  2. A corollary of the first, is that numerically oriented people and code coming from other languages will experience one fewer surprise or speed bump in their use of Go. Sure, the workaround is easy, but it's not as easy as not needing to know it.
  3. It brings constant evaluation slightly closer to run-time evaluation of Go programs; one more difference/surprise is avoided.
  4. It breaks zero programs except for tests targeted specifically to this behavior.
  5. I have never, ever, heard of support for these constant values causing problems for anyone in any other language that supports them.

The following conversion-from-constant-value rules are added: when converting to integer, -0 becomes 0, and NaN and Inf become errors. When converting to floating point (or component of a complex number), the obvious mapping to IEEE floats occurs. In the case of tiny negative fraction underflow when converting to float, -0 results; in the case of positive or negative overflow on conversion to float, +Inf or -Inf results, as appropriate.

This arithmetic rules are enumerated in the following tables, which were checked against the IEEE specification and run-time arithmetic evaluation in both Java (expected to conform to IEEE) and Go (since this commit for 1.13 , believed to conform to IEEE).

Legend:

R = real numbers
P = { x in R such that x > 0} (but not +Inf)
N = { x in R such that x < 0} (but not -Inf)
Z = 0
NZ = -0
PI = positive infinity
NI = negative infinity
NaN = not-a-number

In the tables below, "R" as a result means to use the rules for real numbers. "RZ" means use the rule for real numbers with NZ replaced by Z.

addition P N Z NZ PI NI NaN
P R R R RZ PI NI NaN
N R R R RZ PI NI NaN
Z R R R Z PI NI NaN
NZ RZ RZ Z NZ PI NI NaN
PI PI PI PI PI PI NaN NaN
NI NI NI NI NI NaN NI NaN
NaN NaN NaN NaN NaN NaN NaN NaN
subtraction (column - row) P N Z NZ PI NI NaN
P R R R RZ NI PI NaN
N R R R RZ NI PI NaN
Z R R R Z NI PI NaN
NZ RZ RZ NZ Z NI PI NaN
PI PI PI PI PI NaN PI NaN
NI NI NI NI NI NI NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN
multiplication P N Z NZ PI NI NaN
P R R Z NZ PI NI NaN
N R R NZ Z NI PI NaN
Z Z NZ Z NZ NaN NaN NaN
NZ NZ Z NZ Z NaN NaN NaN
PI PI NI NaN NaN PI NI NaN
NI NI PI NaN NaN NI PI NaN
NaN NaN NaN NaN NaN NaN NaN NaN
division (row/column) P N Z NZ PI NI NaN
P R R PI NI Z NZ NaN
N R R NI PI NZ Z NaN
Z Z NZ NaN NaN Z NZ NaN
NZ NZ Z NaN NaN NZ Z NaN
PI PI NI PI NI NaN NaN NaN
NI NI PI NI PI NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN
remainder (row REM column) P N Z NZ PI NI NaN
P R R NaN NaN x x NaN
N R NZ NaN NaN x x NaN
Z Z Z NaN NaN Z Z NaN
NZ NZ NZ NaN NaN NZ NZ NaN
PI NaN NaN NaN NaN NaN NaN NaN
NI NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN
equality P N Z NZ PI NI NaN
P R R R RZ F F F
N R R R RZ F F F
Z R R R RZ F F F
NZ RZ RZ RZ RZ F F F
PI F F F F T F F
NI F F F F F T F
NaN F F F F F F F
less than (row < column) P N Z NZ PI NI NaN
P R R R RZ T F F
N R R R RZ T F F
Z R R R RZ T F F
NZ RZ RZ RZ RZ T F F
PI F F F F F F F
NI T T T T T F F
NaN F F F F F F F
ianlancetaylor commented 5 years ago

Currently constant division by zero is a compile time error. So there is no way to write a constant expression that produces a floating point infinity or a NaN. So I assume you are suggesting that the language be changed such that a constant division by zero produces infinity or NaN rather than a compile-timer error.

dr2chase commented 5 years ago

Correct, that is my intent. Division by [-]0 yields NaN or +Inf or -Inf, depending on the dividend.

MichaelTJones commented 5 years ago

In this same arena would be a standard way to specify quiet and signalling NaNs, and to specify/test the 51-bit NaN payload for float64s. If you want to add this to the proposal that seems natural, if not, I could open a separate proposal.

dr2chase commented 5 years ago

Do other programming languages support this in their constant arithmetic? Those seem more like run-time or library changes to me, and I'm not sure what Go would do with a signalling NaN (panic, probably).

My main goal here is a net increase in adoption and usability -- don't change constants in a way that makes them radically different from what they were (harder to learn/explain), and make the changes align with existing run time behavior, floating point standards, and what other languages do for constants, so that there's actually less to learn overall.

griesemer commented 5 years ago

The notion of "exact" (for all practical purposes) constants in Go has hugely simplified dealing with numeric constants in code, because arithmetic behaves (essentially) like it would in math. We do not distinguish between 0 and 0.0 as long as we don't assign those values to a variable. If we start to distinguish between -0.0 and 0.0, do we also distinguish between -0 and 0? We would have to, for consistency. But x := -0; y := float64(x) would not produce the same y value as y := -0.0 anymore, which is going to be harder to explain than what we have now.

Having written my share of floating-point code I agree that it would be convenient to write x := -0.0 and get a negative zero float64. But it's rare, and it's trivial to write zero := 0.0; x = -zero. After all, such things concern people who are intimately aware of floating-point subtleties anyway, so they can handle this. We shouldn't bother the vast majority with extra complexity by introducing a negative zero into our constants arithmetic.

Needless to say that adding inf or nan is even less warranted in my mind. The benefit of keeping it simple by far outweighs the very rare convenience of being able to write a constant rather than having to "compute" a value at runtime. Let's not fall for the lure of just a "tiny bit of extra complexity".

I am against this proposal.

dr2chase commented 5 years ago

I propose that in constant land -0 and -0.0 both convert to the same float (-0.0).

griesemer commented 5 years ago

@dr2chase So you're saying -0 is always -0.0 (i.e., has a default type of float64)? So in x := -0 the type of x is float64?

dr2chase commented 5 years ago

-0 has a default type of integer, but float64(-0) === float64(-0.0) (using === to distinguish from Go or IEEE == rules). I think that's simpler and a consistent extrapolation of the Go constant rules to include these additional values.

griesemer commented 5 years ago

So you're saying that an -0 behaves like a 0 int but it becomes a -0.0 in floating-point context. Agreed that that would be easy enough to do. Internally (compiler) we'd have to keep track of the sign of zero integers, which is something we don't do at the moment. We'd also have to incorporate the sign rules in all integer operations that produce a zero, for consistency, so they behave "correctly" in floating point context where they might be a negative zero. More annoying.

But that still adds extra complexity because now we have to explain all the combinations of operations involving negative zeros in constant arithmetic (even if inf and nan are ignored). And not just for floating-point constants but all numeric constants since they might be used in floating-point context. I don't even want to think about the rules for complex numbers.

This adds real complexity for little benefit. I maintain that the majority of programmers don't need to care about negative zeroes, and making Go constants follow IEEE rules only makes life complicated.

The reason for -0.0 in IEEE floating-point computations is that floats may underflow to 0. Knowing that it's -0 provides some residual information (the value is negative, but so close to zero that we can't represent it anymore). There's no such thing as underflow in Go constants.

MichaelTJones commented 5 years ago

Robert’s arguments are regrettably persuasive. I dream on the “just perfect” solution for many things but the counterpoint argument about “will it be confusing on average” is powerful. My exposure to confusions in GoNuts has made me far more appreciative of such matters. (Imagine If presence of synchronization primitives made a struct impossible to pass by value, for example)

On Fri, Apr 19, 2019 at 11:19 AM Robert Griesemer notifications@github.com wrote:

So you're saying that an -0 behaves like a 0 int but it becomes a -0.0 in floating-point context. Agreed that that would be easy enough to do. Internally (compiler) we'd have to keep track of the sign of zero integers, which is something we don't do at the moment. We'd also have to incorporate the sign rules in all integer operations that produce a zero, for consistency, so they behave "correctly" in floating point context where they might be a negative zero. More annoying.

But that still adds extra complexity because now we have to explain all the combinations of operations involving negative zeros in constant arithmetic (even if inf and nan are ignored). And not just for floating-point constants but all numeric constants since they might be used in floating-point context. I don't even want to think about the rules for complex numbers.

This adds real complexity for little benefit. I maintain that the majority of programmers don't need to care about negative zeroes, and making Go constants follow IEEE rules only makes life complicated.

The reason for -0.0 in IEEE floating-point computations is that floats may underflow to 0. Knowing that it's -0 provides some residual information (the value is negative, but so close to zero that we can't represent it anymore). There's no such thing as underflow in Go constants.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/golang/go/issues/30951#issuecomment-484976260, or mute the thread https://github.com/notifications/unsubscribe-auth/AB4DFJNBTX56C7VQJPK532LPRIEMPANCNFSM4G76NVUQ .

--

Michael T. Jonesmichael.jones@gmail.com michael.jones@gmail.com

ianlancetaylor commented 5 years ago

Another possibility along these lines would be simply disallow negative floating point zero as an untyped constant. That would avoid confusing people who write x = -0.0 or var x float64 = -0.

griesemer commented 5 years ago

I like @ianlancetaylor's suggestion. It is unlikely, but it may brake existing code (albeit it would be trivial to fix by removing the -). Perhaps we should start with a vet check?

dr2chase commented 5 years ago

A vet check would be okay, that would at least prevent people who intended to get IEEE -0.0 from being surprised (assuming that they use vet).

bcmills commented 5 years ago

assuming that they use vet

If we add the check to the suite run during go test, that would cover a substantial fraction of code.

rsc commented 5 years ago

I'm OK with landing a vet check for floating-point constant -0 at the start of Go 1.14, and we can see what happens. I wonder if instead of adding a new check specifically for that, we should start a more general 'math' vet check that could absorb the current 'dumb shift' checks as well.

rsc commented 5 years ago

Retitled to make this about adding a new 'math' vet check. That will include:

I think at the same time we should investigate adding checks for 2^n as well. (See the recent https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90885.)

smasher164 commented 5 years ago

Mind if I take a crack at this? I have a working analysis pass to check for negative floating point zero, but seeing as it's my first vet check, I'm not sure if it's too lenient or too exhaustive. Additionally, would it be better to send a separate CL that moves the oversized shifts pass into the math check?

Edit: I now have the 2^n check implemented as well. Now I'm not so sure if its possible to absorb the shift checks without violating backwards compatibility.

gopherbot commented 5 years ago

Change https://golang.org/cl/197937 mentions this issue: go/analysis: add math pass

dmitshur commented 4 years ago

This issue is currently labeled as early-in-cycle for Go 1.15. That time is now, so friendly ping. If it no longer needs to be done early in cycle, that label can be removed.

smasher164 commented 4 years ago

Moving to Go1.16 both because we've entered the freeze, and the CL still needs to be reviewed and revised.