Closed jfcg closed 5 years ago
If people would just learn to write them like this:
if 'a' <= c && c <= 'z' ...
or, for exclusion:
if c < 'a' || 'z' < c
the benefits of this suggestion would be very small.
I believe the rule here would be that a cmpop1 b cmpop2 c is syntactic sugar for tmp := b; a cmpop1 tmp && tmp cmpop2 c. Here the cmpop operators may be any of <, <=, >=, >.
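To make the rule above concrete, here is a minimal sketch in today's Go. The names mid, chained, repeated and the calls counter are made up for illustration. It spells out the tmp-based desugaring by hand and contrasts it with the version that repeats the middle operand.

```go
package main

import "fmt"

// calls counts how many times mid() runs; it exists only for this demo.
var calls int

// mid stands in for an arbitrary middle operand b that has a side effect.
func mid() int {
	calls++
	return 5
}

// chained spells out the desugaring of the hypothetical a <= mid() < c:
// the middle operand is evaluated once and reused on both sides.
func chained(a, c int) bool {
	tmp := mid()
	return a <= tmp && tmp < c
}

// repeated is the form without a temporary: mid() may run twice.
func repeated(a, c int) bool {
	return a <= mid() && mid() < c
}

func main() {
	calls = 0
	fmt.Println(chained(1, 10), calls) // true 1

	calls = 0
	fmt.Println(repeated(1, 10), calls) // true 2
}
```

The only observable difference between the two forms is how often the middle operand is evaluated, which matters only when it has side effects or is expensive.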
@robpike While that looks enticing, keeping the variable on the left side is often the most common thing to do, and switching between having it on the left and right, especially in a single expression, makes me second/third guess myself on my "greater than"s and my "less than"s.
I am still unsure whether I like this proposal. While I've definitely thought "man, a language where you could chain comparisons would be cool" (I did not know Python did this), I don't think it quite fits into Go.
I can say that when chained even further (not sure this is really a use case), it definitely is easier to understand:
x := foo() < bar() > baz() < buzz()
that may be hard to understand, but this is harder in my eyes:
var x bool
barVal := bar()
if foo() < barVal {
	bazVal := baz()
	if barVal > bazVal {
		if bazVal < buzz() {
			x = true
		}
	}
}
Granted, complexity should grow vertically, not horizontally. I also could possibly have made it a bit clearer by doing something like if foo() >= barVal { break block }, which would get rid of the "indentation hell" problem.
Either way, it seems like a nifty feature, but it's not a need-to-have.
I think it is best to allow chaining for monotone relations. x < y >= z is not what it looks like. It could be misread as an interval check. Interval checks would become really comfortable to read with chaining. Anything beyond that does not fit Go well, I think.
There is another advantage related to IEEE floats. Think of a typical function checking its parameters:
func Myfun1(x float64) {
	if x < L1 || x >= L2 { // check for [L1, L2)
		// report bad param
		return
	}
	...
}
This kind of early return is good practice & very common. Now check this out:
package main

import (
	"fmt"
	"math"
)

var L1, L2 = 1.2, 4.5

var cand = []float64{
	math.Inf(-1), L1 - .7, L1, (L1 + L2) / 2,
	L2, L2 + 1, math.Inf(1), math.NaN()}

func main() {
	// testing [L1, L2)
	fmt.Println("Testing for [", L1, ",", L2, ")\n")
	for _, v := range cand {
		if v < L1 || v >= L2 {
			fmt.Println(v, "\tout")
		} else {
			fmt.Println(v, "\tin")
		}
	}
}
This is the output:
Testing for [ 1.2 , 4.5 )
-Inf out
0.5 out
1.2 in
2.85 in
4.5 out
5.5 out
+Inf out
NaN in
NaN, the notorious design flaw in IEEE 754, passes the test O_o What we should have written is:
if !(L1 <= v && v < L2) {
which fixes the test:
Testing for [ 1.2 , 4.5 )
-Inf out
0.5 out
1.2 in
2.85 in
4.5 out
5.5 out
+Inf out
NaN out
Logically those are the same tests, but NaN breaks mathematical logic. This is why (among other reasons) it is a notorious design flaw! With chained interval checks, we comfortably write:
if ! L1 <= v < L2 {
and everything works ;)
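Until something like that exists, the same effect is available in today's Go by putting the positive form of the check in a small helper and negating the call. This is only a sketch; the helper name inHalfOpen is made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// inHalfOpen reports whether lo <= v < hi. Every ordered comparison that
// involves NaN is false, so a NaN argument is never reported as "in".
func inHalfOpen(lo, v, hi float64) bool {
	return lo <= v && v < hi
}

func main() {
	const lo, hi = 1.2, 4.5
	for _, v := range []float64{0.5, 1.2, 4.5, math.NaN()} {
		if !inHalfOpen(lo, v, hi) {
			fmt.Println(v, "out")
			continue
		}
		fmt.Println(v, "in")
	}
}
```

Because the helper is written in the "accept" direction, negating it rejects NaN along with everything outside the interval.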
I'm somewhat concerned about the short-circuiting behavior. In an expression like f1() < x < f2() we will see that f1 is always called but f2 is only sometimes called. I think that is potentially confusing.
It is similar to g1() && g2(). Any programmer learns that g1() will and g2() may be called. If the programmer definitely needs to call f2, she can always write f2() > x > f1().
So I don't think that's any different.
It's different because && and || always short-circuit. < and friends only short-circuit in a specific case.
Sorry, mobile phone. Ian, can you explain what you mean with an example?
When I write the expression f1() && f2() I know that f2 will only be called if f1() returns true. This is true no matter where the expression occurs.
When I write the expression f1() || f2() I know that f2 will only be called if f1() returns false. This is true no matter where the expression occurs.
When I write the expression f1() < f2() I know that both f1 and f2 will always be called. Unless I happen to write v < f1() < f2(), in which case that is no longer true. In v < f1() < f2(), f1 is always called but f2 is only called if v < f1() is true.
My point is simply that && and || have consistent behavior with regard to short-circuiting, regardless of what is around them. In this proposal as I understand it, that is not true for < and friends. They short-circuit depending on the context in which the expression appears.
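The difference can be traced in today's Go by writing the proposed expansion out by hand. In this sketch the trace slice and the functions plain and chained are hypothetical names, and chained assumes the expansion v < tmp && tmp < f2() discussed earlier in the thread.

```go
package main

import "fmt"

// trace records the order in which the operand functions run.
var trace []string

func f1() int { trace = append(trace, "f1"); return 1 }
func f2() int { trace = append(trace, "f2"); return 2 }

// plain is an ordinary comparison: both operands are always evaluated.
func plain() bool {
	return f1() < f2()
}

// chained spells out the assumed meaning of v < f1() < f2().
// f2 runs only when v < f1() holds.
func chained(v int) bool {
	tmp := f1()
	return v < tmp && tmp < f2()
}

func main() {
	trace = nil
	fmt.Println(plain(), trace) // true [f1 f2]

	trace = nil
	fmt.Println(chained(0), trace) // true [f1 f2]

	trace = nil
	fmt.Println(chained(5), trace) // false [f1]
}
```

In the last call the comparison tmp < f2() is never reached, so f2 is skipped even though the source expression contains no visible && or ||.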
With f1() && f2(), f1 will and f2 may be called.
With g1() || f1() && f2(), g1 will and f1, f2 may be called.
If you put something in front of an expression, it changes execution possibilities. I don't see this as an inconsistency but a feature of the language.
So I disagree that people would be confused by the "f2 may be called" reality.
Also x < f1() < f2() is just a convenient rewrite of x < f1() && f1() < f2() except you call f1 once. If you want to call f1 possibly twice, you use the latter.
It is still the && operator that does the short-circuiting.
There is also the issue of number of components allowed for chaining. These are the possibilities:
A) Just 3 components like x < y <= z
B) Up to 4 components like x < y <= z < q
We can also use a similar bash script to identify these:
# Finds 4 component monotone comparisons
name='( [a-zA-Z_][a-zA-Z_0-9.*/+%()-]* )'
nmrf='(?:\1|\2)'
n2rf='(?:\3|\4)'
less='<=?'
more='>=?'
keyw='(?:if|for|case).*'
logi='[^|&]*(?:&&|\|\|)[^|&]*'
patl=(
"$keyw(?:$less$name|$name$more)$logi(?:$nmrf$less$name|$name$more$nmrf)$logi(?:$n2rf$less|$more$n2rf)"
"$keyw(?:$more$name|$name$less)$logi(?:$nmrf$more$name|$name$less$nmrf)$logi(?:$n2rf$more|$less$n2rf)"
)
s=0
for pat in "${patl[@]}"; do
	r=$(grep -Pr --include '*.go' "$pat" . | grep -c .)
	let s+=r
	grep -Prm 1 --include '*.go' "$pat" . | head -n 3
done
echo "$s+ cases"
These are much rarer as expected:
examples from go:
./src/unicode/utf16/utf16.go: if surr1 <= r1 && r1 < surr2 && surr2 <= r2 && r2 < surr3 {
./src/reflect/value.go: if i < 0 || j < i || j > s.Len {
./test/slice3.go: if iv > jv || jv > kv || kv > Cap || iv < 0 || jv < 0 || kv < 0 {
5+ cases
examples from kubernetes:
./vendor/gonum.org/v1/gonum/mat/vector.go: if i < 0 || k <= i || v.Cap() < k {
1+ cases
Only 6+ cases in the eleven Go code bases that I've checked. Possibly a couple of dozen in all publicly available Go code.
C) 5+ components. I saw one in Go, first example above.
If this proposal were accepted, my vote would be for B because:
x < y <= z < q <= r can be written as x < y <= z && z < q <= r
x < y <= z < q <= r < t can be written as x < y <= z && z < q <= r < t
etc. So 5+ component comparisons can still benefit a lot from chaining. I think this is the right balance. So what do you think?
It's also worth noting that a < b < c is very different from a < b != c. The latter is valid Go today, comparing two boolean values.
In the first case c must be of an ordered type, in the second c must be bool. So say if you mistyped a <= as !=, it won't compile. If c is a bool, the second still stands valid with monotone chaining, right?
Do you mean the compiler gets some extra difficulty distinguishing these cases?
a < b < c means a < b && b < c but a < b != c doesn't mean a < b && b != c, which is confusing at least - the meaning of the latter we cannot change for backward-compatibility.
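For reference, this is how a < b != c behaves in Go today. The comparison operators all share one precedence level and associate left to right, so the expression is (a < b) != c. The function name ltNe below is made up for the sketch.

```go
package main

import "fmt"

// ltNe evaluates a < b != c exactly as written. Go parses it as
// (a < b) != c, so c has to be a bool.
func ltNe(a, b int, c bool) bool {
	return a < b != c
}

func main() {
	fmt.Println(ltNe(1, 2, false)) // true:  (1 < 2) != false
	fmt.Println(ltNe(1, 2, true))  // false: (1 < 2) != true
	fmt.Println(ltNe(2, 1, true))  // true:  (2 < 1) != true
}
```

A chained reading, a < b && b != c, would not even type-check here, because b is an int and c is a bool.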
Also, currently, the comparison operators simply follow the rules for other binary operators, so we'd have to introduce an irregularity there.
It's easy to apply De Morgan's laws with the current definition. If we allow a < b < c, the negation would be a >= b || b >= c which is not the same at all as a >= b >= c - another source of confusion. Applying De Morgan is a common operation when restructuring conditional code. Doing it in one's head is also a common operation when thinking about invariants of loops, etc.
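The point about negation can be checked mechanically for integers. In this sketch (all function names are hypothetical), between is the expanded form of a < b < c, deMorgan is its correct negation, and flipped is what a >= b >= c would mean under the same chaining rule.

```go
package main

import "fmt"

// between is the expanded form of the hypothetical a < b < c.
func between(a, b, c int) bool { return a < b && b < c }

// deMorgan is the correct negation of between.
func deMorgan(a, b, c int) bool { return a >= b || b >= c }

// flipped is what a >= b >= c would mean under the same chaining rule.
func flipped(a, b, c int) bool { return a >= b && b >= c }

func main() {
	agree, differ := 0, 0
	for a := 0; a < 3; a++ {
		for b := 0; b < 3; b++ {
			for c := 0; c < 3; c++ {
				// De Morgan must hold for every triple.
				if !between(a, b, c) != deMorgan(a, b, c) {
					panic("De Morgan violated")
				}
				// Count how often the flipped chain matches the negation.
				if !between(a, b, c) == flipped(a, b, c) {
					agree++
				} else {
					differ++
				}
			}
		}
	}
	fmt.Println("flipped chain matches the negation in", agree, "of", agree+differ, "cases")
}
```

One counterexample is a=0, b=0, c=1: a < b < c is false, so its negation is true, yet a >= b >= c is false because b >= c fails.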
I am not convinced this form of syntactic sugar - as appealing as it looks - is worth the cost.
Hi Robert, With monotone comparisons, I mean the ones involving <, <=, >, >= in one direction only. For example:
They are called monotone (non-)increasing / (non)-decreasing sequences of numbers. This is standard math terminology. So the following are not, for example, monotone relations:
I am updating the main proposal above to be more clear about monotonicity, and then I will have a follow up about negations.
Hi again Robert,
First, this proposal does not in any way intend to manipulate the fundamental laws of math, like DeMorgan's (they are set in stone but maybe we could propose something about QM ;P that is for another day). In fact, just like you said, people very often utilize DeMorgan laws to rightly manipulate their expressions. Negation is very common:
func Myfun1(x float64) {
	if x < L1 || x >= L2 { // check for [L1, L2)
		// report bad param
		return
	}
	...
}
This is a very typical example of early return. It is also good practice. Here the programmer wants to accept only finite x from [L1, L2). As I have written in detail above on IEEE floats, there is a very serious and infectious bug in this small piece of parameter validation. The notorious design flaw in IEEE 754, NaN, passes this test.
It is not the programmer's fault (it is an archaic hw design flaw), but her responsibility, to take great care about this, unfortunately. What she should have written is:
if !(L1 <= x && x < L2) {
which fixes the test. What we could all enjoy writing instead is:
if ! L1 <= x < L2 {
There are two things here. The first one is monotone chaining which we already talked about. For the second see this:
var x, y float64 // or int
...
if ! x < y { // not valid
	...
}
This does not compile because ! binds stronger than <. This looks unnecessary at first. Why don't you just write x >= y, right? But for IEEE floats, you have to write !(x < y) if what you really mean is x >= y.
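The two spellings agree for ordinary numbers and differ only when a NaN is involved, which a short sketch can show. The function names notLess and greaterEq are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// notLess is !(x < y): true when x >= y, and also when either operand is NaN.
func notLess(x, y float64) bool { return !(x < y) }

// greaterEq is x >= y: false whenever either operand is NaN.
func greaterEq(x, y float64) bool { return x >= y }

func main() {
	nan := math.NaN()
	fmt.Println(notLess(2, 1), greaterEq(2, 1))     // true true
	fmt.Println(notLess(1, 2), greaterEq(1, 2))     // false false
	fmt.Println(notLess(nan, 1), greaterEq(nan, 1)) // true false
}
```

So for float64 operands the choice between the two forms decides which branch a NaN takes.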
So in order to fully utilize monotone chaining, we need to adjust relative priority of ! and < <= > >= as well, honestly.
Here are two examples from Go itself:
./src/unicode/utf16/utf16.go: if surr1 <= r1 && r1 < surr2 && surr2 <= r2 && r2 < surr3 {
./src/reflect/value.go: if i < 0 || j < i || j > s.Len {
Here is how we could write them:
./src/unicode/utf16/utf16.go: if surr1 <= r1 < surr2 <= r2 < surr3 {
./src/reflect/value.go: if ! 0 <= i <= j <= s.Len {
Personally I find the latter much more readable and clear. It expresses your intent much better. I rest my case ;)
@jfcg I didn't mean to imply that your suggestion manipulates fundamental laws of math. What I said is that predicate negations using DeMorgan's rules become more complex and somewhat unintuitive.
Regarding your example with NaNs: I don't think that is a convincing example. NaNs are a design flaw (I'd agree with that wholeheartedly), and I suspect almost no numeric code is correct in the presence of NaNs. It's best to avoid them.
Quick question: if we adjust relative priority of ! and < <= > >=, do we break any existing compiling code?
Quick question: if we adjust relative priority of ! and < <= > >=, do we break any existing compiling code?
Yes. That would be a silent change in the behavior of existing code. We can't do that.
Yes, but I meant is there a specific case? Could you give an example?
Oh, sorry, now I see what you mean. I think you're right: I can't think of any way to use ! with a comparison operator directly today. I think we could in principle permit ! a < b to mean ! (a < b). Although it would mess up the grammar quite a bit.
Just brainstorming :P Operator precedence in Go is:
unary operators
* / % << >> & &^
+ - | ^
== != < <= > >=
&&
||
The ! operator can act on bool values only. It cannot interact with arithmetic and bitwise operators directly.
Only the last 3 lines of operators above produce bool values. Also, comparison operators actually fall into two classes:
!= == are applicable to all comparable types
< <= > >= are applicable to only ordered types
What do you think of the following operator precedence?
unary operators except !
* / % << >> & &^
+ - | ^
< <= > >=
!
== !=
&&
||
What I am curious about is if this could be non-breaking for existing compiling Go code.
If not what is an example that this precedence breaks? I could not come up with one.
Could we even push ! below == !=?
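As a narrow data point for that question, the sketch below compares the two readings of !a == b over all bool inputs. It only speaks to the value of this one two-operand form. It says nothing about the grammar, about longer expressions, or about whether such a change would be compatible overall. The names today and lowered are made up.

```go
package main

import "fmt"

// today is how Go currently parses !a == b: the ! applies to a alone.
func today(a, b bool) bool { return (!a) == b }

// lowered is the reading if ! bound more loosely than ==, i.e. !(a == b).
func lowered(a, b bool) bool { return !(a == b) }

func main() {
	for _, a := range []bool{false, true} {
		for _, b := range []bool{false, true} {
			fmt.Println(a, b, today(a, b), lowered(a, b))
		}
	}
}
```

For these four inputs the two columns print the same value, since negating one side of a bool equality is the same as negating the equality.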
We've discussed this quite a bit, and it seems to us that this idea, while sometimes convenient, doesn't seem to meet the "importance" criteria of a language change ("address an important issue for many people"). We're also concerned that the novel short-circuiting behavior isn't a good fit with Go. We don't want to change the operator precedence levels, which are (we hope) simple enough to remember with only five levels (and changing them would likely not be backward compatible).
For these reasons, this is a likely decline. Leaving open for a month for final comments.
I agree with @robpike that
If people would just learn to write them like this: if 'a' <= c && c <= 'z' ...
But there's another situation, somewhat related but different. I don't know if anybody else has had a problem with it, but I did. Here it is:
if Somewhere.SomeVeryLongVariable == 1 || Somewhere.SomeVeryLongVariable == 4 || Somewhere.SomeVeryLongVariable == 9 || Somewhere.SomeVeryLongVariable == 25 {
	...
}
If I were designing a language on my own, I probably wouldn't bother to add "chained intervals", but I definitely would add this:
if Somewhere.SomeVeryLongVariable == 1 || == 4 || == 9 || == 25 {
	...
}
And having "or" instead of, or as a synonym for, "||" would also help.
@latitov Your example - as you say - is unrelated to this issue. That said, the specific example you're giving would probably be written as
switch Somewhere.SomeVeryLongVariable {
case 1, 4, 9, 25:
	...
}
which is possible now and which is concise with no repetition. If the comparison is more complex, you can always introduce a temporary variable, which is one reason why we have initialization expressions in if and switch statements:
if t := Somewhere.SomeVeryLongVariable; t == 1 || t == 4 || t == 9 || t == 25 { ...
I don't see any reason why expressions should be complicated to support your suggestion given that it reduces each sub-expression from t == value to == value, i.e., it saves you from writing a t and a blank.
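Both alternatives can be written as a small runnable sketch. The functions viaSwitch and viaIf, the enabled flag, and the read callback (standing in for Somewhere.SomeVeryLongVariable) are all made up for illustration.

```go
package main

import "fmt"

// viaSwitch uses an expression switch with a case list.
func viaSwitch(v int) bool {
	switch v {
	case 1, 4, 9, 25:
		return true
	}
	return false
}

// viaIf uses an if statement with an initialization clause. The short
// name t lives only inside the if, and the test composes with other
// conditions such as the hypothetical enabled flag.
func viaIf(enabled bool, read func() int) bool {
	if t := read(); enabled && (t == 1 || t == 4 || t == 9 || t == 25) {
		return true
	}
	return false
}

func main() {
	read := func() int { return 9 } // stands in for the long variable

	fmt.Println(viaSwitch(9), viaSwitch(10))           // true false
	fmt.Println(viaIf(true, read), viaIf(false, read)) // true false
}
```

The initialization clause keeps the temporary next to its single use, so nothing has to be declared far away from the condition.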
It's easy to come up with arbitrary new syntax that simplifies a specific use case - but it's a slippery slope: typically it's not worth the extra complexity introduced into the language.
@griesemer you are right.
Specifically, you are right that this is a slippery slope, and that it's an unrelated issue. You are right there.
However, you are not right about "some specific use-case". We all come from different backgrounds. Some work in biotech, others in finance, others at Google... Every field sometimes makes what is a "rare specific use-case" a common thing. For example, I develop industrial automation software, programs that loop indefinitely 24/7/365 and control temperature, pressure, etc. In this particular field, the following is a common thing:
if SomeState == true && (...that long sequence here...) {
}
and switch/case won't work here. Well, you can nest switch/case inside if, but it will make it less readable, especially considering that the eye is already trained to see a state machine whenever there's case/switch. Of course one can create a temporary variable, and that will be a solution... in Go, because Go allows that. Some other language frameworks (specifically the industrial IEC 61131-3) don't. That's why I wanted it in the first place, why it's on my hot list. That's why when I saw this discussion, I commented. But you are right that it's unrelated, and that Go doesn't need it. But it doesn't need it not because there's case/switch, but because there's no need to define variables 10 screens up the code.
There were no further comments relevant to this issue, so closing.
Python comparisons can be chained like:
This means:
This is intuitive and easier to read. This proposal covers monotone relations only (<, <=, >, >= in one direction) like:
x < y <= z
a > b > c
It does not cover any non-monotone relations like:
p >= q < r
q < w != e
a != s == d
In order to determine places where such chained interval comparisons could be used in Go, we can use an (improved) bash script like:
On some popular projects developed in Go, we get the following examples & totals:
examples from dgraph:
448+ cases
examples from etcd:
276+ cases
examples from frp:
219+ cases
examples from gitea:
590+ cases
examples from go:
1369+ cases
examples from influxdb:
53+ cases
examples from kubernetes:
1524+ cases
examples from moby:
461+ cases
examples from nomad:
769+ cases
examples from prometheus:
550+ cases
examples from terraform:
653+ cases
As seen from 6912+ cases above, many thousands of if / case / for clauses doing interval checks can be made simpler and easier to read. Also, it has advantages for IEEE floats, see below. What do you think?