dotnet / csharplang

The official repo for the design of the C# programming language

Proposal: Ternary comparison operator #4108

Open CyrusNajmabadi opened 4 years ago

CyrusNajmabadi commented 4 years ago

Ternary comparison operator

Summary

This would allow users to write a simplified x < y < z test as shorthand for x < y && y < z.

Detailed design

This code is already parseable today and already has meaning. Indeed, here is an (albeit pathological) case where this would compile today:

using System;
public static class Program {
    public static void Main() {
        var (x, z) = (new C(), new C());
        Console.WriteLine(x < 5 < z);
    }
}

public struct C
{
    public static bool operator <(C x, C y) => true;
    public static bool operator >(C x, C y) => true;

    public static C operator <(C x, int y) => x;
    public static C operator >(C x, int y) => x;
}

In order for this type of code to compile today, you'd need to do very strange operator overloading which would not be expected in practice. However, because of back compat, we would likely have to support this.

As such, when processing a binary expression of the form expr1 op expr2 op expr3 we would have to bind in the same fashion as today. However, if that form failed to bind the operators successfully, we would now reinterpret the above as:

expr1 op1 expr2 op2 expr3, where op1 and op2 are each one of >, <, >=, <=, is reinterpreted as:

var __t1 = expr1;
var __t2 = expr2;

var r = false;
if (__t1 op1 __t2)
{
    var __t3 = expr3;
    if (__t2 op2 __t3)
        r = true;
}

This should match intuition here and would mean code executes (including order of evaluation and short-circuiting) as expected.
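As a rough illustration of the evaluation order and short-circuiting described above, the rewrite for expr1 < expr2 < expr3 can be hand-written today. This is only a sketch, not part of the proposal: the names ChainedComparisonSketch, Operand, Trace and Less3 are hypothetical, the operands are fixed to int, and delegates stand in for the operand expressions so that their evaluation can be observed.

```csharp
using System;
using System.Collections.Generic;

public static class ChainedComparisonSketch
{
    // Hypothetical helper state: records which operands were evaluated, in order.
    public static readonly List<string> Trace = new List<string>();

    // Hypothetical helper: logs the operand name, then yields its value.
    public static int Operand(string name, int value)
    {
        Trace.Add(name);
        return value;
    }

    // Hand-written equivalent of the proposed rewrite for `expr1 < expr2 < expr3`.
    public static bool Less3(Func<int> expr1, Func<int> expr2, Func<int> expr3)
    {
        var t1 = expr1();
        var t2 = expr2();       // evaluated once, reused by both comparisons

        var r = false;
        if (t1 < t2)
        {
            var t3 = expr3();   // only evaluated when the first comparison holds
            if (t2 < t3)
                r = true;
        }
        return r;
    }
}
```

For example, Less3(() => Operand("x", 7), () => Operand("y", 5), () => Operand("z", 9)) returns false and records only x and y in Trace, because the first comparison fails and the third operand is never evaluated.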

Drawbacks

Potential confusion over a piece of code potentially having two different meanings. However, this is very unlikely, as no real codebases should ever have been using a < b < c. It just isn't a reasonable code pattern today, so no one uses it or expects it to work the way it does today. Practically all users looking at this would expect it to have the semantics this proposal is suggesting.

Design meetings

HaloFour commented 4 years ago

It was noted that there is a NuGet package which uses an extension method and some overloaded operator chicanery to mimic this language feature: https://github.com/dotnet/roslyn/issues/136#issuecomment-71959890

However, if the compiler will only treat this expression as a range comparison if it wouldn't otherwise compile then I would imagine any existing code that might rely on this package would continue to compile and work as expected. But it should probably be included as a part of the test suite to ensure no unintended breaks.

CyrusNajmabadi commented 4 years ago

Good to know. This would fall under the portion of the design that says: we will try existing semantics and always interpret that way if it succeeds. We only use the new semantics if it doesn't.

That example is also interesting in that it's trying to provide these semantics. So a great part about this is that once we have this feature, you can stop using that lib :)

YairHalberstadt commented 4 years ago

we would have to bind in the same fashion as today.

How would this be defined to avoid making further changes to binding overly complex?

For example, if C# increases the number of places implicit operators can be used in the future, or changes lookup, or improves overload resolution, we would need to make sure all those things run only after we check if it matches this new ternary comparison operator. This might force the binding code to become very contorted to allow this.

CyrusNajmabadi commented 4 years ago

honestly, i would say: given a tree of the form

      op1
     /  \
expr1    op2
        /  \
   expr2    expr3

We have the existing rules. We then have a clause that says that if that produces an error (which we use terminology for in lambda-resolution) then rebind with such-and-such expected semantic rewrite. In general we discuss validity, and binding-time errors. This would apply here. If the normal interpretation results in an invalid (or binding-time-error), we try the new interpretation.

I'm not sure any of the cases mentioned so far necessarily matter. Or, if they do, they are starting to get to the cornerest corner of all corners :)

alrz commented 4 years ago

Linking back to my comment here https://github.com/dotnet/csharplang/discussions/4106#discussioncomment-122571

Since you're going to deal with some semantic ambiguities anyways, the postfix pattern does seem to be a viable alternative, except that there will be syntax ambiguities instead (but not in any useful scenarios mentioned here).

Opened a new discussion: https://github.com/dotnet/csharplang/discussions/4110

LRC-H commented 4 years ago

Very good

huoyaoyuan commented 4 years ago

Can there be a warning wave when binding to the old behavior?

CyrusNajmabadi commented 4 years ago

Can there be a warning wave when binding to the old behavior?

I don't see any point in doing that

hez2010 commented 4 years ago

There may be some ambiguities that need to be resolved, such as A < B < C > D > F.

AdamSpeight2008 commented 3 years ago

Wouldn't this potentially break compatibility? Those operators are not guaranteed to represent comparisons. The user is free to give them different semantics.

AdamSpeight2008 commented 3 years ago

This would have to be under a feature flag, so that the change in semantics doesn't change errors in previous versions.

It can work just via the binder and lowering. E.g. a < b <= c requires all the operands a, b, c to be the same type, that that type have an operator < equivalent to Func<T, T, Bool>, and that that type have an operator <= equivalent to Func<T, T, Bool>. Then in lowering the compiler generates ((a < b) && (b <= c)).

It was noted that there is a NuGet package which uses an extension method and some overloaded operator chicanery to mimic this language feature: dotnet/roslyn#136 (comment)

I wrote that nuget package. Note that it uses IComparable(Of T) as the check to see if the type has "comparison operators". It also introduces 2 new class types, _0(Of T) (the lifted type) and _1(Of T), to provide access to the second "comparison operator".

canton7 commented 3 years ago

Compatibility concerns are fully addressed in the OP, no?

CyrusNajmabadi commented 3 years ago

Wouldn't this potentially break compatibility?

No. The specification can effectively be read to say:

If this has legal semantics under C# 9, then use those semantics. Otherwise, try the new semantics. So nothing can break.

AdamSpeight2008 commented 3 years ago

OP doesn't mention if a b c can or can not be different types. My addition to the proposal, restricting the operands to only the same type, focuses it on the conventional comparison operations: Func<Func<T, T, Bool>, T, Bool>, which is currently an error.

CyrusNajmabadi commented 3 years ago

I wrote that nuget package.

Your package will retain its semantics.

CyrusNajmabadi commented 3 years ago

OP doesn't mention if a b c can or can not be different types.

It can operate on different types as per the rewrite rule i've specified. The rule is agnostic to that.

AdamSpeight2008 commented 3 years ago

@CyrusNajmabadi Are you restricting to the "comparison" operator only if it returns boolean? Because if it wasn't, it would cause that nuget package to have incorrect semantics.

lowerValue.__() <= value <= upper
' types produced during evaluation.
T.__()  --> _0(Of T) ' The "Lifted" type
_0(Of T) <= T --> _1(Of T)  ' provides the first operators (<, <=)
_1(Of T) <= T --> Boolean   ' provides the second operators (<, <=)

CyrusNajmabadi commented 3 years ago

@CyrusNajmabadi Are you restricting to the "comparison" operator only if it returns boolean?

No

CyrusNajmabadi commented 3 years ago

Because if it wasn't, it would cause that nuget package to have incorrect semantics.

Any existing semantics would be preserved.

the general intuition for the algorithm is:

  1. try existing C# 9.0 semantics. If that succeeds, those are the semantics to use.
  2. otherwise, if that fails, interpret a < b < c < d < e as a < b && b < c && c < d && d < e and try to bind. if that succeeds, emit in that fashion.
  3. otherwise, fail.
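The runtime behavior of step 2 can be sketched with an ordinary helper. This is only an approximation under stated assumptions: the names ChainSketch and AllLess are hypothetical, IComparable<T> stands in for the operator <, all operands share one type T (the proposal itself does not require that), and delegates stand in for the operand expressions. It does not model step 1, which is purely a binding-time decision.

```csharp
using System;

public static class ChainSketch
{
    // Hypothetical helper: true when every adjacent pair satisfies `left < right`,
    // mirroring a < b && b < c && c < d ... with each operand evaluated at most once.
    public static bool AllLess<T>(params Func<T>[] operands) where T : IComparable<T>
    {
        if (operands.Length < 2)
            throw new ArgumentException("A chain needs at least two operands.", nameof(operands));

        T left = operands[0]();
        for (int i = 1; i < operands.Length; i++)
        {
            T right = operands[i]();
            if (left.CompareTo(right) >= 0)
                return false;   // short-circuit: later operands are never evaluated
            left = right;
        }
        return true;
    }
}
```

For example, ChainSketch.AllLess<int>(() => 1, () => 2, () => 3) returns true, while ChainSketch.AllLess<int>(() => 1, () => 3, () => 2) returns false.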

NN--- commented 3 years ago

With C# 9 it can be written a is > 5 and < 10, which doesn't have ambiguity and allows mixing and matching comparisons.

FaustVX commented 3 years ago

@NN---

With C# 9 it can be written a is > 5 and < 10, which doesn't have ambiguity and allows mixing and matching comparisons.

No, because pattern matching works only with constants, whereas with this proposal it could work with any value (variables, properties, …)
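To make the difference concrete, here is a small sketch (the names RangeCheckSketch, InFixedRange and InRange are hypothetical). The relational pattern form compiles only when the bounds are constants. With runtime bounds, the comparison has to be spelled out with &&, which is the form this proposal would shorten to lo < a < hi.

```csharp
public static class RangeCheckSketch
{
    // C# 9 relational patterns: fine here because 5 and 10 are constants.
    public static bool InFixedRange(int a) => a is > 5 and < 10;

    // `a is > lo and < hi` does not compile, since pattern operands must be constants,
    // so with runtime bounds the check is written out today as:
    public static bool InRange(int a, int lo, int hi) => lo < a && a < hi;
}
```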

Daynvheur commented 3 years ago

Little comment here (thanks to @alrz for the link) as I've opened a very similar discussion at #4980, but with an "or" comparison. Thus, question: would this ternary comparison only work in "and" mode? If not, how would you express A < C [or] A > B?

Xyncgas commented 2 years ago

Potential confusion over a piece of code potentially having two different meanings.

We have <= and => today, tell me they are the same thing.

Korporal commented 1 year ago

There may be some ambiguities that need to be resolved, such as A < B < C > D > F.

What ambiguity is in that fragment? Creating the parse tree for that is not ambiguous so far as I can see.

CyrusNajmabadi commented 1 year ago

It would need appropriate lookahead to ensure that generics, variables, and comparisons were properly handled. It's likely not a full ambiguity, but a local one.

333fred commented 1 year ago

What ambiguity is in that fragment?

Is it A<B<C>D> F (a declaration of a variable F with type A<B<C>D> and a syntax error on a missing ,) or a chained comparison operator?

Korporal commented 1 year ago

What ambiguity is in that fragment?

Is it A<B<C>D> F (a declaration of a variable F with type A<B<C>D> and a syntax error on a missing ,) or a chained comparison operator?

Yes I see that when there's no context, but in the case of an if statement we have the context, the grammar rules of the if.

A declaration isn't legal (well I can't see it myself) inside the expression that's part of an if. The parser would only encounter this when parsing a conditional expression. C# already correctly distinguishes between generics and expressions.

C# seems to already parse this correctly too:

int A = 0;
int B = 0;
int C = 0;
int D = 0;
int F = 0;

if (A<B<C>D>F)
{
    ;
}

The diagnostic it reports is not about syntax but a semantic problem, the types that are used either side of the operator <.

This too suggests that if the semantic checking were replaced, the expression interpreted as a chained compare, it would work.

FaustVX commented 1 year ago

There may be some ambiguities that need to be resolved, such as A < B < C > D > F.

As far as I can test, the compiler is already happy with that syntax sharplab.io

CyrusNajmabadi commented 1 year ago

Yes I see that when there's no context, but in the case of an if statement we have the context, the grammar rules of the if.

Yes. This is an example of us addressing the potential issue. The situations brought up are there so we ensure we think through it and make sure the language is spec'ed such that it is not an issue and so that the compiler operates as expected with appropriate tests.

theking2 commented 1 year ago

I really don't understand this. There is also ambiguity with another ternary operator ?:. It was addressed there, so why is op cmp op cmp op different? Clearly there are ambiguities but with ?: we seem perfectly fine with those.

CyrusNajmabadi commented 1 year ago

I really don't understand this. There is also ambiguity with another ternary operator ?:

What ambiguity are you referring to?

Clearly there are ambiguities but with ?: we seem perfectly fine with those.

No one said we are not fine with ambiguities... As i said in the post above yours, we would just need to be cognizant so we can spec the language out properly, and ensure the compiler does the right thing.

theking2 commented 1 year ago

What does a? b? c: d: do, and what does a < b < c < d do? Both are ambiguous. () is needed to fix this.

CyrusNajmabadi commented 1 year ago

a? b? c: d:

The code is incomplete and will be a failure to parse. Did you miss an e at the end? I'm not sure if there's an accidental omission or a purposeful one.

Assuming it's accidental, and you meant: a? b? c: d: e, then there is no ambiguity here (syntactic or otherwise).
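A minimal sketch of how the completed form groups today (the names NestedConditionalSketch and Pick are hypothetical, and string literals stand in for c, d and e): the conditional operator is right-associative, so the inner b ? c : d becomes the "true" branch of the outer conditional.

```csharp
public static class NestedConditionalSketch
{
    // Parses as a ? (b ? "c" : "d") : "e".
    public static string Pick(bool a, bool b) => a ? b ? "c" : "d" : "e";
}
```

Pick(true, true) yields "c", Pick(true, false) yields "d", and Pick(false, ...) yields "e" regardless of b.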

Can you please clarify?

CyrusNajmabadi commented 1 year ago

and what does a < b < c < d do

Currently it is syntactically legal, and may have semantic meaning in esoteric cases. This proposal discusses giving it legal meaning in the case where it is illegal today.

Daynvheur commented 1 year ago

Random concerns:

CyrusNajmabadi commented 1 year ago

What would be the interest (if any) of a non- short-circuited version of this? (Making sure that a < b < c < d < e do evaluate the e part.)

My thought here is that this is shorthand for a < x && x < b. So it should have the same short-circuiting behavior.

Syntactic analysis simplifications: a < 3 < 1 could be simplified as a < 1? (a required to be less than 3 and less than 1 is satisfied with just a required to be less than 1.)

Looks like the space for an optimization pass, or an analyzer message. Not really in scope for the language itself.

Warn about impossible conditions:

That seems reasonable. We already do constant analysis and requisite flow control for similar things today:

image

So this could similarly evaluate to always being false (if 'a' is an integer of course).

MgSam commented 5 months ago

C++ is also looking to add this, albeit while disallowing the potentially ambiguous forms like a < b > c. I think if C# adds this the ambiguous forms should also be disallowed; they add no benefit.