dotnet / vblang

The home for design of the Visual Basic .NET programming language and runtime library.

Pipe-Forward Operator #165

Open AnthonyDGreen opened 6 years ago

AnthonyDGreen commented 6 years ago

This proposal addresses Scenario #154 and partially addresses Scenario #164.

Summary

I propose adding a new operator to VB that passes its left operand as either the first argument or first operand of its right operand. This is similar to F#'s pipe-forward operator but different in that it passes its operand as the first argument rather than as the "last" as in F#. This difference both eliminates the requirement for function currying and is generally more applicable to how most .NET APIs are designed. In this way the -> operator behaves similarly to an aggregate function in a query Into clause.

PipeExpression
    : Term LineTerminator? '->' LineTerminator? PipeTarget
    ;

PipeTarget
    : Name ArgumentList
    | ParenthesizedExpression '.' LineTerminator? SimpleName ArgumentList
    | ( 'CType' | 'DirectCast' | 'TryCast' ) '(' LineTerminator? Type LineTerminator? ')'
    | CastTarget '(' ')'
    | 'If' '(' LineTerminator? Expression ',' LineTerminator? Expression ( ',' LineTerminator? Expression )? ')'
    | 'Await'
    | 'Not'
    | ( '.' | '!' | '?' | '...' | '.!' ) 
    ;

ArgumentList
    : '(' LineTerminator? Arguments? LineTerminator? ')'
    ;

CastTarget
    : 'CBool' | 'CByte' | 'CChar'  | 'CDate'  | 'CDec' | 'CDbl' | 'CInt'
    | 'CLng'  | 'CObj'  | 'CSByte' | 'CShort' | 'CSng' | 'CStr' | 'CUInt'
    | 'CULng' | 'CUShort'
    ;
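
As a rough illustration of the first-argument rule described in the summary, the sketch below shows what two piped expressions would correspond to in today's VB. The piped forms appear only in comments because the operator is not implemented; the variable names are made up for this example.

```vb
Imports System
Imports System.IO

Module PipeFirstArgumentSketch
    Sub Main()
        Dim filename = "notes.txt"

        ' Proposed:  filename -> Path.GetExtension()
        ' Today:     the left operand is written as the first argument.
        Dim ext = Path.GetExtension(filename)

        ' Proposed:  ext -> String.Equals(".txt", StringComparison.OrdinalIgnoreCase)
        ' Today:     the remaining arguments keep their order after the piped value.
        Dim isText = String.Equals(ext, ".txt", StringComparison.OrdinalIgnoreCase)

        Console.WriteLine(ext)
        Console.WriteLine(isText)
    End Sub
End Module
```

Because the piped value always lands in the first position, no currying or partial application is needed to make the call bind.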

Motivation

There are three primary benefits to this feature:

? (Await (Await obj.MAsync()).NAsync()).ToString().Trim().Length
'  5      3     1   2         4         6          7      8
' ^ Execution order ^

? obj.MAsync() -> Await ->
     .NAsync() -> Await ->
     .ToString().Trim().Length
' 1   2           3
'     4           5
'     6           7     8
' ^ Execution order ^

' From Roslyn source.
If Char.IsLetter(text(i)) Then
If text(i) -> Char.IsLetter() Then

If String.IsNullOrEmpty(args(0)) Then
If args(0) -> String.IsNullOrEmpty() Then

obj -> CallByName("Foo")
? str -> Trim()
For i = 0 To arr -> UBound()
? filename -> Path.GetExtension()

' From Roslyn source.
Dim accessor = TryCast(TryCast(getMethod.DeclaringSyntaxReferences(0).GetSyntax(cancellationToken),
                               AccessorStatementSyntax)?.Parent,
                       AccessorBlockSyntax)

Dim accessor = getMethod.DeclaringSyntaxReferences(0).GetSyntax(cancellationToken) 
               -> TryCast(AccessorStatementSyntax) -> ?.Parent 
               -> TryCast(AccessorBlockSyntax)

' From Roslyn source.
Return XElement.Parse(PdbValidation.GetPdbXml(compilation, qualifiedMethodName:=methodName))
Return compilation -> PdbValidation.GetPdbXml(qualifiedMethodName:=methodName) -> XElement.Parse()

' From Roslyn source.
ilImage = ImmutableArray.Create(File.ReadAllBytes(reference.Path))
ilImage = reference.Path -> File.ReadAllBytes() -> ImmutableArray.Create()

For Each item In obj -> ReflectionHelpers.GetPublicProperties()
    ...
Next

Dim remove = Function(s As String, value As String) s.Replace(value, "")

? someString -> remove(badWords) 

Detailed Design

The design for the built-in operators is straight-forward. The design for method invocations requires more explanation.

Precedence and Associativity

The big challenge with this feature is parsing precedence.

It's pretty obvious to which method obj is being passed in this example:

obj -> obj.A() 

It's also obvious to which method obj is being passed in this example:

obj -> obj.A.B.C.D() 

However, it's less immediately obvious to which method obj is being passed in this example:

obj -> obj.A().B.C.D() 

' Is it this:
(obj -> obj.A()).B.C.D()

' Or this:
obj -> (obj.A().B.C.D())

And if the default meaning is the latter, then potentially significant backtracking is required to express the former.

There's also an issue here of tooling: what completion options should appear, and how should the final expression be formatted to indicate precedence?

To solve these questions this proposal gives -> a higher precedence for argument lists than . and also requires an explicit -> to transition back to normal associativity. This means if the target of the pipe is itself the result of a complex expression it's necessary to wrap that expression in parentheses:

' This isn't valid.
obj -> obj.A().B.C.D() 

' This is valid.
obj -> (obj.A().B.C).D()

This still requires parentheses, but it keeps them after the -> operator instead of forcing you to backtrack to before the left operand of the ->. It is also easy to return to normal . precedence without any parentheses:

? obj.M() -> N() -> P() -> .ToString().Trim().Length

As a consequence, however, piping into a method invocation with an implicit receiver requires parentheses:

With someLongExpression
    ? node -> (.Foo.Bar).Baz()
End With
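
To make the precedence rules above concrete, here is a small sketch of how a chain such as `x -> N() -> P() -> .ToString().Trim().Length` might read once desugared into today's VB. `N`, `P`, and `start` are hypothetical names defined only for this illustration, and the desugaring reflects one reading of the proposal rather than a confirmed compiler behaviour.

```vb
Imports System

Module PipePrecedenceSketch
    ' N and P are hypothetical helpers, defined only for this illustration.
    Function N(value As Integer) As Integer
        Return value + 1
    End Function

    Function P(value As Integer) As Integer
        Return value * 2
    End Function

    Sub Main()
        Dim start = 20

        ' Proposed:  start -> N() -> P() -> .ToString().Trim().Length
        ' Each -> binds only to the invocation that directly follows it, and
        ' the final "-> ." hands the result back to ordinary member access.
        Dim length = P(N(start)).ToString().Trim().Length

        Console.WriteLine(length)
    End Sub
End Module
```

Read left to right, the piped form lists the calls in execution order, while the desugared form nests them inside out.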

Type-Inference and Overload Resolution

Overload resolution rules shouldn't change. It's an open question whether type-inference should work like extension method reduction where certain type arguments are fixed. This would give an experience consistent with extension methods but would suffer the same problems when constraints block inference entirely (unless we do something smart here).

Drawbacks

It's a new symbolic operator. Its meaning may not be immediately intuitive to a first-time reader.

Alternatives

Unresolved Questions

bandleader commented 6 years ago

dotnet/csharplang#74 (I read recently that things common to both languages could/should be discussed on csharplang)

KathleenDollard commented 6 years ago

There is a deeper conversation on pipe forwarding in dotnet/csharplang#74 @bandleader mentioned. Just because more non-MS people are there, that's probably a good place to shake down ideas.

@AnthonyDGreen I'm not clear on some of your pipe targets in the proposal. I think this is going to need boatloads of examples.

I think I like a pipe looking operator like |>, but I think that is minutiae.

I understand the relation with currying, but I'm not sure dropping the argument is the right fit for VB programmers. A placeholder allows the method call to still look the same, but says "hey, put the value right here". This was also suggested in the C# thread, but as I looked at some of the samples, I was a bit concerned about having a bunch of placeholders (like @). Our brains are trained to think the same visual thing (a symbol) means the same thing, and here it definitely does not (except in the abstract: what the last thing had).

Await feels like a core scenario for this proposal, and the approach you took ( -> Await ->) is different than a suggestion in the C# thread ( |> await). I don't yet have a firm opinion on this. I think I like Await first, but I'm not yet keen on -> Await ->, because I'm not sure people think of Await as something you pipe into, and that's what it looks like.

What do each of these do: 'Not' | ( '.' | '!' | '?' | '...' | '.!' )

I think I particularly like the Cast operator.

Dim i As Long
Dim j As Integer
j = i -> Integer ' Yep, blow up if it's too big

rskar-git commented 6 years ago

For the example where:

 ? (Await (Await obj.MAsync()).NAsync()).ToString().Trim().Length

is translated to this:

 ? obj.MAsync() -> Await ->
      .NAsync() -> Await ->
      .ToString().Trim().Length

could the new syntax instead be something like this?:

 ? Await obj.MAsync() -> 
      Await .NAsync() -> 
      .ToString().Trim().Length

(Taking cues from https://github.com/tc39/proposal-pipeline-operator.)

Also, can you elaborate on why -> is preferred to |>?

KathleenDollard commented 6 years ago

I like the prefixed Await in my VB head (the second one).

I like |> because it's a pipe. Is how deeply ingrained that is an indicator that I'm old? It also somehow looks like "take all this" (the pipe) "and stick it in" (the greater than) the next thing. Dunno, maybe that's just in my head, and it's not as important as the feature.

bandleader commented 6 years ago

@KathleenDollard I had posted some feedback about this on @AnthonyDGreen 's scenario issue #154, including 1) why arrows like -> and |> requiring whitespace might not be a good choice 2) a selection of things that can be passed to the operator. Kindly see my feedback there: #154

ocdtrekkie commented 6 years ago

If Char.IsLetter(text(i)) Then

If text(i) -> Char.IsLetter() Then

In this example, I can say this seems downright confusing to me, readability-wise. It looks like IsLetter requires no arguments. It is a symbol I now have to learn to parse. I'd say almost all of your examples appear to need a lot more decoding. I'd have to stare at them a while to see how they might translate to their traditional equivalents, and that's with those equivalents being right in front of me.

The example above that one is far more confusing. At least I kind of understand the "slide text(i) into the first argument of IsLetter()" in this example.

There's a strong likelihood I would look at this code, and then look somewhere else for easier code to read.

Is there anything that could be done with a word-based operator instead which might be easier to parse?

bandleader commented 6 years ago

@rskar-git @KathleenDollard

expr -> .ToString() isn't piping

The expression fragment Await obj.MAsync -> .ToString() strikes me as wrong. You aren't piping the result of Await obj.MAsync into .ToString at all (nor do you need to). What you really mean is to continue the expression without having to backtrack: (Await obj.MAsync).ToString() -- which is not what the piping operator does.

What you're asking for here is basically a "parenthesize everything on the left" unary operator. (And it applies with any operator on the right, not just 'dot': Await GetSomeIntAsync() |> + 5) It's not a bad idea, but this isn't piping.

So how should we perform Await inline without backtracking?

IMHO the problem is much more cleanly solved as @AnthonyDGreen and I proposed, with the pipe as a short operator (highest precedence): (Note: I am using .. as the pipe operator here, as IMHO it makes more sense than the whitespace-surrounded ->, if it is to be a high-precedence operator. Another alternative could be .> See #154.)

obj.MAsync()..Await.NAsync()..Await.ToString().Trim()
'Or even format Await() as a virtual function, similar to CType(), DirectCast(), and If().
'More readable IMHO and still understandable: (see #116)
obj.MAsync()..Await().NAsync()..Await().ToString().Trim()

The operator becomes a sort of "dot with superpowers" -- because if you think about it, . is already a pipe operator, in the sense that it pipes your expression into one of its member methods -- but with our .. (or similar) operator, we extend that to let you call any function* -- in left-to-right execution order, which is the whole point of piping.

* as well as pass it to Await or a type cast

bandleader commented 6 years ago

Just checked the C# issue. They are proposing to allow 'expression continuation without backtracking' by piping to a new expression, with a @ where the piped value should go:

obj.MAsync() -> Await -> @.ToString().Trim()
'And in the example Integer case:
Await GetSomeIntAsync() -> @ + 5

Not sure if I like it. IMHO my syntax solves this much more cleanly, and in perfect left-to-right order: GetSomeIntAsync()..Await + 5 (Not to mention that it's probably much easier to parse)

lds0m01 commented 5 years ago

I like the piping idea but I think it would be more in keeping with VB if it were an extension to member selection say ".>" and also ".>?" with no surrounding white space. In VB "x op y" is interpreted as "op(x,y)". That’s not what's happening here. I would also require ".<" as a placeholder for the piped value. Requiring the placeholder will provide the following:

• It would be confusing to see calls to functions without their required parameters.
• VB help could be extended to provide help for the placeholder symbol.
• It would be easier to report syntax errors since there would be a place to hang them.
• The piped value could be used with any parameter not just the first. It could be used with named ones.
• The syntax rules could be simplified to requiring the placeholder symbol after a pipe symbol and before the next one if any.
• The VB compiler could automatically insert the placeholder if not typed. But this would make the rules about where and when more complicated.

hartmair commented 5 years ago

I don't see any benefit in the "pipe" operator, only decreased intuitive readability.

Extension Methods FTW

Extension method syntax is IMHO a far better approach. Applying it to the examples above:

Increase straight-forwardness by allowing lexical ordering to reflect execution order.

This example is just about the Await keyword. This can be easily fixed with a pseudo-method .Await()

obj.MAsync().Await().NAsync().Await().ToString().Trim().Length

Increase readability by allowing code to follow a natural Subject-Verb-Object order (when appropriate).

This is just because Char.IsLetter should have been an extension method CharExtensions.IsLetter in the first place. Similarly for the other lines in this example

If text(i).IsLetter() Then
If args(0).IsNullOrEmpty() Then
obj.CallByName("Foo")
? str.Trim()
For i = 0 To arr.UBound()
? filename.GetExtension()

Reduce the need for backtracking while typing code.

This is about the TryCast operator as extension method CastingExtensions.TryCast(Of T)

Dim accessor = getMethod.DeclaringSyntaxReferences(0).
  GetSyntax(cancellationToken).
  TryCast(Of AccessorStatementSyntax)?.
  Parent.
  TryCast(Of AccessorBlockSyntax)
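
For reference, something close to this cast-as-extension idea can be sketched in today's VB. The helper name `TryCastTo` is hypothetical (it is not part of the .NET libraries), and the sample values are made up. Note that VB does not bind extension methods on a receiver whose static type is Object, so the receiver below is typed as String.

```vb
Imports System
Imports System.Runtime.CompilerServices

Module CastingExtensions
    ' Hypothetical helper; TryCastTo is not part of the .NET libraries.
    <Extension>
    Function TryCastTo(Of T As Class)(value As Object) As T
        ' TryCast yields Nothing instead of throwing when the cast fails.
        Return TryCast(value, T)
    End Function
End Module

Module Program
    Sub Main()
        Dim word As String = "abc"

        ' String implements IComparable, so this cast succeeds.
        Dim comparable = word.TryCastTo(Of IComparable)()

        ' String is not an Exception, so this cast yields Nothing.
        Dim failure = word.TryCastTo(Of Exception)()

        Console.WriteLine(comparable IsNot Nothing)
        Console.WriteLine(failure Is Nothing)
    End Sub
End Module
```

The result can be followed by ?. in the same way as the chained example above.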

Reduce the need for deeply nested invocations.

Again, extension methods are a perfect fit here:

Return compilation.GetPdbXml(qualifiedMethodName:=methodName).Parse()

ilImage = ImmutableArray.Create(File.ReadAllBytes(reference.Path))

This one is interesting as extension methods don't play well with all method invocations here. I would prefer it like this:

ilImage = File.ReadAllBytes(reference.Path).ToImmutableArray()

Enable calling extension methods on Object with a fluent style:

This is again a perfect match for extension methods changing ReflectionHelpers to ReflectionExtensions instead:

Imports ReflectionExtensions

For Each item In obj.GetPublicProperties()
    ...
Next

Enable calling lambdas (which can't be extension methods) with a fluent style

It already says in the description that extension methods are preferred. Why not allow lambdas to be extension Methods, i.e. scoped extension methods? This would be a separate discussion on how to do this. Just one quick example:

<Extension>
Dim remove = Function(s As String, value As String) s.Replace(value, "")
? someString.remove(badWords)

bandleader commented 5 years ago

@lds0m01 @hartmair I believe you are basically proposing what I suggested here (and also referenced in my comment above.)

I was proposing a .. (double-dot) operator that would let you use any method (or lambda) as an extension method, thus satisfying the motivation for pipe expressions (which is maintaining left-to-right order and removing the need to backtrack and add parentheses, see #165), while improving readability, and more.

I also proposed the ability to use ..Await as well as casts: ..DirectCast(Integer).

@KathleenDollard Would be happy to hear about feedback from the LDT.

hartmair commented 5 years ago

@bandleader Allowing any method regardless of ExtensionAttribute may lead to strange syntax: for example, there are a lot of static methods whose first parameter is of type String that don't fit into fluent-style reading (at least they don't to me)

File.Decrypt(Path.Combine(value, otherValue))
value..Path.Combine(otherValue)..File.Decrypt()

or even

value..Combine(otherValue)..Decrypt()

I don't see any real benefit here, only confusion and far less intuitive readability. If the .NET Framework lacks extension methods, then just adding/allowing the ExtensionAttribute in the right places would be the better choice I think. (Still, the .. operator would be a great choice for C# where magic symbols are usual)

bandleader commented 5 years ago

No feature is intended to be used in 100% of cases. If the feature didn't work with dotted methods, that wouldn't be a reason to shoot it down for the other 95% of cases (instance methods, local static methods, lambdas, etc.)

However, I think the syntax could be adapted for use with dotted methods, using parentheses:

File.Decrypt(Path.Combine(value, otherValue)) ' becomes:
value..(Path.Combine(otherValue))..(File.Decrypt) 

That said, I don't think this example is a good candidate for piping in any case; as you indicated, it's clearer without piping. Nevertheless, people who do lots of complex operations using function composition find this feature essential. Giving examples is beyond the scope of the issue here, but feel free to consult docs for languages where this already exists.