dotnet / vblang

The home for design of the Visual Basic .NET programming language and runtime library.

Composable types #47

Open zspitz opened 7 years ago

zspitz commented 7 years ago

I propose that type names can be replaced with composed types, in either of the following forms:

<Type> Or <Type> (union type)
<Type> And <Type> (intersection type)

Use cases:

Multitype checking

This syntax allows for checking either of two types (Or), or both of two types (And). Requires no changes to the TypeOf <x> Is <Type> statement, only an expansion of <Type>.

Instead of:

If TypeOf x Is String Or TypeOf x Is IEnumerable(Of Integer) Then

Use:

If TypeOf x Is String Or IEnumerable(Of Integer) Then

Instead of:

If TypeOf x Is MyClass And TypeOf x Is MyInterface Then

Use:

If TypeOf x Is MyClass And MyInterface Then

Parentheses are not required, because the compiler can differentiate between <Type> Or <Type> and <Boolean> Or <Boolean>; but it's important that parentheses be allowed, to improve readability when needed:

Dim flag = True
If (TypeOf o Is String Or Number) Or flag Then
End If

Use members from both type parts (intersection type only)

Define parameters / variables as an intersection of a class and an interface; allows using members from both without having to cast:

Class MyClass
    Public I As Integer
End Class
Interface MyInterface
    Property J As Integer
End Interface

Dim x As MyClass And MyInterface
Console.WriteLine(x.I)
Console.WriteLine(x.J)
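
For comparison, TypeScript already has intersection types, so the idea can be sketched in runnable form there. This is only an illustration under stated assumptions: the names MyBase, Derived and sumMembers are hypothetical, and TypeScript checks types structurally, whereas a VB.NET/CLR design would presumably be nominal.

```typescript
// Hypothetical names, loosely mirroring the VB example above.
class MyBase {
    i = 1;
}

interface MyInterface {
    j: number;
}

class Derived extends MyBase implements MyInterface {
    j = 2;
}

// `MyBase & MyInterface` is an intersection type: the argument must satisfy
// both parts, so members of both are usable without a cast.
function sumMembers(x: MyBase & MyInterface): number {
    return x.i + x.j;
}

const total = sumMembers(new Derived());
```

Passing a plain MyBase instance to sumMembers is rejected at compile time, because it lacks the j member required by MyInterface.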

Consolidate function overloads (union types only)

Reduce the boilerplate function overloads needed for multiple types. Instead of this:

Sub PrintStrings(toPrint As IEnumerable(Of String))
    For Each s In toPrint
        Console.WriteLine(s)
    Next
End Sub
Sub PrintStrings(toPrint As String)
    PrintStrings({toPrint})
End Sub

Use this:

Sub PrintStrings(toPrint As IEnumerable(Of String) Or String)
    If TypeOf toPrint Is String Then toPrint = {toPrint}
    For Each s In toPrint
        Console.WriteLine(s)
    Next
End Sub
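
A rough TypeScript equivalent shows the same consolidation with a union-typed parameter. It is a sketch rather than a statement about how VB.NET would implement this; the function name formatStrings is made up, and it returns the joined text instead of printing so that the result is easy to inspect.

```typescript
// One signature accepts either a single string or a list of strings,
// replacing the two overloads.
function formatStrings(toPrint: string | string[]): string {
    // The typeof check narrows the union: string in the first arm,
    // string[] in the second.
    const items: string[] = typeof toPrint === "string" ? [toPrint] : toPrint;
    return items.join("\n");
}

const single = formatStrings("only one");
const several = formatStrings(["first", "second"]);
```

Any other argument type, such as a number, is a compile-time error, which is the guarantee that a parameter typed as Object gives up.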

Type aliases

VB.NET already has a syntax for type aliasing:

Imports MyType = Namespace.Type

This should support using composable types:

Imports MyType = Namespace.Type1 Or Namespace.Type2
Imports MyType1 = Namespace.Class1 And Namespace.Interface1

Potential Issues

  1. Some static flow analysis would make union types much more useful. The flow analysis would limit the type within a type check, or after an assignment (this is really just an extension of #172):

    Sub PrintStrings(toPrint As IEnumerable(Of String) Or String)
        If TypeOf toPrint Is String Then
            'The compiler should be aware that at this point the object pointed to by toPrint must be of type String
            toPrint = {toPrint} 'Even though the value of toPrint is of type String, the assignment to the toPrint variable is still allowed
            'and from here on, toPrint is IEnumerable(Of String)
        End If
    
        For Each s In toPrint
            Console.WriteLine(s)
        Next
    End Sub
  2. There already exists a syntax for intersection types, when used with generic constraints:

    Public Class MyClass(Of T As {IComparable, IDisposable})
    End Class

    However, I don't see how it can be extended for union types as well.

    OTOH, I don't see any reason not to allow composite types in generic constraints:

    Public Class MyClass(Of T As IComparable And IDisposable)
    End Class
    Public Class MyClass(Of T As IComparable Or List(Of Integer))
    End Class

  3. Is there CLR support for union and intersection types? (See F#.)

  4. How could such members / classes be represented for compatibility with other languages? (Also see F#.)

  5. How would reflection work with these types? (Again, see F#.)
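
The flow analysis described in item 1 is what TypeScript calls narrowing, and a small sketch there shows both halves of it: narrowing inside a type check and narrowing after an assignment. The function name joinStrings and the upper-casing step are assumptions added purely to demonstrate member access; nothing here implies VB.NET would behave identically.

```typescript
function joinStrings(toPrint: string | string[]): string {
    // toPrint.toUpperCase() would not compile at this point,
    // because string[] has no such member.
    if (typeof toPrint === "string") {
        // Inside the check, toPrint is narrowed to string.
        const upper: string = toPrint.toUpperCase();
        // The assignment is allowed by the declared type of the parameter,
        // and it narrows toPrint to string[] from here on.
        toPrint = [upper];
    }
    // Every path reaches this line with toPrint narrowed to string[],
    // so the array-only member join is available.
    return toPrint.join(",");
}

const fromSingle = joinStrings("ab");
const fromList = joinStrings(["a", "b"]);
```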

Contexts

Composable types should be allowed in the following contexts:

They should not be allowed in the following contexts:

Links

zspitz commented 7 years ago

@AdamSpeight2008 Regarding this:

if (TypeOf obj Is TypeA Or TypeB) then could be an issue with operator precedence, e.g. being treated as TypeOf (obj Is (TypeA Or TypeB))

Isn't TypeOf <x> Is <Type> the only valid syntax for TypeOf? Quoting from the docs:

TypeOf is always used with the Is keyword to construct a TypeOf...Is expression, or with the IsNot keyword to construct a TypeOf...IsNot expression.

Presumably, obj Is (TypeA Or TypeB) would resolve to a Boolean. What would the meaning of TypeOf <BooleanExpression> be?

AnthonyDGreen commented 7 years ago

@zspitz we already support a limited form of intersection types in a generic method or type, because the constraints are all intersected. I've thought a lot about how useful this would be in other places. In fact, when we looked at adding interfaces to the Roslyn syntax tree, one thing that stopped us was the inability to specify a parameter as SyntaxNode AND IXyzSyntax. This meant sacrificing all the members on SyntaxNode when using the interface type or casting constantly, so I love the idea.

The thing that concerns me the most is, ironically, verbosity and repetition in method signatures. Your proposal for type aliasing reduces it somewhat, though. I'm less optimistic about union types though. It seems rarer that I'd need them, and a bit unwieldy to use them. I guess one benefit @gafter would bring up is that they do give you a way to enforce that a Select Case/pattern match handles all cases. I know it frees us from having to define type hierarchies that depend on base classes. But do people often have code that can work with an Apple or a CarEngine?
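
The exhaustiveness point can be illustrated in TypeScript, which already has union types. This is a sketch under assumptions, not a description of any VB.NET or Roslyn feature: the `kind` tag field and the `describe` function are hypothetical, and Apple and CarEngine are borrowed from the question above.

```typescript
type Apple = { kind: "apple"; variety: string };
type CarEngine = { kind: "engine"; cylinders: number };

// The union lists every case a value of this type can be.
type Item = Apple | CarEngine;

function describe(item: Item): string {
    switch (item.kind) {
        case "apple":
            return "apple: " + item.variety;
        case "engine":
            return "engine with " + item.cylinders + " cylinders";
        default: {
            // All members of Item are handled above, so item has type never here.
            // Adding a third member to Item without a matching case makes this
            // assignment a compile-time error.
            const unhandled: never = item;
            return unhandled;
        }
    }
}

const described = describe({ kind: "apple", variety: "Fuji" });
```

Because the compiler knows the full set of alternatives, a missed case is reported at build time rather than falling through at runtime.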

zspitz commented 7 years ago

@AnthonyDGreen @gafter I've rewritten and expanded this proposal; specifically the use case for union types.

zspitz commented 6 years ago

@AnthonyDGreen

they do give you a way to enforce that a Select Case/pattern match handles all cases

Could you clarify this? I don't see how the two are related.


But do people often have code that can work with an Apple or a CarEngine?

Perhaps not. But there is often code that works on some end-type, and there are some intermediate types which need to be mapped to the end-type somehow. Currently, such a variable has to be defined as Object:

Dim toPrint As Object = ReturnsSingleOrMultipleStrings 'return type of Object
If TypeOf toPrint Is String Then toPrint = {toPrint}
For Each s In toPrint
    Console.WriteLine(s)
Next

which means no type-safety or Intellisense:

toPrint = new Random ' no compilation error

However, using union types:

Function ReturnsSingleOrMultipleStrings() As IEnumerable(Of String) Or String
End Function

Dim toPrint = ReturnsSingleOrMultipleStrings

'the following line would be a compilation error; Random is not compatible with either String or IEnumerable(Of String)
'toPrint = new Random 

'the following line would also fail to compile, because toPrint might be an IEnumerable(Of String)
'Console.WriteLine(toPrint.Length)

If TypeOf toPrint Is String Then
    '"effective type" of toPrint is now String
    Console.WriteLine(toPrint.Length) 'will now compile

    toPrint = {toPrint} 'this assignment is allowed by the original definition of toPrint
    '"effective type" of toPrint is now IEnumerable(Of String)
End If

'Since all branches of the If result in toPrint being an IEnumerable(Of String), the effective type of toPrint is IEnumerable(Of String)
Dim s1 = toPrint.FirstOrDefault ' s1 is typed as String
For Each s In toPrint
    Console.WriteLine(s)
Next

A current workaround would be to introduce a temporary variable:

Dim temp = ReturnsSingleOrMultipleStrings 'return type of Object
Dim toPrint As IEnumerable(Of String)
If TypeOf temp Is String Then
    toPrint = {temp}
ElseIf TypeOf temp Is IEnumerable(Of String) Then
    toPrint = temp
Else
    Throw New InvalidOperationException
End If

But that means cluttering up the code with a new name for the sole purpose of type compatibility, and is also less concise than this:

Function ReturnsSingleOrMultipleStrings() As IEnumerable(Of String) Or String
End Function

Dim toPrint = ReturnsSingleOrMultipleStrings
If TypeOf toPrint Is String Then toPrint = {toPrint}

a bit unwieldy to use them

I'm not quite sure what you mean. True, it's simpler to type as Object, but also less typesafe. For the use case of overloads which exist only for the purpose of mapping a value from one type to the next and add no other functionality, I would think multiple overloads would be more unwieldy.

zspitz commented 6 years ago

Pinging @KathleenDollard

KathleenDollard commented 6 years ago

Interesting. I'd like other folks' thoughts. I'm seeing moderate improvements in keystroke/line count with a new layer of conceptual thinking. And, I don't know how we'd implement it, particularly as a language feature, unless you envision simply expanding the shorthand approach to the longer form in the examples you showed.

Bill-McC commented 6 years ago

I think it adds complexity to the code: you need to have in mind different types and look at all the code branching to know what type you are dealing with. And you'd have code that would be fragile as in difficult to refactor. You'd have cases where calling Foo(bar) in the same method would result in compilation failures even though bar was in scope simply because it was a different type. In all, I don't see what it adds compared to what it costs as suitable.

zspitz commented 6 years ago

@KathleenDollard

moderate improvements in keystroke/line count with a new layer of conceptual thinking

The primary benefits are a more precise representation of the logic behind the code:

Improvements in keystroke/line count are a good thing; but I don't think that justifies such a change. However, I think making code a better reflection of what happens at runtime is a worthwhile goal.

a new layer of conceptual thinking

Isn't that a good thing as long as it's within the design goals of the language?

a new layer of conceptual thinking

I think it's not as earthshattering as all that. With Dim s As String, I am setting down ground rules for the proper use of the object/value behind s -- can be concatenated with other strings to form a new string using +, has a Length property etc. Using composable types, I am setting down a slightly more flexible ground rule -- this object/value is either a String or an Integer; or this object/value both inherits MyClass and implements MyInterface.

I don't know how we'd implement it

For union types, we could use a helper Either(Of T1, T2) type (inspired by a suggestion in the corresponding C# proposal):

Public Structure Either(Of T1, T2)
    ReadOnly Property Item1 As T1
    ReadOnly Property Item2 As T2
    Public ReadOnly IsFirst As Boolean?
    Sub New(Item1 As T1)
        Me.Item1 = Item1
        IsFirst = True
    End Sub
    Sub New(Item2 As T2)
        Me.Item2 = Item2
        IsFirst = False
    End Sub
End Structure

and the compiler rewrite could look like this:

' assignment
'Dim x As String Or Integer
'If rnd.NextDouble > 0.5 Then
    'x = "abcd"
'Else
    'x = 5
'End If

Dim x As Either(Of String, Integer)
If rnd.NextDouble > 0.5 Then
    x = New Either(Of String, Integer)("abcd")
Else
    x = New Either(Of String, Integer)(5)
End If

' see PrintStrings sub above
Sub PrintStrings(toPrint As Either(Of IEnumerable(Of String), String))
    If Not toPrint.IsFirst Then
        toPrint = New Either(Of IEnumerable(Of String), String)({toPrint.Item2})
    End If

    For Each s In toPrint.Item1
        Console.WriteLine(s)
    Next
End Sub

Intersection types could be implemented with a similar helper type:

Public Structure Both(Of T1, T2)
    ReadOnly Property AsT1 As T1
    ReadOnly Property AsT2 As T2
    ReadOnly Property Initialized As Boolean
    Sub New(AsT1 As T1, AsT2 As T2)
        Me.AsT1 = AsT1
        Me.AsT2 = AsT2
        Initialized = True
    End Sub
End Structure

and given these classes:

Public Class BaseClass
    Property I As Integer
End Class
Public Interface MyInterface
    Property J As Integer
End Interface
Public Class DerivedClass
    Inherits BaseClass
    Implements MyInterface
    Property J As Integer Implements MyInterface.J
End Class

the compiler rewrite could look like this:

'Dim derived = New DerivedClass
'Dim y As BaseClass And MyInterface = derived
'Console.WriteLine(y.I)
'Console.WriteLine(y.J)
'Console.WriteLine(y.ToString())

Dim derived = New DerivedClass
Dim y = New Both(Of BaseClass, MyInterface)(derived, derived)
Console.WriteLine(y.AsT1.I) ' because I is a member of BaseClass
Console.WriteLine(y.AsT2.J) ' because J is a member of MyInterface
Console.WriteLine(y.AsT1.ToString()) 'because ToString is part of a shared base class (Object.ToString) this can be arbitrary

Type checking against a union type would look like this:

'If TypeOf z Is String Or Integer Then
If TypeOf z Is String Or TypeOf z Is Integer Then

Type checking against an intersection type would look like this:

'If TypeOf z Is BaseClass And MyInterface Then
If TypeOf z Is BaseClass And TypeOf z Is MyInterface Then

N.B. There is still a further issue of how to handle multiple levels of composed types.

zspitz commented 6 years ago

@Bill-McC

I think it adds complexity to the code

Only as a reflection of the complexity of the underlying code logic. Currently, if my code deals with some object/value which I know is either a String or an Integer, I cannot describe this to the compiler; I have to drop down to Object in the declaration in order to allow for both possibilities (unless I introduce additional variables). Once I am using Object, I could potentially assign to it something which is neither String nor Integer. If Option Strict is on, I will have to check the type at runtime and cast to String or Integer in order to make use of the appropriate value; and if Option Strict is off, I run the risk of using members of String when the value is actually Integer, and vice versa.

If my code deals with something that I know is both a BaseClass and implements MyInterface, I cannot tell the compiler about it either; I will have to cast to either type in order to make use of relevant members.

you need to have in mind different types and look at all the code branching to know what type you are dealing with.

In the scenarios where union types add value, you have to do that anyway: e.g. in this branch, the Object refers to a String, but in that branch the Object refers to an Integer.

you'd have code that would be fragile as in difficult to refactor

Could you elaborate on this?

You'd have cases where calling Foo(bar) in the same method would result in compilation failures even though bar was in scope simply because it was a different type.

Is that a bad thing? The alternative is to have a runtime failure of the call to Foo(bar) at this point, because bar refers to an object/value of the wrong type.

Note that there is a proposal to enforce this in general for the TypeOf statement (#172).


All of these seem to be criticisms of the union types part of my proposal, and are not relevant to intersection types or to multitype checking.

zspitz commented 6 years ago

@Bill-McC @KathleenDollard @AnthonyDGreen Any further thoughts on this?

Bill-McC commented 6 years ago

Don't have anything to add as such. Not sure the case of string or int is an intersection... the only thing in common is ToString. The case of a base class with an interface does require casting. In the days of COM, there would be an IBaseClass. Thoughts around this lead me towards duck typing, and pseudo types and interfaces. That I think is worth looking at; I think it may still be open under a few proposals here. When there is commonality, it's worth solving. When there is none, Object and casting is clearer.

Regards, Bill.



KathleenDollard commented 6 years ago

I'm not yet sold on this feature. @CyrusNajmabadi has some good comments about why TypeScript needed this and maybe .NET doesn't in the C# discussion of intersection / union types. I want to watch how thinking evolves on this, but I don't yet see the benefit relative to the amount of work across languages and the type system.

KathleenDollard commented 6 years ago

This is the sort of thing where we would want to have serious discussions about whether to do it in the language (via erasure, your suggestion) or with BCL support. Either way, this is likely to be something for which we'd want a broad base of support, probably with C# taking the lead on a rather technical issue.

zspitz commented 6 years ago

@KathleenDollard There is one possible reason why this would be more attractive in VB.NET over C# -- "classic" VB didn't have overloads and was limited to a single function name within a given scope; whereas C#'s syntax is in the main inspired by C-like languages, which support function overloads. I am not sure if this is still a design goal in VB.NET, but after all VB.NET had the Optional keyword from day one, while it took a few releases for similar functionality to be made available to C#.

TypeScript can serve as a model for the transition from a language without composite types to a language with composite types; but I think the more appropriate comparison is to F#'s composite types, as F# is a statically typed language while still having composite types.


I'm currently rewriting the TypeScript definitions for the Excel object model, and this benefit is actually quite striking. Consider this definition, autogenerated from the registered type lib information:

interface Sheets {
    // ....
    Item(Index: any): any;
    // ...
}

What arguments can be passed into the Item method? What will it return? The answer to that requires a context switch to the browser, browsing to the Excel Sheets object, finding the Item method, and reading the documentation; followed by another cognitive context switch, and hopefully passing in the arguments correctly.

Arguably, it's possible (and advisable) to embed the documentation in the file, and it would then be available in the editor. However, I still have no guarantee that the values I pass in are of the correct type for this context.

But the following definition describes the possible values simply and concisely, and also enforces it at compile time:

interface Sheets {
    // ....
    Item(Index: string | number): Sheet;
    Item(Indexes: SafeArray<string | number>): Sheets;
    // ...
}

Now, this could be expressed using overloads:

interface Sheets {
    // ....
    Item(Index: string): Sheet;
    Item(Index: number): Sheet;
    Item(Indexes: SafeArray<string>): Sheets;
    Item(Indexes: SafeArray<number>): Sheets;
    // ...
}

but now I have four method signatures to read instead of two (aside from the overloads not being quite accurate, because a single SafeArray could hold both numbers and strings).
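
To show how the two union-typed signatures behave at a call site, here is a small self-contained sketch. It is not the Excel object model: SheetList, its in-memory lookup, the plain array standing in for SafeArray, and the 1-based numeric index are all assumptions made for illustration.

```typescript
interface Sheet {
    name: string;
}

class SheetList {
    private readonly sheets: Sheet[];

    constructor(sheets: Sheet[]) {
        this.sheets = sheets;
    }

    // Two public signatures instead of four, thanks to the union types.
    item(index: string | number): Sheet;
    item(indexes: Array<string | number>): SheetList;
    item(arg: string | number | Array<string | number>): Sheet | SheetList {
        // Array.isArray narrows arg to the array case; otherwise it is string | number.
        if (Array.isArray(arg)) {
            return new SheetList(arg.map(i => this.lookup(i)));
        }
        return this.lookup(arg);
    }

    private lookup(index: string | number): Sheet {
        const found = typeof index === "number"
            ? this.sheets[index - 1] // assumption: 1-based numeric index
            : this.sheets.find(s => s.name === index);
        if (found === undefined) {
            throw new Error("No such sheet: " + index);
        }
        return found;
    }
}

const list = new SheetList([{ name: "Data" }, { name: "Summary" }]);
const byName = list.item("Data");             // typed as Sheet
const subset = list.item(["Summary", 1]);     // typed as SheetList; mixed index types are allowed
```

The last call is the case the four-overload version cannot express: a single array holding both a name and a number.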

bandleader commented 6 years ago

@zspitz Any further thoughts on this?

Just that for me, the main use case would be union types, for simplifying overloads (as you mentioned as well), i.e. sometimes I can take a URI or a string, etc., because I'll just convert one to the other... so typing it as URI Or String would save an overload. And if I have two parameters that can each be either a URI or a string, then I save three overloads.

Also, I can even abstract away the conversion code:

Function Foo(one As (URI Or String), two As (URI Or String))
  Dim getUri = Function(s As (URI Or String)) If(TypeOf s Is URI, s, New URI(s))
  Dim uriOne = getUri(one), uriTwo = getUri(two)
  'All done! Use uriOne and uriTwo here...
End Function

Berrysoft commented 6 years ago

If you write Either(Of T1, T2) with two Operator CType:

Public Structure Either(Of T1, T2)
    Public ReadOnly Property Item1 As T1
    Public ReadOnly Property Item2 As T2
    Public ReadOnly IsFirst As Boolean?
    Public Sub New(Item1 As T1)
        Me.Item1 = Item1
        IsFirst = True
    End Sub
    Public Sub New(Item2 As T2)
        Me.Item2 = Item2
        IsFirst = False
    End Sub
    Public Shared Widening Operator CType(value As T1) As Either(Of T1, T2)
        Return New Either(Of T1, T2)(value)
    End Operator
    Public Shared Widening Operator CType(value As T2) As Either(Of T1, T2)
        Return New Either(Of T1, T2)(value)
    End Operator
End Structure

You can write a function like this:

Function Foo(one As Either(Of Uri, String), two As Either(Of Uri, String))
    Dim getUri = Function(s As Either(Of Uri, String)) If(s.IsFirst, s.Item1, New Uri(s.Item2))
    Dim uriOne = getUri(one), uriTwo = getUri(two)
    'Use uriOne and uriTwo here...
End Function

And call it like:

Foo("https://www.google.com/", New Uri("https://github.com/"))

zspitz commented 6 years ago

@Berrysoft True. However, since we're talking about the compiler rewriting this:

Function Foo(one As String Or Uri)
    If TypeOf one Is String Then one = New Uri(one)

    'do something with Uri object
    Console.WriteLine(one.Fragment)
End Function

to something like this:

Function Foo(one As Either(Of String, Uri))
    If one.IsFirst Then one = New Either(Of String, Uri)(New Uri(one.Item1))

    'do something with Uri object
    Console.WriteLine(one.Item2.Fragment)
End Function

I'm not sure if there would be any benefit if the compiler would rewrite to:

Function Foo(one As Either(Of String, Uri))
    If one.IsFirst Then one = New Uri(one.Item1) 'leveraging the CType operator overload

    'do something with Uri object
    Console.WriteLine(one.Item2.Fragment)
End Function

zspitz commented 5 years ago

@Berrysoft Of course, for interop purposes, the casting operators would be very valuable; and in fact the F# compiler does this with discriminated unions with multiple types.

zspitz commented 3 years ago

Shoutout to the OneOf library, which allows creating union types in .NET.