twinbasic / lang-design

Language Design for twinBASIC

Constrained generics #17

Open FullValueRider opened 3 years ago

FullValueRider commented 3 years ago

Is something like the below on the roadmap for twinbasic?


module GenericDayDreaming

    Public Incrementable as TypeDescription = Byte|Integer|Long|Date|Single|Double

    public sub DoSomething(of T in Incrementable)(A As T)

    end sub

end module

WaynePhillipsEA commented 3 years ago

Yes, I think we should do something in this area. Using the VB.NET syntax, it would be something like this:

Public Sub DoSomething(Of T As {Byte, Integer, Long, Date, Single, Double})(A As T)
End Sub

Though I admit I'm not all that keen on the curly braces. Also, might need to consider having default groups like 'Numeric' to simplify.

FullValueRider commented 3 years ago

Whichever format/syntax you eventually decide on it would be great if it could also be used against variables declared as variant or object. You could make the curly brackets optional.

e.g. something like

myVar as variant of Byte|Integer|Long|Date|Single|Double

WaynePhillipsEA commented 3 years ago

Public Sub DoSomething(Of T As Byte Or Integer Or Long Or Date Or Single Or Double)(A As T)
End Sub

I prefer this syntax to the curly braces, and the semantic highlighting will help make the datatypes clearer. Still open to suggestions though.

myVar as variant of Byte|Integer|Long|Date|Single|Double

A restricted-Variant type would impact runtime performance due to all the extra runtime type checking that would need to be done.

bclothier commented 3 years ago

In the case of restricted-Variant type, you can enforce it at compile-time without any runtime penalty by disallowing it for parameters of a Public method, and allowing it only for internal use. That would in turn require an explicit conversion from a plain variant into a restricted variant type.

Incidentally, that would dovetail nicely with the twinbasic/lang-design#11...

mwolfe02 commented 3 years ago

Public Sub DoSomething(Of T As Byte Or Integer Or Long Or Date Or Single Or Double)(A As T)
End Sub

This feels much more like BASIC than curly braces. One of the things that has always made BASIC accessible to new users is its preference for plain English over symbols with special meanings. The price for that is often more verbose code.

The Or syntax honors that philosophy more so than curly braces.

FullValueRider commented 3 years ago

I assumed that variant/object method parameters could be mapped easily to a generic equivalent 'behind the scenes', however I'd have to admit that it might be necessary to drop constrained variants/objects just declared as variables as there doesn't seem to be a 'generic' mapping available.

If I had to express a preference it would be to allow constrained variant/object as method parameters, and to accept that constrained object/variants will not be possible for ordinary variables.

mansellan commented 3 years ago

Not sure about allowing constraints to specify "any of" semantics (with Or). Consider the following:

Public Sub DoSomething (Of T As Excel.Range OR Word.Range) (Range As T)
End Sub

What can you usefully reason about T? You know it's one of the two types, but you still have to examine and switch on the type before you can do any useful work, as the interfaces differ radically. That doesn't seem to offer much gain over just accepting a Variant?

By contrast, in .Net, constraints use "all of" semantics (i.e. And). See the VB.Net docs, but note that the rules are the same in C# (and presumably F#).

This could be expressed as:

Public Sub DoSomething (Of T As IDisposable And ICollection) (Collection As T)
End Sub

(the interfaces are hypothetical)

This is much easier to reason about: T must satisfy every listed type, so the members available on T are the union of their members. All type members of IDisposable and ICollection can be bound directly and shown in Intellisense.
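
As a rough illustration outside twinBASIC, the "all of" idea can be sketched in Python, used here only because it runs today. Everything named below (`Disposable`, `Countable`, `DisposableCollection`, `drain`, `Bag`) is hypothetical, and Python only verifies the bound in a static type checker, not at runtime:

```python
from typing import Protocol, TypeVar


class Disposable(Protocol):
    def dispose(self) -> None: ...


class Countable(Protocol):
    def count(self) -> int: ...


class DisposableCollection(Disposable, Countable, Protocol):
    """Satisfied only by types that provide BOTH dispose() and count()."""


# "And" constraint: T must satisfy every member of the combined protocol.
T = TypeVar("T", bound=DisposableCollection)


def drain(collection: T) -> int:
    # Both calls can be bound directly because the bound guarantees them.
    remaining = collection.count()
    collection.dispose()
    return remaining


class Bag:
    """Hypothetical type that happens to satisfy both protocols."""

    def __init__(self, items):
        self.items = list(items)

    def count(self) -> int:
        return len(self.items)

    def dispose(self) -> None:
        self.items.clear()
```

Calling `drain(Bag([1, 2, 3]))` needs no type switch inside `drain`, which is the gain over an "any of" constraint.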

Note that .Net does have a few "general" constraints (in VB.Net, New, Structure and Class). I wonder if a similar approach could be valid here - maybe have an additional Numeric constraint which allows any number type?

FullValueRider commented 3 years ago

Having good tools means that you can still shoot yourself in the foot if you want to. The original example I proposed was to select types that could be incremented by 1. This is not simple to do using either generics or interfaces. The more general point is that because VBA/VB6 is an untyped language which has strong built in hinting, the language desperately needs a more rational way of reasoning about types (in the same way as it desperately needs the ability to easily populate collection object/arrays). In fact it might even be reasonable that types themselves could have their own set of properties such as IsNumericallyIncrementable, IsConcatenatable.

I see a big danger for twinbasic in becoming a basic based C#alike rather than something that has different ways of thinking and methods for doing things. Any solutions about reasoning about types should be a way of bringing such reasoning into compile time checking rather than deferring to runtime checking.

mansellan commented 3 years ago

I'm sure twinBASIC would never turn into a C#alike, because of its fundamental design imperatives (VB.Net's take follows each colon):

  1. Native compilation : not managed-runtime
  2. Reference counted : not garbage collected
  3. COM-native : not COM interop
  4. Complete backwards compatibility : not sorta-similar-don't-look-behind-the-curtain

I have no doubt that these goals are immutable and non-negotiable, just as they should be IMO.

I'd like to think that twinBASIC can become what Microsoft would have evolved VB6 into, had they not been distracted elsewhere. I'm hoping that means taking the best ideas from the last 20 years, allowing legacy VBx code a natural migration to current best practices.

Some of those ideas might come from the .Net ecosystem, some might come from elsewhere, such as Python or FreeBasic. So long as they are implemented sympathetically and elegantly, and fit neatly into the BASIC language, I don't foresee a problem.

With regard to generics, type theory is properly, intractably hard. We haven't even mentioned generic variance yet* (covariance and contravariance), largely because it's of lesser value until class inheritance is available. But that took Microsoft until version 4 of .Net to implement, as it's really, really tough to get right. But it's impossible without a rock-solid type system to begin with, as Java's lackluster implementation proved.

* Oops, I just did...

bclothier commented 3 years ago

One problem we need to recognize is that generics are for objects and those aren't objects. In .NET, everything is an object; there are no primitive data types, and there's the concept of boxing/unboxing. That simply won't exist here in twinBASIC.

With a variant structure, we can potentially restrict what will be stored in the structure, so in that context it can make sense to talk about a constrained variant but that wouldn't be "generic" in the same way objects can be "generic"-ized.

We need to consider whether we should require different syntax for reasoning about primitive data types vs. objects; otherwise it can lead to confusion ("hey why can't I have both Long and MySuperDuperLong?"). Based on that point, I think it makes more sense to make a distinction between a constrained variant vs. a constrained generic. However, I would hate to have to pay a performance penalty for using a Variant behind the scenes. To avoid that, the compiler needs to be smart enough to use the most efficient storage for the constrained variable without running into errors from storing a value that is too big or doing something illegal like CByte(255) + 1, and without doing all the runtime checks in every place.

mansellan commented 3 years ago

In .Net, you can't declare constraints to be value types like this:

Sub DoSomething (Of T As Integer) (SomeValue As T)
End Sub

For the simple reason that value types (be they primitives or Structures) cannot be inherited. So only an Integer could ever satisfy the constraint. So there's no point to making it generic...

I don't see why twinBASIC would work any differently, but I can see there may be some value in a Numeric constraint:

Function Increment (Of T As Numeric) (SomeValue As T) As T
   Return SomeValue + 1
End Function

Numeric, in this case, would cover the built-in number primitives, which are known to implement a common set of arithmetic, comparison and logical operators.

Edit: Or maybe Number would be a better term.... shrugs

bclothier commented 3 years ago

While they may implement a common set of operators, they aren't the same operators, and the compiler has to emit the correct instructions for each data type (e.g. floating-point math is different from integer math, currency/decimal requires special handling, dates may require additional normalization that a double doesn't need, etc.). I don't know if we can statically enforce this at compile-time without paying the runtime penalty of checking the data type, which would mean it's no better than using a Variant data type.

For internal calls where the compiler has the full visibility, it probably can infer what data type is being used as the input and thus generate correct assembly instructions. But if it's a Public method, called by an external caller.... then what? As I said earlier, constrained variants likely will only work inside the tB's boundary but not outside. That complicates the scenario where a tB project needs to depend on another tB project since the call would have to be as if it was another COM client and therefore pay the runtime penalty of casting to a variant and back to a constrained variant.

Also, what could this mean?

Function Increment (Of T As Long Or String) (SomeValue As T) As T
   Return SomeValue + 1
End Function

The + operator is valid for a String, so why not? ;-)

Or what about this?

Function Increment (Of T As Long Or Double) (SomeValue As T) As T
   Return SomeValue + 0.01
End Function

Do I get a Long back or do I now get a Double due to the implicit conversion?

The point here is that because operators can exist on different data types, this doesn't work as a clue to whether it can be grouped and the implicit conversion rules will further confuse the matter.

A set of overloaded functions would actually fulfill the requirement better:

Function Increment(SomeValue As Long) As Long
   Return SomeValue + 0.01
End Function

Function Increment(SomeValue As String) As String
   Return SomeValue + 0.01
End Function

Function Increment(SomeValue As Double) As Double
   Return SomeValue + 0.01
End Function

Here, it's now more explicit and readable and we have a clearer idea of what to expect when we pass a Double, a Long or a String into function Increment than if we used the generic.
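
For comparison, a minimal Python sketch of the same "one explicit implementation per type" idea, using the standard library's `functools.singledispatch`. The per-type behaviours chosen here (adding 1, 0.01 or "1") are assumptions picked only to show that each type gets its own visible rule; they are not taken from the overloads above:

```python
from functools import singledispatch


@singledispatch
def increment(some_value):
    # Fallback for any type without a registered implementation.
    raise TypeError(f"increment() does not support {type(some_value).__name__}")


@increment.register
def _(some_value: int) -> int:
    # Integers stay integers: no implicit widening to float.
    return some_value + 1


@increment.register
def _(some_value: float) -> float:
    return some_value + 0.01


@increment.register
def _(some_value: str) -> str:
    # What "+" means for a string is spelled out, not inferred.
    return some_value + "1"
```

Each registered function documents what the caller gets back for that input type, which is the readability argument made above.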

That is not to say that constrained variants aren't potentially useful. In fact, my huge annoyance with Office's libraries is that they love using Variants everywhere for their arguments but actually expect only one or two subtypes of Variant (usually some kind of enum or maybe only a choice of string or a number). I'd love to have this:

Function DoSomething(Of T As Long Or String) (Whatever As T)
  Select Case Whatever
    Case Is Long Loo
      Return Loo + 1
    Case Is String Yarn
      Return "A piece of " & Yarn
  End Select
End Function

Yes, switching may mean more verbose code but again, we want explicit & readable code. Note that we don't even need to add the boilerplate for checking the data types, nor do we need to throw an error when the input is neither a Long nor a String. That can be done at compile-time, or at runtime for the external COM callers, which will see a Variant.
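
A hedged Python sketch of that switch-on-subtype pattern. `LongOrString` and `do_something` are hypothetical names, and the explicit `TypeError` stands in for the check that a twinBASIC compiler (or its COM boundary) would hopefully generate automatically:

```python
from typing import Union

# Hypothetical stand-in for "Variant Of Long Or String".
LongOrString = Union[int, str]


def do_something(whatever: LongOrString) -> LongOrString:
    # Switch on the actual subtype, as in the Select Case above.
    if isinstance(whatever, int):
        return whatever + 1
    if isinstance(whatever, str):
        return "A piece of " + whatever
    # Written by hand here; the proposal is that the language would
    # supply this rejection without any boilerplate.
    raise TypeError(f"expected int or str, got {type(whatever).__name__}")
```

A static checker can reject `do_something(1.5)` before the program runs, while the runtime check covers callers that bypass the checker, which loosely mirrors the internal vs. external caller split discussed above.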

mansellan commented 3 years ago

All excellent points, and entirely correct, but I think we're talking at cross purposes...

I'm thinking initially of generic constraints, rather than Variant constraints. To be honest, I hadn't really clocked the difference until just now. But I was thinking specifically about type constraints, ignoring entirely for the moment the existence of Variant. Such constraints would naturally not be available over the COM boundary, they would necessarily be twinBASIC-only.

As I commented earlier, I really can't see a use for Or in generic constraints for the reasons you note above. .Net always considers constraints on a type param to be additive, not alternative, as it's the only way to be able to infer any meaningful bindings.

So (disregarding Variant for a sec), this becomes clear:

Function Increment (Of T As Numeric) (SomeValue As T) As T
   Return SomeValue + 0.01
End Function

Sub TestIt()
   Dim x As Long = 1
   Dim y As Double = 1

   MsgBox Increment (Of Long) (x) ' 1
   MsgBox Increment (Of Double) (y) ' 1.01

   ' Or more succinctly, using inference:
   MsgBox Increment(x) ' 1
   MsgBox Increment(y) ' 1.01
End Sub

The generic is typed as either Long Or Double when the generic is closed. And, per the signature, the return type must match the type param, avoiding any implicit coercion. It would be functionally equivalent to authoring overloads for all number types, returning the same type as taken in.
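
To make the "return type must match the type param" point concrete, here is a small Python sketch using a constrained `TypeVar`. The name `Numeric` and the explicit conversion back to the input type are assumptions of this sketch. Python's `int()` truncates where VB's implicit conversion to Long rounds, so the two only agree for values like the ones shown:

```python
from typing import TypeVar

# Constrained type variable: each call is "closed" over exactly one of these.
Numeric = TypeVar("Numeric", int, float)


def increment(some_value: Numeric) -> Numeric:
    # Convert the result back to the caller's type so that an int input
    # cannot silently come back as a float.
    return type(some_value)(some_value + 0.01)


x = 1    # plays the role of the Long
y = 1.0  # plays the role of the Double
result_x = increment(x)  # stays an int
result_y = increment(y)  # stays a float
```

The behaviour matches writing one overload per number type, each returning the type it was given.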

Now, as for what happens when you call such a method with a Variant-that-happens-to-be-a-Long-right-now, I haven't really thought about it enough. Maybe Variants would need to be excluded from using generic methods (not sure if that's workable)...

Edit: Actually, I think if you pass such a method a Variant, the generic should close over Variant, equivalent to:

Function Increment(Value As Variant) As Variant
   Return Value + 0.01
End Function

Which would return 1.01 regardless of whether the supplied value was an integer or a floating point, because of implicit coercion. If that were allowed though, a Variant which wasn't holding a number would have to be a runtime error (13: Type Mismatch), which seems to go against the grain quite a bit. Might be easier to just disallow Variants as generic parameters altogether.

mansellan commented 3 years ago

I've been thinking about this some more, and I think this is actually two feature requests:

  1. A generic constraint system
  2. A restricted-Variant system

I've talked about 1 above. 2 could look something like this:

Dim SomeVariable As Variant Of Byte Or Integer Or Long

For 2. I think Or makes perfect sense, but I'm still convinced that it doesn't make sense for 1.

As @bclothier said, it should be possible to avoid a runtime penalty for 2 when it's used internally. I would rather still allow it to be used externally though, even with the associated extra runtime cost (as long as the compiler is smart enough to know the difference and optimise accordingly).

Example 1: Generic method

Public Function DoSomething (Of T As IEnumerable And IDisposable) (SomeValue As T) As T
End Function

Example 2: Restricted-Variant in a non-generic method:

Public Function DoSomething(SomeValue As Variant Of String Or Date) As Boolean
End Function

These could even be used concurrently (not shown for brevity)

mansellan commented 3 years ago

What about a slightly different syntax? Thinking the below could be clearer than the route VB.Net took, and could also accommodate both type and Variant constraints:

Non-Reference type

Public Function DoSomething(Of T) (SomeValue As T) As Long _
   Where T Is Value
End Function

Reference type with interface

Public Function DoSomething(Of T) (SomeValue As T) As Long _
   Where T Is Class And IEnumerable
End Function

Variant

Public Function DoSomething(Of T) (SomeValue As T) As Long _
   Where T Is Variant Of Integer Or Long Or Byte
End Function

One other consideration - in .Net, only the name, return type and parameter types are considered when identifying candidates for overload resolution - generic constraints are not part of the equation. So methods that differ only by constraints are not allowed.

I think this is because of the history of .Net - generics were only introduced in version 2. Given a clean slate as with twinBASIC, I see no reason why constraints could not be included in overload resolution - @WaynePhillipsEA do you think this is so?

If constraints could be used in overload resolution, it could solve the Let / Set problem recently discussed in VB Forums:

Public Class Foo(Of T)

  Dim _backingField As T

  Public Sub Add(item As T) Where T Is Value
     _backingField = item
  End Sub

  Public Sub Add(item As T) Where T Is Class
     Set _backingField = item
  End Sub

End Class

wqweto commented 3 years ago

Btw, for prior art on syntax besides VB.Net, there are trait constraints in Rust, which also have a dual syntax: inline bounds vs. a separate where clause used for more complex expressions.

Greedquest commented 8 months ago

Another view point on this:

If (per the original example) we want to increment x, then rather than constraining x As Numeric, what about "x supports the + operator", or "x supports the + operator with an integer on the rhs"?
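
One way to picture an operator-based constraint is a structural protocol, sketched here in Python. `SupportsAddInt` and `increment` are hypothetical names, and the constraint is only checked by a static type checker, so an unsupported type still fails at runtime:

```python
from typing import Protocol, TypeVar

T = TypeVar("T", bound="SupportsAddInt")


class SupportsAddInt(Protocol):
    """Any type whose + operator accepts an integer on the right-hand side."""

    def __add__(self: T, other: int) -> T: ...


def increment(x: T) -> T:
    # The only thing assumed about x is that "x + 1" is defined.
    return x + 1
```

Here `increment(1)` and `increment(1.5)` both type-check, while a checker would flag `increment("a")` because `str` does not accept an integer on the right of `+`.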