twinbasic / lang-design

Language Design for twinBASIC
MIT License

Extension Methods #43

Open mansellan opened 2 years ago

mansellan commented 2 years ago

Is your feature request related to a problem?

C# and VB.Net allow extension methods to be defined, which can add functionality to compiled classes for which you do not have access to the source code. This can open up tremendous capabilities for library authors - LINQ was built around them.

Describe the solution you'd like

A means to extend existing types:

Public Extension Sub ToCsv(Source As Excel.Range, FileName As String)
  ' Code to save the supplied range as a CSV file
End Sub

Public Sub TestIt()
  Range("B4:AB250").ToCsv("C:\MyReport.csv")
End Sub

The first parameter accepts an instance of the type to be extended. I've suggested a new keyword Extension here as both the C# and VB.Net syntaxes are problematic. C# uses this on the first parameter, which (IMO) is not very descriptive. VB.Net uses an attribute, which seems even worse, and also allows the extension param to be sent ByRef, which is nonsensical.

Ideally, it would be possible to add not only Sub and Function, but also Property (think Extension Property Get FullName).

Selection of potential extensions should follow standard overload resolution rules, and prefer the most derived type. For example, in C#, the LINQ extension for .Count() on an ICollection can use the Count property (O(1)) rather than having to iterate all items (O(n)) as it would for an IEnumerable.

Additional Context

This has been mentioned several times, but I don't think an issue has ever been raised for it.

bclothier commented 2 years ago

@mansellan do you consider the badly named twinbasic/lang-design#7 to address this request? If so, I could rename it to clarify the scope for that issue.

FullValueRider commented 2 years ago

This touches on something I posted on Discourse last night.

It may sound ungrateful, but I think @WaynePhillipsEA missed an opportunity when designing the twinBasic compiler by not implementing uniform function call syntax.

https://en.wikipedia.org/wiki/Uniform_Function_Call_Syntax

The last time I asked about this @WaynePhillipsEA confirmed that the compiler wasn't capable of such shenanigans.

In last night's Discourse comment I wondered if a more limited form of such syntax might be possible for twinBasic by introducing an alias/new keyword for Function (let's call it Method). You would define a Method in exactly the same way as you would a Function, BUT the use of the Method keyword would allow the following:

Public Method DoThis(byxxx p1 as <xxxx>, byxxx p2 as <yyyyy>) as <zzzzz>

'now in code elsewhere

myVar = DoThis(myP1, myP2)

myVar = myP1.DoThis(myP2)

If myP1 DoThis myP2 Then    '  Obviously DoThis would be written to  return a boolean in this case

I hope you would agree that the Method approach would add a lot of benefit to twinBasic.

I'm not sure how this would work from an intellisense perspective but I'm sure it's a solvable problem (in the long term).

mansellan commented 2 years ago

@bclothier ah yes, that's why I couldn't find it. Feel free to close this one.

@FullValueRider interesting, hadn't come across those before. From the (limited) reading I've just done, it seems very similar to (albeit more flexible than?) extension methods. Instinctively, it feels almost too flexible for twinBASIC, perhaps going against the "one obvious way to do something" ethos? Apologies if I've misunderstood.

FullValueRider commented 2 years ago

Well, currently there are no obvious ways to add new operators or do extension methods in VBx. @WaynePhillipsEA may have something up his sleeve, but who knows. The Method approach (if it were possible) would leave legacy VBx untouched, but offer new and exciting ways to do things for those who so wish. Perhaps the Method approach would be a significant differentiating factor for twinBasic? There is nothing new to learn in terms of setting up Methods; it's entirely the same as Function definitions. Only the way in which Methods would be used is new.

KDGundermann commented 2 years ago

Reading https://en.wikipedia.org/wiki/Uniform_Function_Call_Syntax

to be called using the syntax for method calls (as in object-oriented programming), by using the receiver as the first parameter, and the given arguments as the remaining parameters.

We already have this in VB, as VB is (partially) object-oriented. You define classes and methods/functions on this class, so you call

    Dim v1 as new Vector
    Dim v2 as new Vector
    v1.Init(x:= -1, y:=4)
    v2.Init(x:= 5, y:= -2)
    v1.Add(v2)

and you may even daisy chain them if Add() returns the instance of itself: v1.Add(v2).Add(v1)
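A minimal sketch of that chaining idea, written in Python only so that it can be run as-is. The Vector class and its add method are hypothetical stand-ins for the VB class above, and the constructor replaces the Init call. The point is simply that returning the instance from add is what makes the second call possible.

```python
class Vector:
    """Hypothetical 2D vector mirroring the VB example above."""

    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def add(self, other):
        # Mutate this instance, then return it so calls can be chained.
        self.x += other.x
        self.y += other.y
        return self


v1 = Vector(x=-1, y=4)
v2 = Vector(x=5, y=-2)

# Equivalent of v1.Add(v2).Add(v1): the second call adds v1 to itself.
v1.add(v2).add(v1)
```

Note that because add mutates the receiver, the second call in the chain doubles the intermediate result rather than adding the original v1.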

KDGundermann commented 2 years ago

C# and VB.Net allow extension methods to be defined, which can add functionality to compiled classes ..

C# also has the concept of partial classes, so you can enhance an existing class with additional methods. But as most of the classes in .Net are NOT partial, they had to dodge this by creating the "Extensions".

mansellan commented 2 years ago

Partial classes have to be in the same assembly (and namespace) as each other; you can't use them to extend classes in another assembly. They mainly exist to allow designers like the WinForms designer to keep their codegen out of harm's way.

bclothier commented 2 years ago

IMO, partial classes solve a different problem, which may be more suitable for discussion in a separate issue. I can see this being useful for custom controls for example.

Regarding UFCS, my main issue is that a large number of existing functions are designed to work with Variants. Let's use Left() as an example. This accepts a Variant and returns a Variant.

Dim d As Double
d = "123"
Debug.Print Left(d, 2) '12
Debug.Print d.Left(2) '?!? That seems weird having a Left() on a Double...
Dim v As Variant
v = d
Debug.Print v.Left(2) 'Ok...
Dim s As String
s = d
Debug.Print s.Left(2) 'Is it really the Variant-returning Left() or actually the String-returning Left$()?

As we disallow the Variant from generics, we'd need to define the equivalent function for each data type, and understand that the Double.Left() is the same thing as String.Left() which is the same thing as Left$() which is not the same thing as the Left() / Variant.Left().

Personally, I'd prefer that the Double.Left() didn't exist because we are actually supposed to convert it to string first and deal with the formatting (do we use commas or dots for decimal separator?) before running it through the String.Left(). So it would be more correct to do this:

Dim d As Double
d = 123
d.ToString().Left(2) '12

Which is basically a long-winded and roundabout way of saying that we shouldn't do that for Variants and there must be definitions for strong-typed functions in lieu of the traditional Variant-loving functions. That is especially important considering that there is a rule regarding resolving default member on a Variant containing an object, which can lead to insanity.

mburns08109 commented 2 years ago

I'll reiterate here my question from the Custom Enumerators thread... How would or could Extension Methods work in a COM/ActiveX world? Would they not violate the COM interface contract? ...and in a case like considering adding an .Exists(key) method to something like a collection (or any other iEnumerable) class, how would any extension method gain access to the COM Class' internals (like the collection class members)?

...or, for that matter, partial classes - same set of questions for the COM/ActiveX world - CAN they be fashioned to work at all for COM?

FullValueRider commented 2 years ago

It's probably worth drawing attention to 'Implements Via', which would seem to achieve some of the goals of extension methods, partial classes, etc. The availability of 'Implements Via' makes it very easy to create a new class from an existing class and to add/override methods.

From my perspective I feel that I keep seeing too much discussion about 'Net'ifying twin basic rather than building on the strengths of VBx.

bclothier commented 2 years ago

Keep in mind that extension methods would make it easy to surface functions that act on specific data types, not just classes. I find it far easier to discover functions on a variable than to search a big list of all possible functions:

Dim d As Date

d = ConvertToUTC(Now())

I have to know that there's a function named ConvertToUTC somewhere, and I hope that there's no naming collisions. In contrast:

Dim d As Date

d = Now().ConvertToUTC()

Immediately after typing Now()., I can get an intellisense list of all date-related functions, both built-in and custom, which is much easier to explore than using the object browser. Because they are scoped to only particular data types, that really helps cut down on global namespace pollution. How can that not be BASIC?

FullValueRider commented 2 years ago

I'd argue that that utility is a function of intellisense rather than the language itself (not denying the utility of the discoverability aspect). There would be nothing to stop the intellisense in twinBasic being constructed such that hovering over a variable name pulled up all methods that could be used with that type as the first argument (or listed such methods in a pane).

wqweto commented 2 years ago

How would or could Extension Methods work in a COM/ActiveX world? Would they not violate the COM interface contract?

This would most probably be solved the way generics are done in TB so it's doable.

Don't see how extension methods would end up in a public interface (in a type library), so keeping these within project scope allows all the magic to happen seamlessly, because extension methods are just syntactic sugar over their parameters, i.e. o.MyMethod(p1, p2) vs MyMethod(o, p1, p2), and nothing more (extension methods have no access to class internals/private variables, for instance).

What I would very much like to see is extension methods over UDTs, i.e. MyData.Length() for instance. That would bring UDTs into the 21st century IMO, akin to so many of the modern languages which define a UDT and then define a bunch of regular functions which deal with it and can be called with dot notation.
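To make the "just syntactic sugar" point concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: Range is a hypothetical stand-in for a closed type such as Excel.Range, and to_csv is a hypothetical free function. Python attaches the function to the class at run time, which is not how a compiler would bind an extension method, but the equivalence between the two call forms is the same.

```python
import csv
import io


class Range:
    """Hypothetical stand-in for a closed type we do not own."""

    def __init__(self, rows):
        self._rows = rows  # internal state, private by convention

    def rows(self):
        # The only public API the free function below relies on.
        return list(self._rows)


def to_csv(source, out):
    """Free function; 'source' plays the role of the extended instance."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(source.rows())  # public API only, no private access


# Making the dot-call form available (run-time attachment in Python;
# a compiler would rewrite the call instead of touching the class).
Range.to_csv = to_csv

r = Range([[1, 2], [3, 4]])

plain_call = io.StringIO()
to_csv(r, plain_call)      # MyMethod(o, p1) form

dot_call = io.StringIO()
r.to_csv(dot_call)         # o.MyMethod(p1) form
```

Both calls run the same function with the same arguments, so they write identical output; only the caller's spelling differs.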

bclothier commented 2 years ago

There would be nothing to stop the intellisense in twinBasic being constructed such that hovering over a variable name pulled up all methods that could be used with that type as the first argument (or listed such methods in a pane).

But there are problems with that approach:

1) Hovering requires me to leave the keyboard and switch to the mouse. That's a huge time vampire right there.
2) That is quite a disruption to the typing, since I now have to type the variable first, find the function, move the cursor back to before the variable, then add the function. That's a lot of wasted time.
3) I personally find it much easier to read from left to right than doing it the mixed way that non-extension methods would require.

MyBag.Sort().ToArray()

reads more naturally than:

CArray(Sort(MyBag))

or

CArray(MyBag.Sort)

In the latter two examples, I have to mentally reshuffle the written order into the order in which the steps actually run; in the first example, no mental reshuffling is needed, because the written order matches the order of execution.

How would or could Extension Methods work in a COM/ActiveX world? Would they not violate the COM interface contract?

This would most probably be solved the way generics are done in TB so it's doable.

IINM, generics simply aren't exposed to COM. I would expect the same for extension methods; the non-extension version of the same method can be exposed to COM (e.g. CArray(Foo) instead of Foo.CArray()) for VBx consumers.

mburns08109 commented 2 years ago

So then how would one be able to usefully implement a useful/global collection.Exists(x) (or, as discussed, iEnumerable.Exists() ) as an extension method when:

extension methods have no access to class internals/private variables, for instance

and

Don't see how extension methods would end up in a public interface (in a type library)

...?

mansellan commented 2 years ago

It could look something like this:

Public Shared Function Contains(enumerable Extends IEnumVARIANT, item As Variant) As Boolean

  ' Linear scan using only the public enumeration API
  For Each entry In enumerable
    If item = entry Then
      Return True
    End If
  Next

  Return False
End Function

And then to call it, one of two syntaxes:

  If myEnumerable.Contains(6) Then
     ' This form would only be available internally to tB
  End If

or

  If Contains(myEnumerable, 6) Then
    ' This form can be COM-visible
  End If

The above introduces two new constructs - Shared for "static" methods, and a new use for Extends, which is an alternative to the attribute that VB.Net used. Either way, the effect is that the first parameter into the Sub/Function is the "thing to be extended". It's not possible to access instance state here, so you can only interact with it through its public API (in the example above, by For Eaching through it).

The new syntax is just hypothetical, it would have to be debated and agreed.

PS: IME, it gets even more interesting when you consider extensions to closed generic types. But that's a question for another time (and depends how awesome Wayne has been with his type system).

mansellan commented 2 years ago

Actually, might not even need the Shared thing, could just insist that extension methods can only be defined in standard modules (less flexible, but that's how VB.Net did it).

mansellan commented 2 years ago

I'm guessing here based on my limited knowledge of compilers, but I'm assuming that the "no access to instance state" restriction is fairly insurmountable at the compiler level. Full instance-level extension of types that you don't "own" sounds pretty hairy (how would you even have a syntax for accessing another DLL's private fields?), but extension methods are basically just syntax sugar as seen above (just an alternative caller syntax), and "should" be pretty easy to resolve:

Dispatch the first one that is true!

Aside: .Net decided to stop at methods only. No properties, no anything else. Extension properties would have avoided much unpleasantness. Imagine a class that has FirstName and LastName properties, but the extension method for FullName has to be a Function for some bizarre reason.

mburns08109 commented 2 years ago

Geesh - and here I'd hoped that the Dictionary.Exists was a tad more efficient than a Foreach loop thru the possible items (and that we'd be able to echo that "better efficiency"). But if that's a limit on the use of extension methods, then I may go back and renew my thinking on this.

Hmm... perhaps we add a new iDetectable interface, and implement it on the underlying collection object in tB?

interface iDetectable
  Function Exists(key) as boolean
  Function ExistsCount(Key) as long
  Function ExistsSelect(Key, InstanceNumber) as variant '<--- throws error if InstanceNumber > ExistsCount for Key
  ' ...etc.
End Interface

...something to think about as it wouldn't "pollute" the iEnumerable interface that way. or... can an interface implement another interface? if so then iEnumerable implements iDetectable? (mulls further...)

mansellan commented 2 years ago

Noo.....

The beauty of extension methods is that they tie into the basic method resolution system. That's kinda what I was alluding to with my PS... You could also have an extension to IDictionary(Of TKey, TValue) that took full advantage of its quick-lookup methods. The IEnumerable extension would then just be a fallback in case it wasn't a dictionary. LINQ has a whole hierarchy of overloads to select the most efficient solution.

bclothier commented 2 years ago

Just to emphasize this: what I see as the biggest advantage of the extension method syntax is that it avoids the name collision that the COM-compatible form would most likely run into:

  If Contains(myEnumerable, 6) Then
    ' This form can be COM-visible
  End If

Which leads to either MyEnumerableContains(myEnumerable, 6) or MyEnumerableHelper.Contains(myEnumerable, 6). Both forms are quite verbose and can actually hurt readability, because you have to start mentally skimming to find the relevant keyword (Contains in this case) to understand what the code is doing. The third alternative is to use prefixes, e.g. myeContains(myEnumerable, 6), which now demands that the reader know what the prefix mye stands for.

The first form, where Contains() appears after the data type, means that you no longer need to disambiguate between MyEnumerable.Contains() and OtherGuysEnumerable.Contains(); it's simply Contains() and you don't have to worry about naming collisions anymore. Now we have to do much less thinking about naming and can get about the business of solving the business problem. (see what I did there...)

Geesh - and here I'd hoped that the Dictionary.Exists was a tad more efficient than a Foreach loop thru the possible items (and that we'd be able to echo that "better efficiency"). But if that's a limit on the use of extension methods, then I may go back and renew my thinking on this.

If you are concerned about efficiency, you are in the wrong place. Extensions help us abstract from the implementation and work with closed types that we don't own. Efficiency / performance is an implementation detail and in this case, you'd be writing your own enumerable class or using a class that already implements the native support for doing a fast existence check.

As @mansellan said, you'd probably want to work with IDictionary rather than more generic IEnumerable if you needed the fast existence check. But that's a different solution than what is being discussed here.

mansellan commented 2 years ago

@mburns08109 An enumerable interface (of whichever name) really only guarantees that you can move forwards through a sequence. Given that restriction, you can't implement Contains quicker than O(n). It's just a (crappy) fallback.

But you can select a better method. If your type already has a Contains method, you can just delegate to that, on the assumption that it's already optimised. And that's what extensions, coupled with a rock-solid overload resolution strategy, can do.

Greedquest commented 2 years ago

@mansellan @mburns08109

For example, Exists could have several overloads based on the input type: if the container object implements ISorted (a hypothetical trait that guarantees the list is sorted) then do a binary search for key. If it implements IHashSet then do key.Hash() and look it up directly. Or if it implements IContains then call its contains method. Otherwise fall back on IEnumerable iteration.
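The tiered selection described above could be sketched roughly as follows. This is Python rather than twinBASIC, and it checks types at run time instead of resolving overloads at compile time, so treat it as an illustration of the fallback order only. SortedList is a hypothetical marker class standing in for ISorted, the built-in set and dict types stand in for IHashSet, and collections.abc.Container stands in for IContains.

```python
from bisect import bisect_left
from collections.abc import Container, Iterable


class SortedList(list):
    """Hypothetical marker type: the caller promises the items are sorted."""


def resolve(source):
    """Pick the most specific strategy available for this source."""
    if isinstance(source, SortedList):
        return "binary_search"
    if isinstance(source, (set, frozenset, dict)):
        return "hash_lookup"
    if isinstance(source, Container):
        return "own_contains"   # the type already provides membership
    if isinstance(source, Iterable):
        return "linear_scan"    # last-resort O(n) fallback
    raise TypeError("source is not enumerable")


def exists(source, key):
    strategy = resolve(source)
    if strategy == "binary_search":
        i = bisect_left(source, key)
        return i < len(source) and source[i] == key
    if strategy in ("hash_lookup", "own_contains"):
        return key in source    # delegate to the type's own lookup
    return any(item == key for item in source)


found_sorted = exists(SortedList([1, 3, 5, 7]), 5)
found_hashed = exists({"a", "b"}, "b")
found_scanned = exists((n * n for n in range(5)), 9)
```

The order of the checks matters: a SortedList is also a Container and an Iterable, so the most specific test has to come first, which is the same "prefer the most derived type" rule the original request asks for.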