twinbasic / lang-design

Language Design for twinBASIC

Non-Variant ParamArrays #15

Open mansellan opened 3 years ago

mansellan commented 3 years ago

Is your feature request related to a problem?

In VBx, a ParamArray must be declared as an array of Variant, e.g.:

Sub DoSomething(ParamArray SomeValues() As Variant)
  ' Some code
End Sub

The requirement for the array element type to be Variant appears to be unnecessary. Can this restriction be lifted?

Describe the solution you'd like

Ability to declare a ParamArray of non-Variant elements, e.g.:

Sub DoSomething(ParamArray SomeValues() As Long)
  ' Some code
End Sub

Describe alternatives you've considered

It's certainly possible to write code within the method body to check each Variant's subtype and reject any unexpected ones. But if an array of homogeneous types is expected, it would be nice to avoid the cost of using Variant.

Additional context

I'm not sure whether this can be achieved across the COM boundary (at least, not without runtime cost). That depends on how expressive COM is in this area, and I don't have sufficient understanding of it.

WaynePhillipsEA commented 3 years ago

We could certainly offer this internally in tB, but no, COM/OLE does not allow it. For example, you can't create a type library using a ParamArray type other than Variant.

From a technical point of view, it would be quite easy to implement.

mansellan commented 3 years ago

Thanks @WaynePhillipsEA, that makes perfect sense. COM predates this kind of thing.

Would it be possible to expose such a method to COM as accepting Variant, but have the compiler insert runtime checks to ensure that each Variant holds the expected subtype? If I understand correctly, that would have the same effect (and cost) as a human-written method that accepts an array of Variant and immediately checks each element in a loop to determine whether it is of subtype {whatever}.

Obviously low-priority, as it's just avoiding source bloat whilst saving nothing on runtime cost. But source elegance can attract fans, so maybe still worthwhile?
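
For reference, the hand-written equivalent described above might look roughly like the following. This is only a sketch in classic VB6/VBA syntax; the function name SumLongs and the choice to raise error 13 (type mismatch) are illustrative assumptions, not something proposed in this thread.

```vb
' Hypothetical hand-written check: accept Variants, reject anything that is not a Long.
Public Function SumLongs(ParamArray SomeValues() As Variant) As Long
    Dim i As Long
    Dim Total As Long

    ' With no arguments the ParamArray is empty and the loop body never runs.
    For i = LBound(SomeValues) To UBound(SomeValues)
        If VarType(SomeValues(i)) <> vbLong Then
            Err.Raise 13, , "Argument " & i & " is not a Long"
        End If
        Total = Total + SomeValues(i)
    Next i

    SumLongs = Total
End Function
```

Note that in VB6/VBA a call such as SumLongs(1, 2) would fail this strict check, because small integer literals are typed as Integer; the caller would need to write SumLongs(1&, 2&) or pass Long variables.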

bclothier commented 3 years ago

IMO, a group of like-typed inputs is probably best implemented as IEnumerable rather than as ParamArray. ParamArray has its uses when the contents could vary (e.g. string formatting), but an IEnumerable parameter is more readable and discoverable than a ParamArray.

Kr00l commented 3 years ago

If a non-Variant ParamArray can only be achieved for internal use, then you already have the discipline to pass only a certain type to it. In the worst case you can use Debug.Assert to ensure that the VarType of each array element is the expected one. The Variant "penalty" is negligible for parameters; it's certainly never a very high amount. Just my 2 cents.

Kr00l commented 3 years ago

An additional argument: it can confuse newcomers to be able to use something only internally but not externally.

WaynePhillipsEA commented 3 years ago

I've had an idea on this one. It can be done, and rather neatly too. More details to follow in a few days!

WaynePhillipsEA commented 3 years ago

So here are my thoughts on how we can do this. Firstly, let's look under the hood at how the two function signatures would look from a C perspective:

Public Sub DoSomething1(ByVal A As Long, ParamArray B() As Variant)
Public Sub DoSomething2(ByVal A As Long, ParamArray B() As Long)

HRESULT __stdcall DoSomething1(int A, tagSAFEARRAY** B);
HRESULT __stdcall DoSomething2(int A, tagSAFEARRAY** B);

As you can see, the two function signatures are identical in C, and thus can be considered compatible. This is because arrays are passed around as SafeArrays, which happen to know their own type, much like a Variant.

What this means for us is that twinBASIC can internally offer a ParamArray of Longs (or whatever types we like), and externally expose the type to COM as a regular ParamArray of Variants. The magic will happen inside the prologue of DoSomething2, where twinBASIC can insert a tiny runtime check to ensure the passed-in SafeArray is of the expected type (an array of Longs), and do a runtime conversion if it is not.

This would mean that for twinBASIC internal calls, we get an optimized code path when passing a ParamArray of Longs, and when COM callers pass the more usual ParamArray of Variants, there's a small performance hit at the start of the routine where it is converted automatically.
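
To make the conversion step concrete, here is a rough source-level picture of what such a prologue would have to do when it receives an array of Variants. It is a sketch only: the helper name ToLongArray is hypothetical, the use of CLng (which coerces rather than strictly rejects) is an assumption, and the real check would be emitted by the compiler against the SafeArray itself rather than written in user code.

```vb
' Hypothetical helper approximating the prologue's Variant-to-Long conversion.
' Assumes Source is an allocated array with at least one element.
Private Function ToLongArray(Source() As Variant) As Long()
    Dim Result() As Long
    Dim i As Long

    ReDim Result(LBound(Source) To UBound(Source))
    For i = LBound(Source) To UBound(Source)
        ' CLng raises a runtime error if the element cannot be converted to a Long.
        Result(i) = CLng(Source(i))
    Next i

    ToLongArray = Result
End Function
```

The element-by-element copy is the "small performance hit" mentioned above; callers that already pass an array of Longs would skip it entirely.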

The main caveat here is that arguments are passed to a regular ParamArray in a ByRef manner, so if the values of the ParamArray elements are modified inside the procedure, then the change is propagated to the caller's input variables when possible. With a ParamArray of Longs (or any non-Variant type), they will effectively be passed ByVal.
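
The ByRef behaviour of a regular ParamArray can be seen with a small sketch like the one below, written in classic VB6/VBA syntax. The procedure names Increment and Demo are made up for illustration, and the commented result reflects VB6/VBA semantics as I understand them.

```vb
' Hypothetical demo: writes to ParamArray elements reach the caller's variables.
Public Sub Increment(ParamArray Values() As Variant)
    Dim i As Long
    For i = LBound(Values) To UBound(Values)
        Values(i) = Values(i) + 1
    Next i
End Sub

Public Sub Demo()
    Dim X As Long
    X = 5
    Increment X
    Debug.Print X   ' expected to show 6, because X was passed by reference
End Sub
```

Under the proposal, the same procedure declared with ParamArray Values() As Long would leave X at 5, since the elements would effectively be copies.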

wqweto commented 3 years ago

I was wondering what the C/C++ signature of Public Sub DoSomething1(ByVal A As Long, B() As Variant) is. Is it not again HRESULT __stdcall DoSomething1(int A, tagSAFEARRAY** B)?

I mean, the C/C++ function does not know anything about varargs at the call site. This "feature" is communicated purely by the typelib (the vararg attribute), so that the client can collect all varargs in a local B() As Variant variable before calling DoSomething1(int A, tagSAFEARRAY** B), which knows nothing about the vararg source of the B array it gets.

Then what is the C/C++ signature of Public Sub DoSomething1(ByVal A As Long, B() As Long)? Does it differ from the Variant array?

WaynePhillipsEA commented 3 years ago

@wqweto Yes, they all have the same C/C++ signature: HRESULT __stdcall DoSomething(int A, tagSAFEARRAY** B)

The difference is purely in the contract that is made via the type library. By using the type information from the type library, the caller guarantees that what it passes into the B argument will actually be a SafeArray of Longs, Variants, etc., as that contract specifies, and so the DoSomething implementation doesn't need to check that the caller actually passed an array of the correct type.