Open BCSharp opened 4 years ago
Traditionally (IronPython 2), this was solved by using Unicode character U+00F8 trick in the parameter name, which made the name a valid identifier in C# but not in Python. This trick is no longer working with IronPython3 because Python 3 identifiers can contain Unicode letters too.
Actually IronPython 2 allows unicode identifiers as well so the trick just makes it harder to use.
I agree the whole thing is a confusing mess. It seems like not binding to the ParamArray/ParamDictionary parameters would be entirely reasonable. Here are a few questions that come to mind:
ParamDictionaryAttribute
is used outside IronPython.KeywordOnly
somehow makes its way back to C# and is used as a dynamic
? I guess that's also a question if I define a method like def test(*, a=1): pass
. Maybe the whole thing is already broken, but it would be unfortunate if a method couldn't be invoked when using dynamic
.
- Could a language opt-in to such a behaviour or would it be stuck with whatever we choose?
I suppose we could add a (few) new protected virtual method(s) to OverloadResolver
to allow the language to override the default opt-out, but it still does not resolve the issue that currently the DLR does not handle such opt-ins well. I have just found another place in the DLR that assumes no binding to args/kwargs.
- Would PositionalOnly/KeywordOnly need to be attributes of the DLR or could the implementation be done at the language level (in IronPython)?
I was thinking of adding it at the DLR level because handling of keyword arguments is done primarily by DefaultOverloadResolver
but also partly in OverloadResolver
. With the extra protected virtual method approach those attributes can be introduced at the IronPython level, though call failure reason propagation up to OverloadResolver
may be tricky (I assume that the existing protected virtuals are part of the unchangeable official API, but that API is somewhat constrained with respect to handling comprehensive error reporting). So some support at the DLR will be needed; I doubt whether support for PositionalOnly/KeywordOnly could be done without any whatsoever change to the DLR.
- Do keyword-only attributes exist in languages other than Python? If not could it be implemented at the language level or would the DLR need to be adapted to allow this? Makes me wonder if the
ParamDictionaryAttribute
is used outside IronPython.
- What happens if a method using
KeywordOnly
somehow makes its way back to C# and is used as adynamic
? I guess that's also a question if I define a method likedef test(*, a=1): pass
. Maybe the whole thing is already broken, but it would be unfortunate if a method couldn't be invoked when usingdynamic
.
Hm, I don't know how to provide keyword arguments to a dynamic method invocation in C#. Perhaps a way to invoke could be though ObjectOperations.Invoke
? The existing Invoke
overloads do not take keyword arguments either, but it could be added?
Since a builtin method after making its way back to C# will be a BuiltinFunction
, maybe another way it can be invoked is though one of the Call
overloads, made public for this purpose? Those guys do take kwargs.
Alright, I think we could go ahead with excluding binding of ParamArray/ParamDictionary parameters. I would probably make the PositionalOnly/KeywordOnly
attributes a separate proposal (I'd hate to add public attributes and then later realize it wasn't what we really wanted).
Hm, I don't know how to provide keyword arguments to a dynamic method invocation in C#.
I was thinking of something along the lines of:
var engine = Python.CreateEngine();
var scope = engine.CreateScope();
engine.Execute("x = sorted", scope);
dynamic sorted = scope.GetVariable("x");
var iterable = new int[] { 1, 3, 2 };
Console.WriteLine(string.Join(", ", sorted(iterable, reverse:true)));
It looks like this currently works (which is nice) so we'd have to make sure not to break it.
Perhaps a way to invoke could be though ObjectOperations.Invoke?
Yes, it seems like some overloads in ObjectOperations would be useful:
public dynamic Invoke(object obj, object[] args, IDictionary<string, object> kwargs);
// InvokeMember
// CreateInstance
Seems to be one of the recommended way to call things (https://stackoverflow.com/questions/8970458/ironpython-call-method-by-name). I don't see how you can invoke the reversed sorted example via ObjectOperations
right now so even if we don't go ahead with the proposal, some overloads may be worth adding.
Since a builtin method after making its way back to C# will be a BuiltinFunction, maybe another way it can be invoked is though one of the Call overloads, made public for this purpose?
Would be best to have something more agnostic and not rely on the fact that it's a BuiltinFunction
. Invoking a user-function with keyword-only arguments should work the same way.
it seems like some overloads in ObjectOperations would be useful
This is my preferred approach too (over IronPython specific), but I am having second thoughts about simply overloading Invoke
(and InvokeMember
, CreateInstance
). While the signature you suggest is exactly what I had in mind as the most convenient, it would create a cornercase incompatibility with the current behaviour. Since currently parameters
is a param array, providing two parameters of type object[]
and IDictionary<string, object>
is now putting them in a 2-item parameters
array representing two positional arguments, but with the overloads, they would be effectively splatted.
To retain existing behaviour, existing user code would have to change from
engine.Operations.Invoke(f, arr, dict)
to something like
engine.Operations.Invoke(f, new object[] { arr, dict})
Are we okay with that?
Otherwise, maybe a different method name, SplatInvoke
etc.?
I would probably make the
PositionalOnly/KeywordOnly
attributes a separate proposal
Fair enough.
Are we okay with that?
Although it seems unlikely to occur it could indeed break people. I'd be fine with a separate name although I can't think of a good one. I think prefixing with the existing method name would be better for discoverability. I'm not sure I like Splat
(I know it comes up in a few places, but mostly internal). These seem kind of verbose, but could be an option: InvokeWithKeywords
or InvokeWithDictionary
.
There is a certain inconsistency in handling parameters marked with
ParamArrayAttribute
orParamDictionaryAttribute
. It is reflected in both comments in the DLR code, as well as code itself. Since this is a fundamental concept in the overload resolution, I want to discuss it explicitly and agree on a principle, before changing the DLR behaviour in one way or another. It would be very useful to know which behaviour is desirable and which should be considered a bug. Once this principle is agreed on and implemented, it may be difficult to reverse on it in the future, since many parts of the overload resolution code depend on it.Let's look at the current situation first. For instance,
ParameterWrapper.cs
reads:On the other hand, a comment on
ParamDictionaryAttribute
reads:This suggests that it should be possible to pass such param dictionary and/or param array somehow. Since passing by keyword should be forbidden, this leaves that passing by position should be supported. But this leads to logical inconsistencies, which I describe a little bit later below.
On the actual code side, some parts of the code assume that passing param dictionary/array by keyword is allowed, and some assume that it is forbidden. This leads to erroneous behaviour, ranging from unexpected results, through confusing error messages, to exceptions thrown. Example (from IronPython):
Method
M560
has signaturepublic void M560(int x, [ParamDictionary] IDictionary<string, int> y, params int[] z)
and the call should succeed. Instead, andIndexOutOfRangeException
is thrown:Interestingly, the exception is not thrown during overload resolution, but during the actual call itself, because the call site has been constructed incorrectly: it contains three keywords for keyword arguments (
z
,b
,c
) but only two values (5, 6). The value for parameterz
is taken over by an actual param array to accommodate additional positional arguments (2 and 3).Regarding passing by position, the current DLR code does accept param array passed by position:
The last two lines show inconsistency in handling matching type values. I personally think it is unfortunate for a dynamic language, but it matches C#.
Param dictionary, on the other hand, is not accepted when passed as a positional argument:
Inconsistencies by allowing keyword binding
Regardless the error, method
M560
raises a question what the desired behaviour should have been. For instance, one may say thatz=1
should not be interpreted as providing an argument forparams int[] z
because1
is notint[]
, and instead it should be put in dictionaryy
. But what if the argument (which could be just a variable) were of typeint[]
? Then we would have a strange behaviour that depending on the type of the argument, a named argument is being processed quite differently. This may be OK in statically-typed languages, but in dynamic languages the type of a variable may not be obvious or may be undetermined at the call site. Finally, such resolution would significantly complicate the current DLR algorithm, which does arity based resolution first, then type based (and determining which positional and keyword arguments go where is part of method arity determination).Another option could be to always assume that
z=some_value
binds to the param array and report an error in the case above. Such error would be confusing at best (something likeexpected int[]
gotsomething
), but also eliminate a possibility for passing arbitrary keyword arguments to be collected in the param dictionary. Traditionally (IronPython 2), this was solved by using Unicode character U+00F8 trick in the parameter name, which made the name a valid identifier in C# but not in Python. This trick is no longer working with IronPython3 because Python 3 identifiers can contain Unicode letters too. A better solution would be to use a dedicated parameter attribute (sayPositionalOnlyAttribute
) to explicitly prevent binding by name. This could work but it would mean that some methods with param arrays behave differently than other, and it may be confusing to the user especially if there are several methods in the method group to choose from.Similar reasoning can be done for keyword binding of param dictionary. In a such case, the option to always assume that
y=some_value
(wherey
is a param dictionary) always binds by name to the whole param dictionary is even more untenable, because an often assumption is that a param dictionary as a sole parameter should be able to catch any keyword arguments.So my conclusion is that the comment in the DLR that params arrays & dictionaries don't allow assignment by keyword is the best way to resolve the ambiguities and the code should be updated accordingly.
Inconsistencies by allowing positional binding
Let's assume the following signature:
Arguably, such method should be able to take an arbitrary number of positional and keyword arguments thus never throw a type error.
Further let's assume that dynamic variables
kw
andar
contain values typedIDictionary<object, object>
andobject[]
respectively, anda
a value of any other type. What should be the result of the following calls:Expected:
args[0] == a
,kwargs
empty, nothing strange yet.kwargs == kw
,args
empty? (sincekw
in sin position 1 ofkwargs
). Orargs[0] == kw
?kwargs == kw
,args[0] == a
? (sincekw
is in position 1 ofkwargs
). Orargs[0] == kw
,args[1] == a
?kwargs == kw
,args[0] == a
? By what rule? Orargs[0] == a
,args[1] == kw
?kwargs == kw
,args[0] == ar
,args[1] == a
? (sincekw
is in position 1 ofkwargs
, butar
is in position of 2 and not bound toargs
). Orargs[0] == kw
,args[1] == ar
,args[2] == a
?args == ar
? By what rule?ar
is in position 1 so there is a type conflict betweenar
andkwargs
. Or maybeargs[0] == ar
, like in the first example. At least that would be consistent. And what ifar
had such a value thatar[0] == kw
? Shouldkwargs == kw
(positional binding) orargs == ar
?I suppose this must be
args[0] == ar
andargs[1] == a
, but it is quite different handling ofar
than in the previous example.I suppose this must be
args[0] == a
andargs[1] == ar
. But a is in position of 1, so it does not bind positionally. Nor doesar
, although the type matches.kwargs == kw
, andargs == ar
, obviously, if positional binding is allowed. Otherwiseargs[0] == kw
,args[1] == ar
.args[0] == ar
,args[1] == kw
? This means that positional binding is suspended. Quite different outcome than above.Things get way more complicated with more normal or optional parameters and mixing position and keyword arguments.
I see the only simple and consistent way of handling it by prohibiting positional binding to param arrays and param dictionaries. In the examples above, the types of
kw
andar
were known, but in a dynamic case, a call site may get its arguments passed on from some unknown source, of an unknown type. It would be really confusing that it may change how the overload resolution is performed.Prohibiting positional binding means that param dictionaries and param arrays are always constructed implicitly from unassigned/superfluous keyword or positional arguments. If a language has a means (i.e. a special notation) for splatting (like Python), then there is an explicit way for the caller to provide args or kwargs in their array or dictionary form. If a language does not have such syntax, such calls are not possible. I see no reason why DLR should go out of its way to support such calls if the language itself does not.
I do realize that it is a radical call, not the least because the DLR for years already supported positional argument for param arrays (though not for param dictionaries). So a more compromising solution could be to keep it supported, but explicitly prohibit positional argument for param dictionaries. After all, while param arrays are being used by .NET, param dictionaries are specific to the DLR, so a dynamic language implementation is in full control when and if to use them. The price for this option is a more complicated overload resolution ruleset.
One more thing. Generalizing this principle of prohibiting positional binding to be available to other parameters would be very useful. Specifically I was thinking of providing a dedicated attribute (e.g.
KeywordOnlyAttribute
) to be applied in such cases. This would be a counterpart toPositionalOnlyAttribute
.For instance, a Python builtin function
sorted(iterable, /, *, key=None, reverse=False)
could be implemented by a method with the signature:I am not sure about the names of the attributes, though. Technically,
[PositionalOnly]
would actually mean[ProhibitKeyword]
. Also,[NonKeyword]
would be shorter. Similarly,[KeywordOnly]
is informative, but actually means[ProhibitPositional]
which is a mouthful; another option[NonPositional]
is not the shortest and less informative.[ProhibitPositional]
could also well fit a case when applied on a param array implemented by a dynamic language, if the DLR principle turns out to stay that param arrays by default are allowed to be bound by position.Putting both
[PositionalOnly, KeywordOnly]
on the same parameter sounds contradictory and should (?) be prohibited (IronPythonAnalyzer job), because it would made the parameter inaccessible to the language. In IronPython, there is already a way of making things inaccessible to the language:[PythonHidden]
that is applicable to a wider range of entities but not parameters, so I doubt whether hiding parameters in this fashion would be useful.