Raku / problem-solving

🦋 Problem Solving, a repo for handling problems that require review, deliberation and possibly debate
Artistic License 2.0

How to provide coercers for new types in higher language versions #137

Closed lizmat closed 4 years ago

lizmat commented 4 years ago

At the moment I do not see an easy way to provide a coercer for a new type in higher language versions. Case in point: adding a Dict / Tuple type to the core of 6.e: https://github.com/perl6/problem-solving/issues/135 .

Adding Dict and its associated methods turned out to be trivial (after some previous work on Map by @vrurg). The hard part, adding a .Dict coercer to Any, turned out to be difficult. The obvious:

{
     use MONKEY-TYPING;
     augment class Any {
         multi method Dict(Any:) { Dict.new(self) }
     }
}

appears to fix the problem, but does not, because it only augments the Any class itself. Important classes such as Hash and Map have already been composed, so they do not know of the existence of the .Dict method and will tell you so when you try to call it.

lizmat commented 4 years ago

This is of course a specific case of a more general problem, namely that augmenting a class does not inform its subclasses that an augment has been done. That is a known issue that would require a massive re-think, because currently classes know which class they inherit from, but do not know which classes inherit from them. This is to prevent circularity issues.

Fixing that problem would also solve this issue, but that may be a long way away.

lizmat commented 4 years ago

Another solution, more reachable in the short term, is to create an API for coercion of objects, similar to the way smartmatch is implemented using .ACCEPTS.

Previous ideas about this involved allowing you to create a .COERCE method in your class. Whenever a lookup of a method like this:

%hash.Dict

would fail, the dispatcher would take the name of the method, check whether a type with that name is known, and if so, check whether that class provides a .COERCE method. If that is the case, it would do a:

Dict.COERCE(%hash)   # class.COERCE(invocant)

and install the selected method in the invocant's class:

Hash.Dict   # invocant.WHAT method.^name

This way subsequent calls with that coercer would not need to take the long path.
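
The lookup-then-install idea above can be approximated in user code today with a FALLBACK method. This is only a sketch under stated assumptions: Source and Target are hypothetical classes, COERCE is called explicitly rather than by the dispatcher, and the method is installed with the MOP calls ^add_method and ^compose rather than through any core mechanism.

```raku
# Hypothetical target type that knows how to build itself from a source object.
class Target {
    has $.payload;
    method COERCE(Mu $source) { self.new(payload => $source.value) }
}

# Hypothetical source type that knows nothing about Target.
class Source {
    method value { 42 }

    # Raku calls FALLBACK only when normal method lookup fails.
    method FALLBACK(Str $name, |args) {
        # Treat the missing method name as a type name and look it up.
        my \type = ::($name);
        die "No such method '$name'"
            if type ~~ Failure || !type.^can('COERCE');

        # Install a real method so later calls skip this slow path.
        my \owner = self.WHAT;
        owner.^add_method($name, method () { type.COERCE(self) });
        owner.^compose;

        # Serve the current call directly.
        type.COERCE(self)
    }
}

my $s = Source.new;
say $s.Target.payload;          # first call goes through FALLBACK
say Source.^can('Target').so;   # reports whether the coercer is now installed
```

Only the first call to .Target pays for the name lookup; whether re-composing at run time is acceptable for core classes is a separate question.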

lizmat commented 4 years ago

Another, possibly even shorter-term solution would be to somehow re-compose all classes at the end of each setting compilation. This could be done manually by keeping a list of types and looping over them in a BEGIN block in the respective core_epilogue classes. This would make creating the setting a (little?) slower, but it would be a quick solution to this problem.

lizmat commented 4 years ago

It turns out that re-composing the affected classes breaks parameterization of classes, e.g.:

raku -e 'use v6.e.PREVIEW; my %h{Any} = 1,2,3,4;'
===SORRY!===
Cannot invoke this object (REPR: Null; VMNull)

vrurg commented 4 years ago

This is another view of what I was worried about in #104.

In the particular case of Dict (and, possibly, Tuple), maybe it would be better to move it into CORE.c? According to the guidelines, adding a new class doesn't break backward compatibility and results in a new, non-conflicting method on Any. Thus, inclusion into CORE.c should be OK and perhaps even recommended.

With regard to a generic coercion mechanism, I have a feeling that a discussion on custom coercions did take place somewhere, but I can't remember now where exactly. Perhaps we could find a few interesting ideas there.

Anyway, of all the possibilities mentioned here, the COERCE method appeals to me the most. It does the best job of encapsulation, as only the class itself knows best how it can be created from another class, if it can at all. Thus, if I got the idea correctly, in the following case:

sub foo(Dict() $d) { ... }
foo(%h);

What would eventually happen is foo(Dict.COERCE(%h)), right? Or, considering how things are currently done, the full code would rather be like:

foo(
    %h.^can("Dict") ?? %h.Dict !! Dict.COERCE(%h)
);

lizmat commented 4 years ago

Perhaps. Maybe COERCE should return a Callable to be installed in the method cache of the invocant, and then called.

vrurg commented 4 years ago

Why not just dispatch to .COERCE? I don't see what we'd gain from this approach. In either case we end up invoking a method at run time. But returning a Callable might complicate cases where multi-dispatch of the coercion method is needed.

vrurg commented 4 years ago

BTW, here is another argument in favor of .COERCE doing immediate coercion. Say we have two classes and at least one of them is long-named: A::Foo and B::Bar. How do we normally write a coercer from B::Bar into A::Foo? Well...

So, A::Foo.COERCE($b-bar-instance) looks totally correct. Except that, perhaps, the method name would be better changed to FROM. Then it would look totally fine:

A::Foo.FROM: $b-bar-instance

Thus, we'd have a solution for cases where an already existing class needs to know nothing about newer ones. But this won't prevent us from having a standardized coercion path.

lizmat commented 4 years ago

Because the .COERCE will only be called after a normal dispatch of invocant.Dict fails? And you don't want to do that all of the time.

alabamenhu commented 4 years ago

> With regard to a generic coercion mechanism, I have a feeling that a discussion on custom coercions did take place somewhere, but I can't remember now where exactly. Perhaps we could find a few interesting ideas there.

I submitted the problem-solving issue a while ago. Originally, it seems, the idea was that if you had classes A and B (in different files, so stubbing was impossible), you would handle coercion in both directions via the one that was defined later, e.g.:

class A { … }

class B { 
  method A { … }               # coercion from B to A
  multi method new(A $a) { … } # coercion from A to B
}

That was never implemented, and honestly at this point new doesn't seem as nice. My vote was for calling the method FROM instead of COERCE, but regardless of whether it's the best way to do things for Dict/Tuple, it's definitely useful in many other circumstances [1]. I was hoping to get around to implementing it myself, but I haven't had the time yet. I probably won't be able to until February =\


  1. In particular, being able to say something like method foo(LanguageTag() $tag) and call it with a plain string. Right now, the best approach is using multimethods or some temp variables for coercion, which adds to code clutter.
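
For illustration, the multi-candidate workaround mentioned in the note might look like the sketch below. The LanguageTag class here is a simplified, hypothetical stand-in (not the real module), FROM is just the method name proposed in this thread, and describe is an invented sub name.

```raku
# Simplified, hypothetical stand-in for a real LanguageTag class.
class LanguageTag {
    has Str $.language;
    has Str $.region;

    # Hypothetical target-side constructor from a string such as "en-US".
    method FROM(Str:D $s) {
        my ($language, $region) = $s.split('-', 2);
        self.new(:$language, region => $region // '');
    }
}

# Workaround: one candidate per accepted input type.
multi sub describe(LanguageTag:D $tag) {
    "language={$tag.language} region={$tag.region}"
}
# The extra candidate exists only to convert and redispatch.
multi sub describe(Str:D $s) {
    describe(LanguageTag.FROM($s))
}

say describe('en-US');
```

Every routine that wants to accept a plain string needs its own extra candidate like the second describe, which is the clutter a LanguageTag() parameter would remove.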

vrurg commented 4 years ago

> Because the .COERCE will only be called after a normal dispatch of invocant.Dict fails? And you don't want to do that all of the time.

I would suggest that FROM (let me name it this way) should have been the way to do things from the start. Frankly, I was always confused about Any knowing of nearly every one of its descendants. It just doesn't feel right.

With regard to the order of things, even if the FROM mechanism were accepted as the standard for coercion, it is rather unlikely that we are going to get rid of the .Type method any time soon after that. But even in this case, .FROM would be called not when .Type fails, but when there is no method .Type on the invocant. And the best part is that in many cases this information would be available at compile time, allowing for optimizations. So cases like:

method foo(Type1(Type2) $bar) { ... }

are optimizable right away by Actions. A bit more tricky situation like:

sub foo(Type() $bar) { ... }
...
my Int $i;
foo($i);

should still be possible to optimize, but perhaps with spesh plugins.

In any case, nqp::can() is there one way or another, whether we have a .Type method on Any (or any other class) or not.

So, if nothing substantial escapes my attention, .FROM need not result in performance degradation, but would result in more concise code, more akin to what we have for smartmatching.

niner commented 4 years ago

From an OO design standpoint, .FROM has one major drawback: while every class has a published interface for creating objects, i.e. constructors, few classes have interfaces that expose their full state. Indeed, doing so would violate the "tell, don't ask" design principle.

The result is that it's usually trivial to write a coercer in the source class. You just use your internal state to feed a constructor. The same is not true for the other direction, as you'd have to ask the other object for its state, and if it's well behaved, it won't tell you. I wouldn't be surprised if this is the precise reason why we have coercer methods on source objects.

The other thing we should keep in mind is that before we design this around a fallback on method dispatch, we should demonstrate that we can actually do so without tanking performance. There's almost nothing as critical to our performance as method calls. We'd also need to answer the question of how this would interact with existing FALLBACK methods.
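
A small sketch of that asymmetry, using two hypothetical classes invented for this example: the source class can feed the target's constructor from its private state, while a target-side method would have no accessor to ask.

```raku
# Hypothetical target type with an ordinary public constructor.
class Celsius {
    has $.degrees;
}

# Hypothetical source type whose state is private (no accessor generated).
class Fahrenheit {
    has $!degrees;
    submethod BUILD(:$!degrees) { }

    # Source-side coercer: uses internal state to feed the target's constructor.
    method Celsius {
        Celsius.new(degrees => ($!degrees - 32) * 5 / 9)
    }
}

my $boiling = Fahrenheit.new(degrees => 212);
say $boiling.Celsius.degrees;

# A target-side FROM on Celsius would have to ask for this state,
# but Fahrenheit exposes no 'degrees' method to call.
say Fahrenheit.^can('degrees').elems;
```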

vrurg commented 4 years ago

It does look like there is no good solution for this case. I was considering the problem of full object state availability and considered it less of an issue, because it doesn't break the rules of object encapsulation, whereas the current situation requires the source class to know about every other class it might be coerced into.

It looks to me like the code snippet from my previous comment is the way to combine both approaches. In more generalized pseudo-code it would be:

Source.^can("Dest") ?? Source.Dest !! Dest.FROM(Source)

While I agree there could be a performance hit in it, it could be reduced to insignificant levels by compile-time and, probably, spesh optimizations.
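
As a runnable illustration of the pseudo-code above, here is a minimal sketch done at run time in user code. The classes Dest, Aware and Unaware and the helper sub coerce are all hypothetical names invented for this example, and FROM is the method name proposed in this thread, not an existing core API.

```raku
# Hypothetical target type with a target-side FROM hook.
class Dest {
    has $.origin;
    method FROM(Mu $source) {
        self.new(origin => "FROM({$source.^name})")
    }
}

# Knows about Dest and ships its own source-side coercer.
class Aware {
    method Dest { Dest.new(origin => 'Aware.Dest') }
}

# Knows nothing about Dest.
class Unaware { }

# Hypothetical helper: prefer the source-side coercer when one exists,
# otherwise fall back to the target's FROM.
sub coerce(Mu $source, Mu $dest) {
    my $name = $dest.^name;
    $source.^can($name)
        ?? $source."$name"()
        !! $dest.FROM($source)
}

say coerce(Aware.new,   Dest).origin;   # source-side path
say coerce(Unaware.new, Dest).origin;   # target-side path
```

The ^can check here happens on every call; the compile-time and spesh optimizations mentioned above would aim to resolve it once.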