jll63 / openmethods.d

Open multi-methods for the D language. OPEN! Multi is cool. Open is great.

Is it possible to make the library work with templates? #8

Open ghost opened 7 years ago

ghost commented 7 years ago

Consider the following snippet, which attempts to recreate a better, extensible version of std.conv's to:

import openmethods;
mixin(registerMethods);

interface BaseWrapper { }

class Wrapper(T): BaseWrapper
{
    T value;
    this(T v) { value = v; }
}

T
to(T)(virtual!BaseWrapper);

@method
Wrapper!T
_to(T)(Wrapper!string w)
{
    import std.conv;
    return new Wrapper!T(std.conv.to!T(w.value));
}

void 
main()
{
    updateMethods;

    auto s = new Wrapper!string("12");
    assert(s.to!int == 12);
}

This doesn't compile because, it seems, the open methods mechanism doesn't kick in for to and the compiler doesn't see Wrapper!string as a possible stand-in for virtual!BaseWrapper. The error message is

function app.to!int.to (virtual!(BaseWrapper)) is not callable using argument types (Wrapper!string)

Is it possible to enable the library to work with templates? This would add a considerable layer of cool and useful to it.

ghost commented 7 years ago

After some experimentation I've figured out that part of the problem is that, in this case, the template is not instantiated at module scope, so registerMethods doesn't see it. If we do alias intTo = to!int at module scope and later use the alias in main, the program compiles (but it later dumps core with a segmentation fault, so there's more to the problem than just scope).
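A minimal sketch of the scope issue, independent of openmethods. The names convert, convertInt and moduleHas are made up for illustration, and the assumption is that a compile-time scan of the module works from its member names: the scan sees the template and the alias, but an instance created only inside a function body never gets a module-level name.

```d
import std.algorithm.searching : canFind;
import std.conv : to;

// Hypothetical function template standing in for the method template.
T convert(T)(string s)
{
    return s.to!T;
}

// Module-scope alias: gives the instance convert!int a name of its own.
alias convertInt = convert!int;

// True if a compile-time scan of this module would come across `name`.
bool moduleHas(string name)
{
    enum names = [__traits(allMembers, __traits(parent, convert))];
    return names.canFind(name);
}

void main()
{
    static assert(__traits(isTemplate, convert));      // only a template, no signature yet
    static assert(is(typeof(convertInt) == function)); // a concrete function
    assert(moduleHas("convert"));
    assert(moduleHas("convertInt"));
    assert(convertInt("12") == 12);
}
```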

jll63 commented 7 years ago

I am thinking this through. What you ask for is often asked for virtual functions too. The answer there is that it's impossible, because the virtual table could not be known until link time. But open methods face the same problem and get around it (that's updateMethods). So there is a glimmer of hope. The problem, at compile time, is whether the template instantiations are visible in ModuleInfo. You seem to have established that the info is there, and found a workaround. I also guess that the same problem happens with templatized overrides: the compiler probably doesn't see any reason to instantiate them. Maybe the alias trick would work there too. Do you get a message with the core dump? I suspect that you get a "call to ... is not implemented". I would like this to work (even if it requires some help from the user) but I am moving this weekend, so I am very busy these days...

ghost commented 7 years ago

No, I don't get a message with the core dump (and the stack trace as shown by journalctl is a __libc_start_main followed by a number of "n/a"s from somewhere in the main binary, even in a debug build). I've rerun the test with the library built in "xtc" mode; here's the output:

Running ./test 
registering MethodInfo("toInt", [app.BaseWrapper], [], null, null, null, 55D0C82FA584, 55D0C82FA530)
Seeding...
  BaseWrapper
Scooping...
Layering...
  BaseWrapper
Allocating slots...
  BaseWrapper...
    for toInt(BaseWrapper)#0: allocate slot 0
    also in
Initializing the global mtbl vector...
  gmtbl size: 1
  slots:
    7F4A78159190 00-01 toInt(BaseWrapper)
  mtbl:
Building dispatch table for toInt(BaseWrapper)
  make groups for param #0, class BaseWrapper
    assign specs
  assign slots
    dim 0
Initializing global dispatch table - 0 words
  toInt(BaseWrapper):
    1-method, storing fp in mtbl, slot = 0
toInt | virtual!(BaseWrapper):null 0 | virtual!(BaseWrapper):
Program exited with code -11

jll63 commented 7 years ago

I took a first look at this and it is not very promising. It seems that class template instantiations are not registered in ModuleInfo:

import std.stdio;

class Foo {}

class Bar(T) : Foo
{
  int i;
}

alias BarInt = Bar!int;

const barInt = new BarInt;

void main()
{
  foreach (mod; ModuleInfo) {
    foreach (c; mod.localClasses) {
      writeln(c);
    }
  }
}

Output:

modtemp.Foo
core.exception.RangeError
core.exception.AssertError
core.exception.FinalizeError
core.exception.HiddenFuncError
core.exception.OutOfMemoryError
core.exception.InvalidMemoryOperationError
core.exception.SwitchError
core.exception.UnicodeException
core.exception.SuppressTraceInfo
gc.impl.conservative.gc.ConservativeGC
rt.lifetime.ArrayAllocLengthLock
core.thread.ThreadException
core.thread.ThreadError
core.thread.Thread
core.thread.ThreadGroup
core.thread.Fiber
core.time.TimeException
gc.gcinterface.GC

I'll ask in the forum if someone has an idea...

jll63 commented 7 years ago

Example here: https://github.com/jll63/openmethods.d/blob/797f695417d6bdc1018f7bd96e54944f587eebd8/experimental/methodtemplates.d

ermo commented 3 years ago

@jll63 :

I had actually written out a long example, but noticed that you have a method-templates branch.

I'm quite smitten with the openmethods.d approach and would like to implement a matrix library based on it, but only if it conveniently supports templating in terms of matrix data types.

How far along are you? How can I help (if at all)?

jll63 commented 3 years ago

I'm quite smitten with the openmethods.d approach and would like to implement a matrix library based on it, but only if it conveniently supports templating in terms of matrix data types.

The Matrix examples scream "template" indeed. But supporting templates is very difficult for a library.

Extrapolating from the non-templatized example:

module matrix;

Matrix plus(virtual!Matrix, virtual!Matrix);

@method DenseMatrix _plus(DenseMatrix a, DenseMatrix b) { ... }

Ideally we want this to work:

module matrix;

mixin(registerMethods);

Matrix!T plus(T)(virtual!(Matrix!T), virtual!(Matrix!T));

@method DenseMatrix!T _plus(T)(DenseMatrix!T a, DenseMatrix!T b) { ... }

A compiler based approach would make this possible (not that it would be easy). For a library, it is much more difficult. There are three challenges.

Generating the dispatcher

Maybe you already know this, if you looked into how openmethods works. mixin(registerMethods) scans the module for method declarations, i.e. functions that have at least one virtual!T parameter. For each of them, it more or less creates two functions that look like this in the (non-templatized) Matrix example:

Matrix plus(Matrix a, Matrix b) { return openmethods.Method!(matrix.plus).resolve(a, b)(a, b); }
openmethods.Method!(matrix.plus) plus(openmethods.MethodTag, Matrix, Matrix);

(Take this as pseudocode. In reality, the functions are aliases, Method has arguments that deal with overloading, etc.)

The first function is used at runtime, it is the entry point in the method: it finds the correct specialization and calls it. The second function (which is used at compilation time; it does not have a body) is used to match method specializations with the right method declaration.

The machinery inside Method inspects the method declaration and extracts all the aspects of it (down to the attributes and storage classes of the arguments). It then uses the information to synthesize the two plus functions. Now, D makes it possible to find out everything there is to know about a function (although it's not a very easy task), and, going through some rather horrid tricks, to create new functions that are variations of the initial method declaration. Most prominently, note that the dispatcher function does not have the virtual! markers.
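The inspection step can be sketched with std.traits. This is not the library's code: Virtual, Unwrap and plusDispatcher are hypothetical stand-ins, and the toy dispatcher only shows that a signature without the markers can be computed from the declaration.

```d
import std.meta : staticMap;
import std.traits : Parameters, ReturnType;

class Matrix {}
class DenseMatrix : Matrix {}

// Hypothetical stand-in for the library's virtual! marker.
struct Virtual(T) { T object; }

// Strip the marker from one parameter type; leave other types alone.
template Unwrap(T)
{
    static if (is(T == Virtual!U, U))
        alias Unwrap = U;
    else
        alias Unwrap = T;
}

// A method declaration: no body, marked parameters.
Matrix plus(Virtual!Matrix, Virtual!Matrix);

// Parameter list of the dispatcher, computed from the declaration.
alias DispatcherParams = staticMap!(Unwrap, Parameters!plus);
static assert(is(DispatcherParams[0] == Matrix));
static assert(is(DispatcherParams[1] == Matrix));
static assert(is(ReturnType!plus == Matrix));

// Toy dispatcher with the synthesized signature. A real one would look up
// the specialization; this placeholder just returns its first argument.
ReturnType!plus plusDispatcher(DispatcherParams args)
{
    return args[0];
}

void main()
{
    Matrix m = new DenseMatrix;
    assert(plusDispatcher(m, m) is m);
}
```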

And here lies the first challenge for templatized methods: D does not make it possible to fully inspect a function template. Thus it is not possible - at least the last time I looked - for the library to generate this:

Matrix!T plus(Matrix!T a, Matrix!T b) { return openmethods.Method!(matrix.plus!T).resolve(a, b)(a, b); }
openmethods.Method!(matrix.plus!T) plus(openmethods.MethodTag, Matrix!T, Matrix!T);

The alternative is to require the user to write the two functions by hand, and the challenge becomes, what can the library do to make it as terse and as robust as possible.
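A small sketch of why the template case is harder, assuming a hypothetical Virtual marker in place of the library's virtual!. std.traits can describe a concrete instance of the method template, but not the template itself, so the T-dependent signature is out of reach until someone picks a T.

```d
import std.traits : Parameters, ReturnType;

class Matrix(T) {}

// Hypothetical stand-in for the library's virtual! marker.
struct Virtual(T) { T object; }

// Templatized method declaration.
Matrix!T plus(T)(Virtual!(Matrix!T), Virtual!(Matrix!T));

// The template as such has no parameter list or return type to inspect.
static assert(__traits(isTemplate, plus));
static assert(!__traits(compiles, Parameters!plus));
static assert(!__traits(compiles, ReturnType!plus));

// Once an instance is named, it is an ordinary function again.
alias plusInt = plus!int;
alias IntParams = Parameters!plusInt;
static assert(is(IntParams[0] == Virtual!(Matrix!int)));
static assert(is(ReturnType!plusInt == Matrix!int));

void main() {}
```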

Instantiating the specialization

When the compiler sees code like this:

@method DenseMatrix!T _plus(T)(DenseMatrix!T a, DenseMatrix!T b) { ... }

It has no reason to instantiate _plus for any type. Note, however, that there is no such problem with the plus dispatcher itself: it will be instantiated with the appropriate types as the compiler encounters calls to the plus function.

If open methods were part of the language, it would probably involve a link-time step that would do something like this:

It is probably possible to create a compile-link wrapper tool that would do this for openmethods, but it's a lot of work, and the tool would need to work on a diversity of platforms. Then what about dynamic linking?

A much simpler approach is to require the user to explicitly instantiate the specializations, probably using a mixin helper:

mixin template RegisterMatrixOps(T)
{
    // T comes from the mixin template, so each mixin yields concrete (non-template) functions.
    DenseMatrix!T plus(DenseMatrix!T a, DenseMatrix!T b) { ... }
    DenseMatrix!T times(DenseMatrix!T a, DenseMatrix!T b) { ... }
}

mixin RegisterMatrixOps!int;
mixin RegisterMatrixOps!double;
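To make the idea concrete without relying on the library's internals, here is a self-contained sketch. The specializations table and densePlus are hypothetical; the table merely stands in for whatever the library would record. The point is that each mixin forces one instantiation and registers it when the module starts up.

```d
class DenseMatrix(T)
{
    T[] data;
    this(T[] data) { this.data = data; }
}

// Hypothetical registry standing in for the library's method tables.
void*[string] specializations;

// Templatized specialization: nothing is generated until it is instantiated.
DenseMatrix!T densePlus(T)(DenseMatrix!T a, DenseMatrix!T b)
{
    auto sum = new T[a.data.length];
    sum[] = a.data[] + b.data[];
    return new DenseMatrix!T(sum);
}

mixin template RegisterMatrixOps(T)
{
    static this()
    {
        // Taking the address forces the compiler to instantiate densePlus!T.
        specializations["densePlus!" ~ T.stringof] = cast(void*) &densePlus!T;
    }
}

mixin RegisterMatrixOps!int;
mixin RegisterMatrixOps!double;

void main()
{
    assert("densePlus!int" in specializations);
    assert("densePlus!double" in specializations);
    assert("densePlus!float" !in specializations); // never requested

    auto a = new DenseMatrix!int([1, 2]);
    assert(densePlus(a, a).data == [2, 4]);
}
```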

Registering the classes

openmethods finds all the classes involved in method dispatch using runtime introspection. That does not work for templates. The solutions are similar to above: a compile-link wrapper, or manual registration:


template declareMatrixClasses(T)
{
  mixin registerClasses!(Matrix!T);
  mixin registerClasses!(DenseMatrix!T);
  mixin registerClasses!(DiagonalMatrix!T);
}

mixin declareMatrixClasses!int;
mixin declareMatrixClasses!double;
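The same pattern can be sketched for classes, again with hypothetical names (registeredClasses, DeclareMatrixClasses) rather than the library's own registerClasses. Each mixin records the ClassInfo of the instances for one element type, which gives a runtime table with the base-class links that dispatch needs.

```d
import std.algorithm.searching : canFind;

class Matrix(T) {}
class DenseMatrix(T) : Matrix!T {}
class DiagonalMatrix(T) : Matrix!T {}

// Hypothetical registry; the real library keeps its own tables.
ClassInfo[] registeredClasses;

mixin template DeclareMatrixClasses(T)
{
    static this()
    {
        registeredClasses ~= [typeid(Matrix!T), typeid(DenseMatrix!T),
                              typeid(DiagonalMatrix!T)];
    }
}

mixin DeclareMatrixClasses!int;
mixin DeclareMatrixClasses!double;

void main()
{
    Object m = new DenseMatrix!int;
    assert(registeredClasses.length == 6);
    assert(registeredClasses.canFind(typeid(m)));   // dynamic type is known
    assert(typeid(m).base is typeid(Matrix!int));   // and so is its base class
    assert(!registeredClasses.canFind(typeid(DenseMatrix!float)));
}
```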

So you see, it's challenging... On the other hand, the burden of manual registration would be on the author of the Matrix library. The users would not need to know any of this.

How far along are you? How can I help (if at all)?

For me, at the time being, D is a hobby I return to once in a while. Every time I do, D has evolved. Maybe at one point challenge #1 will become tractable. Probably it will involve better compile-time inspection for templates. A while ago I made a bit of noise about this in the forum.

As for the other two challenges, I think that most people could live with the manual instantiation approach.

Have you looked at the internals? Perhaps you can come up with ideas on challenge #1. I spent many days racking my brains.

ermo commented 3 years ago

Being fairly unfamiliar with the magic behind the scenes, the outcome I would naïvely like to see is to have an auto-generated per-class (= per file if we assume an idiom where each class is confined to its own file) set of machinery which can be called before specialisation use to declare and register the class specialisation(s) and the method specialisation(s) in one go.

So if the user needs both ushort and float matrices, they would only need to add (something like) the following at the call-site:

import Matrix;
import DenseMatrix;
import DiagonalMatrix;
import FooMatrix;
import BarMatrix;

import openmethods;
mixin(registerMethods);

auto classes = ["Matrix", "DenseMatrix", "DiagonalMatrix", "FooMatrix", "BarMatrix"];
foreach(cls; classes)
{
    declareSpecialisation!ushort(cls);
    declareSpecialisation!float(cls);
}

The goal here would be to be able to use both compile-time and run-time introspection to construct (or marshal / compose) e.g. "declare" ~ ClassName ~ "Specialisation" and "declare" ~ ClassName ~ "Methods" machinery (= similar to the template/mixin template that you outlined above) that both declare and register the relevant class specialisations as well as method specialisations in one pass; that would, I think, be an acceptable trade-off for specialisation users in terms of complexity of manual registration/declaration?

So to re-use your Matrix class example, for each type of Matrix class I would need to write/specify two templates which each have a pre-defined naming structure mandated by openmethods.d. Using the DenseMatrix class definition file as an example, I would simply declare the specialisations of the class and the methods as follows (think of it in pseudocode terms, not as a concrete implementation example):

template declareClassSpecialisation(T) {
    mixin registerClasses!(DenseMatrix!T);
}

template declareMethodSpecialisation(T)
{
    DenseMatrix!T plus(T)(DenseMatrix!T a, DenseMatrix!T b) { ... }
    DenseMatrix!T times(T)(DenseMatrix!T a, DenseMatrix!T b) { ... }
}

The openmethods.d machinery would then look in each imported file for those two template names and just do the right thing if they are present.

This would allow the declarations to stay local to each file that makes use of them, thus obeying both the principle of least surprise and the principle of locality?

Is this line of thinking at all fruitful?

jll63 commented 3 years ago

Ugh. I typed a lengthy reply to your comment, then, as I was cleaning it up, Vivaldi crashed while trying to load suggestions for a typo. I don't have the spirit to re-type everything from scratch right now... I think that the major point was my reply to:

So to re-use your Matrix class example, for each type of Matrix class I would need to write/specify two templates which each have a pre-defined naming structure mandated by openmethods.d. Using the DenseMatrix class definition file as an example, I would simply declare the specialisations of the class and the methods as follows (think of it in pseudocode terms, not as a concrete implementation example):

template declareClassSpecialisation(T) {
    mixin registerClasses!(DenseMatrix!T);
}

template declareMethodSpecialisation(T)
{
    DenseMatrix!T plus(T)(DenseMatrix!T a, DenseMatrix!T b) { ... }
    DenseMatrix!T times(T)(DenseMatrix!T a, DenseMatrix!T b) { ... }
}

The openmethods.d machinery would then look in each imported file for those two template names and just do the right thing if they are present.

But how would the machinery find out which Ts to instantiate the templates for?

In conclusion, I added that there were reasonable solutions to problem 2 (instantiating and registering the method specialisations), but problem 1 (nice syntax for templatized method declarations) was a tougher problem, given the state of D (last time I looked).

ermo commented 3 years ago

But how would the machinery find out which Ts to instantiate the templates for?

Am I misunderstanding that this information ought to be supplied by the user?

Are you suggesting that the library writer has to have the foresight to specify which instantiations the user might want to make (re. problem 1: nice syntax for templatized method declarations)? As in: To support a specific (and limited) set of instantiations only? Wouldn't that defeat the point of generics/template metaprogramming in the first place?

ermo commented 3 years ago

So, actionable suggestion:

Would it make sense to you if I suggested that you try to add templates to your matrix examples where the first goal is to simply get something working? Then the next step can be to work on how to better organise it and make it prettier/more elegant to use? Having a base to work from (even if it's ugly) is probably more fruitful than having theoretical discussions about it...

I have dabbled in adding templates locally, but it feels like I'm hitting a dead end; mostly it's a knowledge/skill gap I guess -- hence why I'm trying to gently poke you. ^^'

jll63 commented 3 years ago

But how would the machinery find out which Ts to instantiate the templates for?

Am I misunderstanding that this information ought to be supplied by the user?

Unless method templates are supported by the language or the toolchain, yes.

Are you suggesting that the library writer has to have the foresight to specify which instantiations the user might want to make (re. problem 1: nice syntax for templatized method declarations)? As in: To support a specific (and limited) set of instantiations only? Wouldn't that defeat the point of generics/template metaprogramming in the first place?

I think that in general, that is not desirable. In the specific case of matrices, one could argue that Matrix!double, and just that, will be what's needed in most cases. It could come pre-instantiated with the library. But what if the user only needs Matrix!int or Matrix!(Complex!double)? He would be saddled with code for Matrix!double that he would not use.

Things get even worse when you consider that, in all likelihood, a flexible matrix library would support mixed binary operators:

Matrix!(typeof(T1.init + T2.init)) plus(T1, T2)(const Matrix!T1 a, const Matrix!T2 b)

Now we have to deal with all the combinations of two types :-|
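A quick sketch of how fast this grows, using an arbitrary list of three element types (ElementTypes, Sum and countPairs are names made up for illustration). The result type follows the typeof(T1.init + T2.init) rule from the signature above, and every ordered pair is a separate instantiation.

```d
import std.meta : AliasSeq;

// Arbitrary choice of element types for the illustration.
alias ElementTypes = AliasSeq!(int, float, double);

// Element type of the result of a mixed addition.
alias Sum(T1, T2) = typeof(T1.init + T2.init);

static assert(is(Sum!(int, int) == int));
static assert(is(Sum!(int, double) == double));
static assert(is(Sum!(float, double) == double));

// Count the (T1, T2) pairs a mixed plus would have to be instantiated for.
size_t countPairs()
{
    size_t n;
    static foreach (T1; ElementTypes)
    {
        static foreach (T2; ElementTypes)
        {
            n++;
        }
    }
    return n;
}

void main()
{
    static assert(countPairs() == 9); // 3 element types, 3 * 3 pairs
    assert(countPairs() == ElementTypes.length * ElementTypes.length);
}
```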

So to sum up, when it comes to template instantiations, there is a spectrum of possible implementations, from least desirable to most:

  1. User and/or library writer instantiates.
  2. Library comes with tools that scan the entire program and generate the D code to instantiate.
  3. Method templates are integrated in the language, and the standard toolchain instantiates.

It is possible to work on this incrementally. First implement (1) with the best and simplest API for manual instantiation. Then proceed to (2); I have a couple of ideas on how to do this, again on a spectrum from easy to implement to easy to use. Then, for (3), hack dmd and try to convince the community to adopt methods in the language - good luck with that ;-)

jll63 commented 3 years ago

Would it make sense to you if I suggested that you try to add templates to your matrix examples where the first goal is to simply get something working?

That's this branch.

Then the next step can be to work on how to better organise it and make it prettier/more elegant to use? Having a base to work from (even if it's ugly) is probably more fruitful than having theoretical discussions about it...

Yes, that's the current bit to do. Alas, I am very pessimistic about the declaration syntax. Although I have thought of a couple of approaches.

I am working on this subject in the C++ version too. There, crazy C++ template syntax led me to toy with making some of the internals public: cleaning up the Method class and providing a clean API for adding functions.

I have dabbled in adding templates locally, but it feels like I'm hitting a dead end; mostly it's a knowledge/skill gap I guess -- hence why I'm trying to gently poke you. ^^'

Sure. I would love to see this work in D and in C++. I have not done any D programming for months; unfortunately, D is more of a hobby for me. Each time I want to do serious work I have to re-learn parts of the language - and re-discover some of the gaps and shortcomings.

Have you noticed the explain and xtc build modes? If you run with dub --build xtc ..., it will spit out a lot of information about what's going on inside.

jll63 commented 3 years ago

  2. Library comes with tools that scan the entire program and generate the D code to instantiate.

For example, updateMethods could list all the missing specializations, and a small tool could pick up the output and generate the calls to the library that the user would need to write (1). That would be very easy to implement, and it would also be useful as a debugging aid even if templates are not used.
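As a rough sketch of the "small tool" half of that idea: the function below takes a list of element types and emits the mixin lines a user would otherwise write by hand. Everything here is hypothetical, including the function name, the input format and the RegisterMatrixOps mixin name; how updateMethods would actually report missing specializations is not defined anywhere in this thread.

```d
import std.algorithm : map, sort, uniq;
import std.array : join;

// Hypothetical generator: one mixin line per distinct element type.
string generateInstantiations(string[] missingTypes,
                              string mixinName = "RegisterMatrixOps")
{
    return missingTypes.dup        // do not reorder the caller's array
        .sort()
        .uniq
        .map!(t => "mixin " ~ mixinName ~ "!(" ~ t ~ ");")
        .join("\n");
}

void main()
{
    auto code = generateInstantiations(["double", "int", "double"]);
    assert(code == "mixin RegisterMatrixOps!(double);\n"
                 ~ "mixin RegisterMatrixOps!(int);");
}
```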