josdejong / typed-function

Runtime type-checking for JavaScript functions

MIT License


Allow "recursion" calls to a specific signature more easily/naturally? #126

Closed gwhitney closed 2 years ago

gwhitney commented 2 years ago

A potentially "better" rendering of the example under "Recursion" in the readme would be:

const sqrt = typed({
  number: function (value) {
    return Math.sqrt(value);
  },
  string: function (value) {
    // on the following line we self reference the typed-function using "this"
    return this.signatures.number(parseInt(value, 10));
  }
});

console.log(sqrt('9')); // output: 3

I say this because this fairly mild recasting of the "recursion" call avoids the overhead of a second typing and dispatch step to reach the desired implementation. For a typed-function with many signatures, that cost can be non-trivial, and the fact is that most of the time in one of these recursions you know exactly which other signature you intend the recursive call to go to. (Also note that in some cases in mathjs, for example, it can take up to at least two rounds of recursing before you reach the "primary" implementation, so the overhead can pile up.)
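To make the overhead concrete, here is a minimal sketch with a toy dispatcher. The `toyTyped` helper and its `typeof`-based matching are assumptions made for illustration, not typed-function's actual implementation. It counts how many type tests run for a full redispatch versus a direct call to a known signature.

```javascript
// Toy dispatcher, for illustration only: single-argument signatures,
// matched by `typeof`. typed-function's real matching is more elaborate.
let typeTests = 0;

function toyTyped(signatures) {
  const names = Object.keys(signatures);
  function dispatch(arg) {
    for (const name of names) {
      typeTests++; // one type test per signature tried
      if (typeof arg === name) {
        return signatures[name].call(dispatch, arg);
      }
    }
    throw new TypeError('No matching signature');
  }
  dispatch.signatures = signatures;
  return dispatch;
}

const sqrtRedispatch = toyTyped({
  string: function (value) {
    return this(parseInt(value, 10)); // full redispatch
  },
  number: function (value) {
    return Math.sqrt(value);
  }
});

const sqrtDirect = toyTyped({
  string: function (value) {
    return this.signatures.number(parseInt(value, 10)); // direct call
  },
  number: function (value) {
    return Math.sqrt(value);
  }
});

typeTests = 0;
sqrtRedispatch('9');
const testsWithRedispatch = typeTests; // string test, then string + number tests

typeTests = 0;
sqrtDirect('9');
const testsWithDirectCall = typeTests; // only the initial string test
```

In this toy setup the redispatching version performs three type tests and the direct version one; with many signatures or multi-argument matching the gap would presumably grow.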

So could there be a mechanism to make such dispatch-short-circuiting recursion more facile so it becomes the dominant way to recurse? One thought is to place each signature as a key on the resulting object, so that in the above you could write:

  string: function (value) {
    // on the following line we self reference the typed-function using "this"
    return this.number(parseInt(value, 10));
  }

(Note with a longer signature you'd have to write this['int, string'](7, 'foo') for example.)

Going further, is it worth making an effort to de-facilitate the more expensive full-redispatch recursion? Would there be any way to disable the direct this(other, args) and force the client to write this.redispatch(other, args) as a reminder to the author of client code that such a call will go through the whole type matching system again?

josdejong commented 2 years ago

Thanks, good point about the overhead of running through the argument type checking again. It's indeed much better to address the right signature directly if you already know the type of the arguments.

I do like your idea of making it easier to reference specific signatures and making that the "normal" way to do recursion.

Some thoughts based on your inputs:

About exposing the signatures directly on the function itself as this.* instead of this.signatures.*: it sounds convenient, but I'm afraid of the potential conflicts that can arise between built-in methods/properties that a function has (and may get in the future) and a typed-function signature. It is fairly theoretical that these would clash, but in general I try to avoid mixing (dynamic) sets of methods/properties from different abstraction levels together. What do you think about that?
The "generic" this(...) definitely has valid use cases (see for example here: https://github.com/josdejong/mathjs/blob/develop/src/function/arithmetic/add.js#L122-L131), so I think at least we should not remove it. It is technically possible to remove the support for recursive calls or move this into a separate method like this.redispatch(...). I do have to say that I like having this be exactly the same as what you get returned from const myFun = typed(...); it's a really simple model that is easy to understand. But yeah, it's also not performance optimal. It's a balance between easy-to-read code and optimized code. I have to give that a bit more thought, I'm a bit hesitant about making the current recursion feature this(...) harder to use 🤔
It is without doubt a good idea to better explain about the performance trade offs, and better highlight using this.signatures for better performing code. How about showing two examples in the Recursion section to clearly educate the user about the differences in readability vs performance?

gwhitney commented 2 years ago

Just to continue brainstorming, here are some other options:

To ease calling with a specific signature, we could provide this.route('string', 'number')('high', 5) or if you like the syntax better this.route('string,number', 'high', 5)
Another alternative is to use a Proxy to support this[['string', 'number']]('high', 5) -- a notation that it seems very unlikely would ever get a conflicting standard meaning.
To mark it as the less desirable option, we could provide this.redispatch(other, args) which actually just does the same thing as this(other, args) but we only "advertise" the former, not the latter, in the documentation.
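As a side note on the array-index notation: JavaScript already converts an array used as a property key to a string by joining its elements with commas, so the lookup itself works on a plain object. The sketch below shows only that language behaviour; the `signatures` object is a hypothetical stand-in for `this.signatures`, and a Proxy would presumably still be needed to intercept the lookup on the function itself or to normalize whitespace in keys.

```javascript
// Hypothetical signature map, standing in for `this.signatures`.
const signatures = {
  'string,number': (label, n) => `${label}:${n}`
};

// An array used as a property key is converted with Array.prototype.toString,
// which joins the elements with commas.
const key = String(['string', 'number']);

const viaArray = signatures[['string', 'number']]('high', 5);
const viaString = signatures['string,number']('high', 5);
```

Both lookups reach the same implementation, because the array key and the string key are the same property name.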

If you like any of those I will be happy to provide a PR or if not and there isn't anything else you would like to do on this I will just submit a documentation-only PR that recommends when possible directly calling the desired instance rather than doing a full redispatch, as you suggest.

josdejong commented 2 years ago

Thanks, yeah these are interesting ideas. I have to give it a bit more thought. Maybe I just need to get used to the idea of not having this(...).

gwhitney commented 2 years ago

Sure, happy to wait until you've thought over what direction you want to go with this (including leaving things just as they are and closing this issue, if that's what seems best). I will just say that as far as I can see, there's no solution that avoids this altogether because as far as I can tell, this is the only dynamically-scoped identifier in a JavaScript function. (Except of course if we added an extra argument, say 'call', to every implementation of every typed function just for the sake of reroute/redispatch, so that one could write

typed({
  'number,number': (call, x, y) => x*y + x + y,
  'number,string': (call, x, y) => call('number', 'number')(x, parseFloat(y)),
  'string,any': (call, x, y) => call.redispatch(parseFloat(x), y)
})

But that seems unnecessarily cumbersome and would require touching literally every typed function in mathjs, so unacceptable.)

josdejong commented 2 years ago

I've been experimenting a bit with alternative approaches to this. If we remove self referencing via this, we should open up using this like in regular JavaScript functions: by default, functions are unbound, and this refers to the context from which they are executed or to which they are bound afterwards. That would open up interesting options again in mathjs to utilize this, like having the function math.sum reference math.add dynamically as this.add (but that's currently a vague idea and very experimental).
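As a rough illustration of what unbound functions would allow, the sketch below uses plain JavaScript functions rather than typed-functions. The `math` and `textMath` objects are hypothetical; the point is only that `this.add` is looked up on whatever object the function is called on.

```javascript
// A plain, unbound function: `this` is decided by the call site.
function sum(list) {
  // The arrow function keeps the `this` of the surrounding `sum` call.
  return list.reduce((a, b) => this.add(a, b));
}

// Two hypothetical namespaces sharing the same `sum` implementation.
const math = { add: (a, b) => a + b, sum };
const textMath = { add: (a, b) => `${a}${b}`, sum };

const numericTotal = math.sum([1, 2, 3]);
const joinedText = textMath.sum(['a', 'b', 'c']);
```

The same `sum` picks up a different `add` depending on the object it is invoked through, which is the kind of dynamic reference described above.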

I managed to make typed functions unbound, and implement self-referencing in two different ways:

Option A: replace `this` with `typed.self`

Working proof of concept: https://github.com/josdejong/typed-function/pull/136

Usage example:

var sqrt = typed({
  'number': function (value) {
    return Math.sqrt(value);
  },
  'string': function (value) {
    // on the following line we self reference the typed-function using "typed.self"
    return typed.self(parseInt(value, 10));
  }
});

Pros/cons:

Pro: implementation and usage is very simple
Con: the implementation is "dirty": uses a global variable which is overwritten with every function invocation, and this variable could be accessed from outside which is odd.
Con: the implementation causes a little bit of overhead when invoking the function (could be negligible, we have to benchmark)
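The first con can be illustrated with a toy model of the idea. This is a sketch under assumptions: the `toy.self` slot and `toyTyped` helper are hypothetical and keyed on `typeof`, and the actual proof of concept in the PR may work differently.

```javascript
// Toy version of the "Option A" idea: a shared `self` slot that is set
// right before an implementation runs. Names are hypothetical.
const toy = { self: null };

function toyTyped(signatures) {
  function fn(arg) {
    toy.self = fn; // overwritten on every invocation
    return signatures[typeof arg](arg);
  }
  return fn;
}

const sqrt = toyTyped({
  number: value => Math.sqrt(value),
  string: value => toy.self(parseInt(value, 10))
});

const double = toyTyped({
  number: value => value * 2
});

const sqrtOfNine = sqrt('9'); // toy.self is `sqrt` while this runs
double(4);                    // any other call overwrites the slot
const leakedSelf = toy.self;  // now `double`, and visible from outside
```

The slot always holds whichever function was invoked last, and any outside code can read it, which is the "dirty" aspect noted in the list above.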

Option B: replace `this` with `typed.reference(self => {...})`

Working proof of concept: https://github.com/josdejong/typed-function/pull/137

Usage example:

var sqrt = typed({
  'number': function (value) {
    return Math.sqrt(value);
  },
  'string': typed.reference(self => {
    return function (value) {
      // on the following line we self reference the typed-function
      return self(parseInt(value, 10));
    }
  })
});

Pros/cons:

Pro: this is a neat implementation without dirty side effects
Pro: no overhead when executing functions, only some overhead when constructing a typed function
Con: usage is relatively complex
Con: implementation is relatively complex

Other thoughts

With both approaches, I think we can replace self with one or multiple util functions (or offer them all), where we can offer a function like @gwhitney proposes: route(signature, arguments) and dispatch(arguments), or maybe better resolve(signature)(arguments) and dispatch(arguments). So Option B could be changed to something like:

var sqrt = typed({
  'number': function (value) {
    return Math.sqrt(value);
  },
  'string': typed.reference((resolve, dispatch) => {
    return function (value) {
      // on the following line we self reference the typed-function
      return dispatch(parseInt(value, 10));

      // more efficient alternative (use instead of the line above):
      // return resolve('number')(parseInt(value, 10));
    }
  })
});

A caveat is that the signatures can only be resolved dynamically, and not during construction of the typed-function, as these "reference" callback function signatures aren't yet resolved. Chicken-egg problem.

gwhitney commented 2 years ago

Orthogonally to either option (A) or option (B), we could promote the following pattern:

const rootFns = {
  number: x => Math.sqrt(x),
  string: s => rootFns.number(parseFloat(s)),
  'number,number': (r,x) => Math.pow(x, 1/r),
  'number,string': (r,s) => rootFns['number,number'](r, parseFloat(s))
}
const root = typed('root', rootFns)

It seems to me like we could just always use this pattern except when full redispatch on the new arguments is wanted, for which typed would need to provide some (preferably heavy for the sake of discouraging its use) mechanism, whichever mechanism you like best -- the current one or one of these alternates or something else.

Am I missing something? This approach seems clean and without drawbacks to me...

gwhitney commented 2 years ago

So if you like this approach and if you are fine with the existing means of redispatch for now, then all that would be indicated at this time would be a documentation change plus an initiative that as mathjs typed functions are touched, they be refactored to the suggested pattern where possible.

gwhitney commented 2 years ago

It occurs to me that in the pattern I posted earlier today ("pattern one"), the rootFns['number,number'] dereference happens on every call, when conceptually it could/should happen just at creation time. Not sure if that makes any appreciable difference, but one could avoid it with something like

const rootFns = {
  number: x => Math.sqrt(x),
  'number,number': (r,x) => Math.pow(x, 1/r)
}
rootFns.string = (rootNumber => s => rootNumber(parseFloat(s)))(rootFns.number)
rootFns['number,string'] = (rootNN => (r,s) => rootNN(r, parseFloat(s)))(rootFns['number,number'])
const root = typed(rootFns)

but now the syntax is getting a bit baroque. I guess option B is roughly an automated version of this scheme, but I think your caveat 2 above says that because of that automation, the dereference of the signature would still happen on each execution -- so it seems to me we might as well go with the much more readable "pattern one" as it's therefore no worse than option (B).

Alternatively,

const rootNumber = x => Math.sqrt(x)
const rootNN = (r,x) => Math.pow(x, 1/r)
const root = typed({
  number: rootNumber,
  string: s => rootNumber(parseFloat(s)),
  'number,number': rootNN,
  'number,string': (r, s) => rootNN(r, parseFloat(s))
})

This has the benefit of minimizing execution-time de-reference, with greater readability, and without using option A or option B, just at the cost of being slightly more verbose. On the other hand, in general it might be fairly reasonable to define one's "base" implementations before the call to typed, and then only have the "derived" implementations defined in-line in the typed() call. So I think of all the proposals here for calls to specific signatures, I like this one the best, all things considered.

gwhitney commented 2 years ago

Then in terms of doing full redispatch, what about just:

const rootNumber = x => Math.sqrt(x)
const rootNN = (r,x) => Math.pow(x, 1/r)
const root = typed({
  number: rootNumber,
  string: s => rootNumber(parseFloat(s)),
  'number,number': rootNN,
  'number,string': (r, s) => rootNN(r, parseFloat(s)),
  Array: a => root(a[0], a[1])  // full re-dispatch to handle all types of entries
})

The only drawback I can think of to this approach is that if one merges this with more signatures for root, the Array implementation is still pointing to the original typed-function in the original assignment to root, and so the resulting combined function can't dispatch to any of the new signatures. I think that is a real problem for the way mathjs works. But I think that could be solved if, rather than accumulating new signatures into a new typed-function object, there's a method on typed-functions that mutates them to have the new signatures, e.g.

root.addSignatures({
  boolean: b => Number(b),
  'number,boolean': (r,b) => Number(b)
})

Then I think since root is the same object, just with more behavior, the prior Array implementation could dispatch to the new number,boolean implementation. One might worry about changing the function body of a Function object, but it seems simple enough to me to get around that by having the function always be created inside the create() function via:

function fn () {
  return fn.dispatch(arguments)
}
// now define dispatch and other properties
...

and then the mutating function to add new signatures can reassign the dispatch. I think this all works, and would provide an option (C) for "unbound" typed-functions that's lighter weight and seems to have few drawbacks. Let me know if you're interested in a proof-of-concept implementation.
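A minimal sketch of how such an option (C) could look, using a toy single-argument dispatcher keyed on `typeof`. The `createMutable` and `addSignatures` names are hypothetical and not part of typed-function.

```javascript
// Sketch of the "option (C)" idea: a stable function object whose
// `dispatch` property is reassigned when signatures are added.
function createMutable(initial) {
  let signatures = { ...initial };

  function fn(...args) {
    return fn.dispatch(args); // the body never changes
  }

  function rebuild() {
    const table = signatures; // snapshot used by this dispatch function
    fn.dispatch = args => {
      const impl = table[typeof args[0]];
      if (!impl) throw new TypeError('No matching signature');
      return impl(...args);
    };
  }

  fn.addSignatures = extra => {
    signatures = { ...signatures, ...extra };
    rebuild(); // reassign dispatch; `fn` keeps its identity
    return fn;
  };

  rebuild();
  return fn;
}

const describe = createMutable({
  number: n => `number ${n}`,
  // refers to the outer variable, so it sees later additions
  object: list => list.map(x => describe(x)).join(', ')
});

const before = describe;
describe.addSignatures({ boolean: b => `boolean ${b}` });

const result = describe([1, true]);
```

Because `describe` is the same object before and after `addSignatures`, the array branch can reach the `boolean` signature that was added later.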

josdejong commented 2 years ago

This is really interesting stuff to think through. I love it.

Some thoughts and feedback:

1.

Really good point about the "best" and simplest approach of first defining your low-level functions and then referencing them "cheaply" in plain JS, like

const rootNumber = x => Math.sqrt(x)
const rootNN = (r,x) => Math.pow(x, 1/r)
const root = typed({
  // ... reference rootNumber and rootNN here, not root
})

We could better promote/explain that pattern in the docs and examples probably.

2.

It occurs to me that in the pattern I posted earlier today ("pattern one"), the rootFns['number,number'] dereference happens on every call, when conceptually it could/should happen just at creation time

This is indeed the essence I think, to optimize this for runtime. So indeed ideally, you want as little (runtime) dispatching/routing/matching as possible.

3.

About "option C" and the full redispatch by referencing the outer defined function like:

const root = typed({
  // ... signatures
  Array: a => root(a[0], a[1])  // full re-dispatch to handle all types of entries
})

The drawback you mention is indeed a serious one. We had this structure originally in mathjs, before self referencing via this was implemented. I think we should not use this pattern of referencing a typed-function itself via a global/outside defined variable, as that gives nasty edge cases, like indeed not being able to merge the typed function with another one because it will still reference its old version. A mutable root.addSignatures could help in that regard, but still does not feel ideal. I really like the immutable approach it has now, and the flexibility of "just" merging any typed-functions together and doing whatever you want with the map of signatures; you have complete freedom. That is only possible when the individual signatures are "standalone".
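The stale-reference edge case can be reproduced with a toy immutable dispatcher. The `toyTyped` and `toyMerge` helpers below are hypothetical, keyed on `typeof`, and only meant to mimic the shape of the problem.

```javascript
// Toy immutable dispatcher; merging always creates a new function.
function toyTyped(signatures) {
  const fn = arg => {
    const impl = signatures[typeof arg];
    if (!impl) throw new TypeError('No signature for ' + typeof arg);
    return impl(arg);
  };
  fn.signatures = signatures;
  return fn;
}

const toyMerge = (a, b) => toyTyped({ ...a.signatures, ...b.signatures });

const size = toyTyped({
  string: s => s.length,
  // refers to the outer `size`, i.e. the function as originally created
  object: list => list.map(x => size(x))
});

const extended = toyMerge(size, toyTyped({ number: n => n }));

const direct = extended(42); // the new signature works at top level

let nestedError = null;
try {
  extended(['ab', 42]); // the array branch still calls the old `size`
} catch (err) {
  nestedError = err;
}
```

The merged function accepts a number directly, but the array branch fails on the same number because its closure still points at the pre-merge function.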

So, I'm really convinced we do need a local way to reference the function and its signatures itself. But we should try to minimize the need for it by offering good alternatives.

4.

Your inputs gave me some more ideas and insights 😁👍

So far I have the feeling that we should find a solution in the direction of Option B, and not go for a solution that relies on globals and tricks, even if that requires a slightly more verbose API. Other solutions will come back to bite us. I'm going to try to work out Option B a bit more. I'm also curious to hear if you think the direction of Option B could work out nicely, or whether it's too complex for its own good.

josdejong commented 2 years ago

I've worked out Option B a bit further. It looks very promising to me. The API can become:

var sqrt = typed({
  'number': function (value) {
    return Math.sqrt(value);
  },
  'string': typed.reference(function (resolve, self) {
    // resolve a specific signature at creation time (most optimal for runtime performance)
    const sqrtNumber = resolve('number')

    return function (value) {
      // use the signature
      // alternatively, you can also call self(parseInt(value, 10)) if you need a full dispatch
      return sqrtNumber(parseInt(value, 10));
    }
  })
});

// use the typed function
console.log(sqrt('9')); // output: 3

Cons:

The API is relatively verbose. I think it is worth it: the solution is really powerful and flexible.
The implementation is relatively complex

Pros:

There are no "dirty" tricks needed
This solution keeps typed-functions unbound, allowing you to use this and bind it yourself as in a regular JS function
The API typed.reference(function (resolve, self) {...}) promotes using resolve over self. You very clearly see the existence of resolve, and self is the second argument of the callback which indicates that it is less important.
But, it does allow you to self-reference self whenever (really) needed, without relying on global references of any sort. That is really powerful
This solution allows creation time resolving of signatures, which optimizes runtime execution
We can align the API of the resolve function with the static function typed.resolve that is being implemented in #135. An API unified like this is very powerful and easier to use. We can end up with the following two APIs:
- resolve(signatureStr | argsList) in the typed.reference callback function
- typed.resolve(fn, signatureStr | argsList) in the static function
We can let the resolve() callback and the static typed.resolve() function both return this signature object introduced in #135, which allows you to pick either signature.fn (original function) or signature.implementation (does do conversions for you). That gives a lot of freedom.

gwhitney commented 2 years ago

First my thoughts in response to the four big points in https://github.com/josdejong/typed-function/issues/126#issuecomment-1076214690

1) We are clearly both in agreement that static reference is best in those cases where one does not need/want to be able to pick up a new definition of a previously existing signature in a possible future merge (or mutation if it is allowed) of the typed function.

2) If I understand in option B, the dereference of a specific existing signature still happens on each execution. I think even that could actually be avoided in something along those lines, at the cost of yet a slightly more complicated syntax, a snippet of which would look like:

[ ... within signatures object for a typed call defining sqrt... ]
  string: typed.referencesSignature('number', sqrtNumber => s => {
    return sqrtNumber(parseFloat(s))
  }),
[...]

I guess the idea here would be that if an implementation referenced two or more other ones (which is rare), it could just list all the signatures it wanted before the callback and then the callback should take that many arguments:

  string: typed.referencesSignature('number', 'boolean', (sqrtNumber, sqrtBoolean) => s => {
     [...whatever you want to do to decide whether to call sqrtNumber or sqrtBoolean...]
  }),

Then every time the signatures are rearranged/added to and re-resolved, the implementations get the fully resolved referenced methods compiled right in and don't have to re-look-up in a hash on each call...

3) I of course have to defer to your experience vis-a-vis typed functions referring to themselves via lexical variables (note that they are definitely not global, their values are encapsulated at object creation time, so in fact I already think of "option C" as very local). But I think regardless of the mechanism used to refer to the typed function from its implementations (I happen to think either option B or C could work, with C being "lighter weight", but of course I have not been burned by whatever it was that happened with mathjs in the past), it would actually be the case that moving (back??) to a stance in which the typed-function objects created/used/exported by mathjs are constant in the javascript sense, i.e. that they retain their === object identity throughout the lifetime of a mathjs instance, and then are mutated under the control of the typed-function implementation whenever there is a need to add signatures, types, etc., solves more problems than it creates. In particular, it means that you could import a bunch of mathjs typed-functions and add signatures to them and then all the places that ever used them before or after would have access to the new signatures -- you wouldn't have to worry if someplace had kept a copy of one of the functions, or about the order of dependency injections, or anything, because all of the occurrences of say math.add would be the exact same object, and that object would get more new behavior as types/signatures were added.

In particular, this would make it very feasible to have a "bigNumbers.js" module (or group of modules) that would systematically import all relevant mathjs functions from the "core" (which would perhaps only handle regular number entities, say, and a few other types like strings) and add BigNumber signatures to them as appropriate. Then functions like 'sum' that just rely on 'add' to work should Just Work (tm). And then it seems like it would be easy to have a pared-down version then: just never load the bigNumbers module.

And just in general, the stance that any given method like math.add is constant as an object just makes it easier to not have to worry about keeping references to it, or did I extend it before I created math.sum, or whatever may be the case. It seems like one is in for many fewer headaches this way.

4) I totally agree with avoiding globals and tricks, and in particular option A. As I said, I think something like option B or option C could totally work, I think they differ mainly in cumbersomeness (but I understand you have experience that advises against option C).

gwhitney commented 2 years ago

P.S. on my proposal in part 2 about "referencesSignature" -- in this scheme, we could just use the signature '...' as the code to send back the full self, since that is what it amounts to, and note in the document that its use is much more expensive than giving a specific signature since it does full redispatch. Then there would not need to be a separate 'self' passed back.

gwhitney commented 2 years ago

Now on to your later comment https://github.com/josdejong/typed-function/issues/126#issuecomment-1076390092:

Ah I see we are thinking along the same lines. You now have the example calling resolve before generating the function, I just baked the resolve call into the typed.referenceSignature(...) call. I think this is a matter of what syntax you like better, and yes we should do one or the other or some similar mechanism that allows full resolution before compiling the method.
I think there is a difference between the "internal" resolve in your plan B outline and the "external" resolve, in that I think the internal one should only ever resolve to explicitly defined signatures and their specified functions, not additional signatures and their compiled implementations like the external one can. When you are writing an implementation, you want to know exactly what you are calling, not some generated one, and you want to prevent unexpected conversions from happening. So I am not a fan of your last point, letting the resolve callback pass back implementations. I think it should be as simple as possible to use and just give you the explicitly defined function to call, not some structure. We want to ease direct access to other implementations as opposed to full re-dispatch, not make it more complicated.
Finally, it occurred to me that there is a convenience intermediate between specific-signature indexing and full redispatch in terms of amount of time taken. Namely, there is a "natural JS type" of an entity given by:
```
function naturalType (entity) {
  const t = typeof entity
  if (t !== 'object') return t
  if (entity === null) return 'null'
  if (entity.constructor && entity.constructor.name) return entity.constructor.name
  return 'object'
}
```
(The built-in type system is currently perfectly aligned with this except that 'Function' would have to be renamed 'function'.) So we could have a "private method" on typed functions or a callback you can obtain in the option B-style reference scheme or whatever that allows you to do f.bynatural(a,b,c) and rather than looping through all signatures and testing them, it computes the natural types of a, b, c and creates a signature from them and then looks up just that one signature (which should be explicit, not generated) and calls it. This method should definitely not be exposed outside of implementations, and it would only work for entities where the desired types are the ones that are produced naturally, but would be super convenient to use when an implementation knows that things line up correctly, and would be much faster than full dispatch (while of course still slower than fully-precomputed creation-time resolution, albeit also a bit more flexible). Anyhow, this is just a thought for an additional convenience for implementations.
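A sketch of what such a natural-type lookup could look like. The `byNatural` helper and the `addFns` map are hypothetical; this only shows the single-lookup idea, not an existing typed-function feature.

```javascript
// `naturalType` follows the function given in the comment above.
function naturalType(entity) {
  const t = typeof entity;
  if (t !== 'object') return t;
  if (entity === null) return 'null';
  if (entity.constructor && entity.constructor.name) return entity.constructor.name;
  return 'object';
}

// Hypothetical helper: build a signature key from the arguments' natural
// types and look up exactly that one explicit signature.
function byNatural(signatures, ...args) {
  const key = args.map(naturalType).join(',');
  const impl = signatures[key];
  if (!impl) throw new TypeError('No explicit signature "' + key + '"');
  return impl(...args);
}

const addFns = {
  'number,number': (a, b) => a + b,
  'string,number': (s, b) => byNatural(addFns, parseFloat(s), b),
  'Array,number': (list, b) => list.map(x => byNatural(addFns, x, b))
};

const shifted = byNatural(addFns, [1, '2'], 10);
```

Each call costs one key computation and one property lookup instead of a scan over all signatures, at the price of only finding signatures whose names match the natural types exactly.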

gwhitney commented 2 years ago

Heck, since you're scanning the whole set of signatures anyway, I just wanted to point out the syntax doesn't have to involve a function call. It could be something like

const root = typed({
  number: n => Math.sqrt(n),
  string: { uses: 'number', does: rootNumber => s => {
      return rootNumber(parseFloat(s))
    }
  },
  Array: { uses: ['number', 'string'],
    does: (rootNumber, rootString) => A => {
      // ... implementation here that can call either of the other two...
    }
  }
})

Whatever you think will be easiest/clearest. An advantage if the infrastructure knows the signatures used is that if a signature is replaced, it knows just which implementations to re-resolve, rather than having to do them all.

gwhitney commented 2 years ago

And I thought I would just say that the version that uses arrays as the property values when the function wants to call another signature or signatures actually doesn't look too bad, either:

const root = typed({
  number : n => Math.sqrt(n),
  string: ['number', rootNumber => s => {
    return rootNumber(parseFloat(s))
  }],
  Array: ['number', 'string', (rootNumber, rootString) => A => {
      ... implementation here that can call either of the other two...
  }]
})

The nicest thing about this organization is that it puts the signature(s) that need to be referenced right next to the formal parameter name that will receive the implementation for that signature. So they correspond positionally and visually very nicely. And the infrastructure still gets to know exactly what signatures are referenced by what. And we could still use either '...' or reserve a keyword like 'redispatch' or 'self' so the function can signal that it wants to be able to do full redispatch (which I presume will be rare).
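To show that the array form is straightforward to process, here is a toy resolver for entries of the shape `[...signatureNames, factory]`. It is a sketch under assumptions: `toyTyped` is hypothetical, dispatches on `typeof` with one argument, and only lets array entries refer to plain entries or to array entries listed earlier.

```javascript
// Toy resolver for array-valued entries: [...signatureNames, factory].
function toyTyped(defs) {
  const resolved = {};
  const pending = [];

  for (const [name, def] of Object.entries(defs)) {
    if (Array.isArray(def)) pending.push([name, def]);
    else resolved[name] = def;
  }

  // Resolve array entries once, at creation time.
  for (const [name, def] of pending) {
    const factory = def[def.length - 1];
    const deps = def.slice(0, -1).map(dep => {
      if (!resolved[dep]) throw new Error('Unknown signature "' + dep + '"');
      return resolved[dep];
    });
    resolved[name] = factory(...deps);
  }

  return arg => resolved[typeof arg](arg);
}

const negate = toyTyped({
  number: n => -n,
  string: ['number', negateNumber => s => negateNumber(parseFloat(s))],
  boolean: ['number', 'string', (negateNumber, negateString) => b =>
    b ? negateNumber(1) : negateString('2')]
});

const fromString = negate('5');
const fromTrue = negate(true);
const fromFalse = negate(false);
```

The names listed before the factory line up positionally with the factory's parameters, and the resolver knows exactly which signatures each entry depends on.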

Of the various option B variants, this seems the least cumbersome to me. If you feel that a function-call syntax is definitely better, I'd vote for one where you pass signatures and a callback that takes function(s) corresponding to those signatures and returns a function that computes the correct value for the corresponding arguments, rather than the latest draft where you get back a single resolve function and have to call it. I think that will just lead to lots of boilerplate like

...
  'type1,type2': typed.reference(resolve => {
     const funcUV = resolve('typeU,typeV')
     return function (t1, t2) {
        ... do stuff using funcUV and t1 and t2
     }
   })
...

If you're always just going to call resolve on a signature and use the function you get to define the implementing function, why not abbreviate it to:

...
   'type1,type2': ['typeU,typeV', funcUV => function (t1,t2) {
        ... do stuff using funcUV and t1 and t2
    }],
...

or in function-call syntax

...
   'type1,type2': typed.referTo('typeU,typeV', funcUV => function (t1,t2) {
        ... do stuff using funcUV and t1 and t2
    }),
...

Anyhow, looking forward to the next thought this takes us to (and to converging on something good!).

josdejong commented 2 years ago

Am I correct to summarize all your responses as "Option B goes in the right direction, with creation-time resolving of signatures and a way to dispatch self, let's think through a neat function/object/array based API for it"?

(The numbering refers to the same numbering I used before, it's not in numerical order)

3. referencing outside defined variable "self"

So just to be sure we're on the same page regarding Option C, I understand it as referencing the variable root which is defined outside of the scope of the inside individual signatures:

const root = typed({
  // ... signatures
  'Array': () => {
    // use the outside defined `root` here
    return root(...)
  } 
})

It's indeed not "globally" defined, what I meant to say is it's defined outside of the scope of the signature itself. Sorry for the misleading way of expressing it.

In particular, this would make it very feasible to have a "bigNumbers.js" module (or group of modules) that would systematically import all relevant mathjs functions from the "core" (which would perhaps only handle regular number entities, say, and a few other types like strings) and add BigNumber signatures to them as appropriate. Then functions like 'sum' that just rely on 'add' to work should Just Work (tm). And then it seems like it would be easy to have a pared-down version then: just never load the bigNumbers module.

This is indeed what we definitely need! I think though that in both approaches this will "just" work, including referencing other functions like math.add, it's only a different way of implementing it:

In Option B you would replace: math.sum = sumNumber; math.sum = typed.merge(math.sum, sumBigNumber)
In Option C you would extend: math.sum = sumNumber; math.sum.addSignatures(sumBigNumber)

So from a functional point of view both approaches are OK and they both have their pros and cons. I guess it boils down to personal preference. I do have quite a strong preference in this regard to not rely on outside defined variables, and not append to existing typed-functions. Instead, I prefer to create new instances and take a pure, functional, immutable approach.

2. thinking through the API

If I understand in option B, the dereference of a specific existing signature still happens on each execution.

No, in Option B the dereferencing only takes place at creation time, and also works when merging two typed-functions (when merging, the callback and resolve will be invoked again, picking the new version of the signature).
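The behaviour described here can be modelled in a few lines. This is a toy sketch keyed on `typeof`; `reference`, `toyTyped`, and `toyMerge` are hypothetical stand-ins and not the code from the proof-of-concept PR.

```javascript
// Toy model of Option B with creation-time resolution.
const reference = callback => ({ isReference: true, callback });

function toyTyped(defs) {
  const resolved = {};
  const self = arg => resolved[typeof arg](arg);
  const resolve = name => resolved[name];

  // First pass: plain implementations are available immediately.
  for (const [name, def] of Object.entries(defs)) {
    if (!def.isReference) resolved[name] = def;
  }
  // Second pass: run each reference callback once, at creation time.
  for (const [name, def] of Object.entries(defs)) {
    if (def.isReference) resolved[name] = def.callback(resolve, self);
  }

  self.defs = defs; // keep the unresolved definitions for merging
  return self;
}

const toyMerge = (a, b) => toyTyped({ ...a.defs, ...b.defs });

let callbackRuns = 0;
const sqrt = toyTyped({
  number: value => Math.sqrt(value),
  string: reference((resolve, self) => {
    callbackRuns++;
    const sqrtNumber = resolve('number'); // looked up once
    return value => sqrtNumber(parseInt(value, 10));
  })
});

sqrt('9');
sqrt('16');
const runsAfterTwoCalls = callbackRuns; // the callback ran only at creation

// Merging builds a new function and re-runs the callback, so the
// `string` branch picks up the replacement `number` implementation.
const rounded = toyMerge(sqrt, toyTyped({
  number: value => Math.round(Math.sqrt(value))
}));
const roundedResult = rounded('10');
const originalResult = sqrt('10');
```

In this model the callback runs once per construction (including each merge) and never per call. A reference that resolves another reference would depend on processing order here, which is the chicken-and-egg caveat mentioned earlier in the thread.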

I do like your idea of removing the intermediate step of having to call resolve yourself with an API like typed.referencesSignature('number', sqrtNumber => {...}). This is less cumbersome. On the other hand, it may become harder to read when you have multiple references each with multiple arguments. You would need a typed.referenceSelf(self => {...}) function too, but that is just fine I think.

we could just use the signature '...' as the code to send back the full self

Yes that would be possible. I do have a preference though for keeping the API explicit, like having a separate function typed.referenceSelf for it.

I just wanted to point out the syntax doesn't have to involve a function call.

Yes you're right, when implementing it in this Experimental PR for Option B I noticed too that it is just a wrapper around creating an object :). At first I thought it would be most neat to expose this as a function, but I do like your proposals of an API based on an object or array too. It looks straightforward and removes the intermediate resolve step.

Your array proposal reminds me of the (now deprecated) AMD module system, it's exactly the same :)

Array: ['number', 'string', (rootNumber, rootString) => A => { ... }]

The experience I had with that is that it works nicely as long as you have only a few dependencies, but it becomes cumbersome to maintain when you have a long list: you have to keep the two lists in sync. For signatures with multiple arguments it may become hard to read (you use commas both inside the signatures as well as to separate signatures). I think that in the case of typed-function you will typically reference only one or maybe two signatures so that problem may not be big. I'm a bit hesitant about it.

To get concrete, I think the three nicest options that I see so far are:

2.1 typed.reference(function (resolve, self) { const fn1 = resolve(signature1) ; ... })
- pros: very generic and powerful, scales well, explicit and clear
- cons: is complex, is verbose especially for simple use cases, requires an intermediate call to resolve(...)
2.2 typed.referenceSignature(signature1, fn1 => { ... }) and typed.referenceSelf(self => { ... }), with the limitation that we (on purpose) only allow resolving 1 signature with this API.
- pros: very simple and straightforward
- cons: limited, becomes verbose when needing multiple signatures (nesting)
2.3 [signature1, signature2, (fn1, fn2) => { ... }] (or the function variant)
- pros: very clean and compact API
- cons: this API may be hard to understand too, and can become hard to read/maintain when having multiple signatures and more complex signatures

Thinking aloud here. The simplest API is not necessarily the one that needs the least number of characters. It has to do with the least cognitive overhead and being easy to read and remember. I personally have a preference for explicit APIs. For example the array API is very concise, but doesn't communicate anything to the reader: the API consists of a couple of square brackets and commas only. When a stranger looks at the code, they don't get a clue about what is going on. Having a stranger look at an explicit function call typed.referenceSignature(...) will help a lot in understanding what this code is doing.

I have to give this some more thought. At this moment I have a slight preference for option 2.2: I think in practice you do not need "many" signatures, and this API is just so simple, explicit and straightforward.

5. additional signatures or only original signatures

I think the internal one should only ever resolve to explicitly defined signatures and their specified functions, not additional signatures and their compiled implementations

That is a good point, agree. Let's only expose original signatures in the "internal" API. :thumbsup:

6. natural JS type

So we could have a "private method" on typed functions or a callback you can obtain in the option B-style reference scheme or whatever that allows you to do f.bynatural(a,b,c) and rather than looping through all signatures and testing them

I have the feeling that you're onto something interesting but I don't fully understand you. You have an idea around being able to utilize knowledge about the types to create more optimized functions beforehand. Can you elaborate or maybe give an example? (or discuss this in a separate topic to keep this focused 😅)

gwhitney commented 2 years ago

Yes, I am fine with Option B and agree that we should take a moment here to choose the best syntax for a typed function to use it, since it is wide open.

Going through your points in the non-numerical order you present them,

3) I think we are on the same page here except that I don't think there is any linkage in the other direction from avoiding "outside" variables to whether typed-function instances can/should be mutable or not, and I think there are very strong reasons in favor of mutability -- but I have written much more about that in #138. Since it seems typed-function is going Option B, the im/mutability question is therefore orthogonal and can be decided separately, hence the separate issue, which I hope you will give serious consideration.

2) On the api: I am fine with a functional one, and it might make swapping mechanisms in the future easier if a better implementation idea comes up, so it may be a bit more "future-proof". So if typed-function goes with the option you label '2.2', I'd suggest: (a) the name be slightly shorter, like typed.refersTo or typed.uses or something, and (b) that the function allow multiple signature arguments and a final callback argument that takes that many function parameters, e.g.

  Array: typed.refersTo('string', 'number,number', (rootString, rootNN) => function (array) {
      // ... implementation goes here ...
  })

I think it would be very rare to refer to more than a couple other implementations (one will be the most common), so I do not think this notation will be cumbersome, but on the occasions where two signatures are necessary it will be very nice not to have to nest calls to typed.refersTo or something like that. I am agnostic as to whether there is a dedicated "signature" for getting back "self" (although I think string: typed.refersTo('self', self => function (s) { ... }) reads quite well) or whether it's a separate method on typed that gets self, since I think in practice it will be used very rarely.

6) On natural JS type, I think that's part of this thread because it is an idea directly about how to call other signatures of oneself. A bit more explanation: full redispatch is expensive mostly because of running all of the test functions, and secondarily because of the indirection. The "typed.refersTo" mechanism we are discussing here eliminates both of those expenses in a recursive call by selecting (a) specific signature(s) at creation time. But actually the larger part of the cost can be eliminated just by knowing the "right" signature to use from the arguments directly without having to run all the tests; then it's just the indirection cost. I think the "natural" type of an entity is clear: strings are 'string', functions are 'function', booleans are 'boolean', null is 'null', and class objects are their most extended class, so a matrix would be 'DenseMatrix' or 'SparseMatrix' depending. This last covers Array, Date, RegExp, etc., because as of ES6, all are classes. Truly raw objects end up as 'Object', I think, and then weird things that have no constructor at all end up as 'object'. So the idea is that if you know the type system in your typed instance is consistent with natural types, you can dispatch by computing the natural types of all of the arguments, checking if the signature consisting of that list of types exists, and if so call it directly without doing any testing. It would not work if for example you had made an "identifier" type which was a subset of strings, so then you couldn't use it. So as an implementer you'd have to know and use it judiciously, hence this form of dispatch couldn't be used automatically. The biggest drawback is that if someone subclassed Array, say, it would have a different "natural type" and when they passed it in, it wouldn't find the Array implementation and would have to fall back to regular, slow, check-all-the-types dispatch.
So maybe it's not that useful an idea, I just thought I should bring it up because it would be nice to short-circuit all of that type testing as often as possible, and in practice I see that all lots of implementations are doing is calling the number implementation on numbers, the string implementation on strings, etc. As long as that's all you're doing, "natural" type dispatch would work very well.
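A rough sketch of what this "natural type" shortcut could look like, in plain JavaScript. The names `naturalType` and `makeDispatcher` are hypothetical and not part of typed-function; the slow path here is just a stand-in for the regular test-every-type dispatch.

```javascript
// Hypothetical helper: the "natural" type name of a value as described above.
function naturalType(value) {
  if (value === null) return 'null';
  const t = typeof value;
  if (t !== 'object') return t; // 'string', 'number', 'boolean', 'function', ...
  const ctor = Object.getPrototypeOf(value)?.constructor;
  return ctor ? ctor.name : 'object'; // e.g. Object.create(null) has no constructor
}

// Look the signature up directly by the natural types of the arguments;
// fall back to the slow dispatch when no such signature exists.
function makeDispatcher(signatures, slowDispatch) {
  return function (...args) {
    const key = args.map(naturalType).join(',');
    const impl = signatures.get(key);
    return impl ? impl(...args) : slowDispatch(...args);
  };
}

const size = makeDispatcher(
  new Map([
    ['string', (s) => s.length],
    ['Array', (a) => a.length],
  ]),
  // Stand-in for the regular dispatch that runs type tests one by one.
  (value) => {
    if (Array.isArray(value)) return value.length;
    throw new TypeError('no matching signature');
  }
);

class MyArray extends Array {}

console.log(size('abc'));             // 3, found directly under 'string'
console.log(size([1, 2]));            // 2, found directly under 'Array'
console.log(size(MyArray.from([1]))); // 1, 'MyArray' is not a key, so slow path
```

The last call shows the drawback mentioned above: a subclass of Array has a different natural type, so the direct lookup misses and the call pays for the fallback.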

Overall, I think we are in fact converging, though, which is good :)

josdejong commented 2 years ago

3.

Sounds good. Let's discuss on (im)mutability separately :thumbsup:

4.

I think you're right. In practice you typically need only one, maybe two signatures, so it will not grow unwieldy. Ok then let's go with your proposal, and with a function as API. I like your shorter naming too, though maybe it should be singular? I still would like to have an explicit function to refer to self. So the API can become:

typed.referTo(signature1, signature2, ..., function (fn1, fn2, ...) {
  return function (args) { 
    // ...
  } 
})

typed.referToSelf(function (self) {
  return function (args) { 
    // ...
  } 
})

Is that good to go like this? Or do you have more remarks/refinements/ideas? When we both agree, I'll work out the experimental PR with "Option B" accordingly (beginning of next week I expect).
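For readers following along, here is a minimal sketch of the idea behind this API, applied to the sqrt example from the top of the thread. It is a hypothetical miniature and not the typed-function implementation: the local `typed`, `referTo`, and `referToSelf` below only handle single-argument signatures keyed by `typeof`, and a reference can only point at a plain (non-referencing) signature.

```javascript
// Hypothetical miniature of the agreed API; NOT the typed-function source.
const REFER_TO = Symbol('referTo');
const REFER_TO_SELF = Symbol('referToSelf');

function referTo(...args) {
  const callback = args.pop(); // the last argument is the callback
  return { [REFER_TO]: { signatures: args, callback } };
}

function referToSelf(callback) {
  return { [REFER_TO_SELF]: { callback } };
}

function typed(definitions) {
  const resolved = {};
  // Full dispatch: looks up the implementation on every call.
  const self = (value) => {
    const impl = resolved[typeof value];
    if (!impl) throw new TypeError(`no signature for type ${typeof value}`);
    return impl(value);
  };
  // First pass: register the plain implementations.
  for (const [sig, def] of Object.entries(definitions)) {
    if (typeof def === 'function') resolved[sig] = def;
  }
  // Second pass: resolve references once, at creation time.
  for (const [sig, def] of Object.entries(definitions)) {
    if (def[REFER_TO]) {
      const { signatures, callback } = def[REFER_TO];
      resolved[sig] = callback(...signatures.map((s) => resolved[s]));
    } else if (def[REFER_TO_SELF]) {
      resolved[sig] = def[REFER_TO_SELF].callback(self);
    }
  }
  return self;
}

const sqrt = typed({
  number: (value) => Math.sqrt(value),
  // Calls the number implementation directly, without a second dispatch.
  string: referTo('number', (sqrtNumber) => (value) => sqrtNumber(parseInt(value, 10))),
  // Goes through the full dispatch again.
  boolean: referToSelf((self) => (value) => self(value ? 1 : 0)),
});

console.log(sqrt('9'));  // 3
console.log(sqrt(true)); // 1
```

The point of the sketch is the second pass: the `string` implementation captures `sqrtNumber` when the function is created, so a call with a string costs one dispatch rather than two.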

6.

So the idea is that if you know the type system in your typed instance is consistent with natural types, you can dispatch by computing the natural types of all of the arguments, checking if the signature consisting of that list of types exists, and if so call it directly without doing any testing.

Ahh, I think I get what you mean now: you want to utilize the knowledge that there is about the defined data types, signatures, conversions, to create cheaper and faster dispatching. It is a really interesting idea to see if we can make this smarter and faster. I think there are indeed possibilities to improve on this, it would be an interesting experiment. I expect though that this would be an optimization "under the hood" that will not impact the public API.

gwhitney commented 2 years ago

4) Is that good to go like this?

I don't have any further "amendments" to offer on option B, syntax variant 2.2, with the name referTo and the signature you give above. And I do like that referring to self now looks slightly more cumbersome than referring to another specific signature. (If you want to amp that up with a name like typed.referInefficientlyToSelfBecauseIreallyWantTo, feel free ;-)

6) I expect though that this would be an optimization "under the hood" that will not impact the public API.

Ah, I was thinking of it as a tool that implementors could use to speed up redispatch, because I was/am dubious about the ability for typed-function's code to deduce when such dispatch shortcutting is valid. So something like a typed.naturalDispatch(...) analogue of typed.referTo(...). But this has now, like (im)mutability, become an orthogonal issue to "Option B" that could be added later; and it definitely does not feel as urgent to me as #138. So I will just open an issue with one particularly promising variant your latest comment made me think of, and I don't think this needs more discussion here.

josdejong commented 2 years ago

(4) OK then let's go for it 💪, thanks for the good discussion. I can also name the self referencing function typed['referToSelfButDontPromoteIt' + Math.round(Math.random() * 1e6)] , how about that? 😉 😂

(6) Sounds good

josdejong commented 2 years ago

I've updated PR #137 according to the final API, @gwhitney can you review this PR?

There is one open issue regarding backward compatibility warnings.

gwhitney commented 2 years ago

Ok I beat on the PR pretty hard and have put in all the review comments I came up with. Of course you should take them all as suggestions, except there's one I described as pretty critical, for reasons outlined there.

josdejong commented 2 years ago

Thanks 👍 I'll have a look tomorrow

josdejong commented 2 years ago

#137 is merged now, I think we're ready to publish `v3` (and a last version of `v2` too, before that). I'll publish `v3` tomorrow.

gwhitney commented 2 years ago

Ummm, any chance of including something for #138 in v3? It's possible it might cause a minor breaking change in the interface. I have been hoping that "picomath" would be convincing that it's worth having this feature in typed-function, even if not right away. I had been waiting for v3 to settle a bit before working up a PR but if you are interested I will get on it right away. Just let me know.

josdejong commented 2 years ago

OK I'll read up on your last comments there tomorrow before doing any publishing of v3, it's dinner time here :)

josdejong commented 2 years ago

I've commented on #138. I think though that this #138 will take serious time, so my feeling is that we should not let v3 wait until that is finished up. What do you think?
