
josdejong / pocomath

A little proof-of-concept for organizing mathjs by module inclusion, avoiding factory functions.

Apache License 2.0


Some comments #1

Open gwhitney opened 1 year ago

gwhitney commented 1 year ago

So far I just looked at the things you've done in the complex subdirectory, because I pretty much believe that without any generic types like Complex<T> or Vector<T> there would be no major hitch in doing things along these lines. I can't yet say that it will or won't work even so, but here are the concerns I see with the ones you have done so far:

In isReal.ts you have defined the generic function type IsReal, using up that generic. But undoubtedly we would want to have an isReal implementation for number that would always return true. How would that be typed, since you can't "extend" a generic? It's also slightly misleading/inconsistent in that elsewhere in the code, if I want to get the function type of add on arguments of type Foo I do Add<Foo> but if I do IsReal<Foo> I don't get the function type of isReal on an argument of type Foo, I get the function type of isReal on an argument of type Complex<Foo>. How do I remember that? I am worried about not having a uniform way to obtain the function type of the operation with a given name on a given tuple of parameter types...
I really don't like that to specify the dependencies on add and equal, you have to mention the name of each one three times: once in the argument, again in the type of the argument, and again in the template that gives the specific type of the dependency. I worked really hard to avoid such redundancy in typocomath. The first scheme had none, but when I couldn't get it to work, I accepted for now a doubling of the mention of the name of an operator, but only where it is defined: once in the name of the function and once in the template that checks its type. There is still only one mention of the operator name in each use as a dependency, so that's not as bad as it could be because there are many more uses than definitions. And I'd still like to get rid of the doubling of the name at definition time, but haven't quite figured out how to do that yet (there are more pressing problems like just getting the whole thing to compile).
Then in arg.ts yet a third way to obtain the function type of an implementation is introduced: ArgComplexNumber (as opposed to Arg<Complex<number>> or Arg<number>, one of which I would have guessed from the first two examples of Add and isReal). And if later other implementations of arg are added for other parameter types, it seems like we've started down the road of combinatorial explosion of type names; we might end up with an ArgComplexBigint and an ArgFoo and an ArgQuaternionNumber etc. etc.
Also I thought we wanted to do away with the extra layer of function call if there are no dependencies? In other words, the definition of arg there should just be infer: (z: Complex<number>) => Math.atan2(z.im, z.re), shouldn't it? Certainly in typocomath I have been leaving the dependencies off altogether when there are none, and I do think we can get Dispatcher (typed-function) to handle that.
In complex.ts, not sure what I would do with the ComplexFn<T> generic -- there's not much TypeScript can do with unions of function types, sadly. And this introduces a fourth mechanism for finding the function type of an implementation: never mind the Fn suffix for now, since surely we can work out a naming convention that avoids that, but I now have to know/remember that to get the unary function type I use a 1 suffix and the binary I use a 2... I have a real worry that along these lines it will become a constant pain to look up/remember how to get the function type for each different dependency of a new operation I am writing. I am strongly in favor of a single minimally redundant notation for dependencies, based on my experience with writing current mathjs operations.
Incidentally (this is a minor point), as far as I can tell there is no way to actually make zero() a nullary function. Remember at runtime it is going to have to actually return the correct zero value, at least among number 0 and bigint 0n, which are different entities, and all type information is gone at runtime, so it has to have something it can use to tell what zero to return, so it needs to take an argument of the type of thing that the zero is wanted for...
In quaternion.ts, since you used the ComplexFn<T> function type that is completely ambiguous between the unary and binary complex(), there could be no way for the Dispatcher (typed-function) to tell whether to supply the dependency as the unary or binary version of complex. So this would really need to be ComplexFn2<T> for there to be any hope for the typed-function to get built properly.

OK, hopefully that makes some of the issues I have been grappling with clearer. I will add another batch of things to typocomath and then you can have a go at getting all of that working.

josdejong commented 1 year ago

Thanks for the feedback Glen! I've added some commits addressing your points.

Ok I've extended isReal.ts to implement two signatures: isReal(number) and isReal(Complex<T>). Of course we can organize it in different files if we want.
Yes agree. The name repetitions remind me of the years I worked with Java 😂 😢 . I've extended generic/absquare.ts with a couple of different approaches:
- One cause is using destructuring or not: { square, abs } in the header, or dep.square and dep.abs in the usage. I personally prefer destructuring, since that way the actual logic is cleanest. But I think we do not per se need to enforce this, both can be fine.
- The other cause is writing dependencies as an object or as an intersection type:
```
// object
deps: {
  square: Square<T>,
  abs: Abs<T>
}
```
```
// intersection type:
export type AbsDep<T> = {
   abs: Abs<T>
}
export type SquareDep<T> = {
   square: Square<T>
}
// ...
deps: AbsDep<T> & SquareDep<T>
```
  This is personal preference too I think. At this moment I have a slight preference for the first: it is a bit more verbose, but it is so easy to understand, and interfaces like Square<T> are more powerful than the more specific SquareDep<T> which is usable only in one specific case.
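To make the two choices concrete, here is a minimal sketch of the same absquare written both ways. The shapes of Square<T> and Abs<T> are assumptions (simple T => T functions), as are the names absquareA, absquareB and squareNumber; this is not the code in generic/absquare.ts.

```typescript
// Assumed shapes for the shared interfaces (hypothetical, for illustration only).
type Square<T> = (x: T) => T
type Abs<T> = (x: T) => T

// Intersection-style dependency types, as in the second snippet above.
type AbsDep<T> = { abs: Abs<T> }
type SquareDep<T> = { square: Square<T> }

// Variant A: object-typed dependencies, destructured in the header.
const absquareA = <T>({ square, abs }: { square: Square<T>; abs: Abs<T> }) =>
  (x: T): T => square(abs(x))

// Variant B: intersection-typed dependencies, accessed as deps.square / deps.abs.
const absquareB = <T>(deps: AbsDep<T> & SquareDep<T>) =>
  (x: T): T => deps.square(deps.abs(x))

// Both variants accept exactly the same dependency object.
const squareNumber: Square<number> = x => x * x
const numberDeps = { square: squareNumber, abs: Math.abs }

const resultA = absquareA(numberDeps)(-3)
const resultB = absquareB(numberDeps)(-3)
console.log(resultA, resultB)
```

Structurally the two parameter types are the same, so the choice only affects how the names are written, not what the compiler checks.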
Huh? It is not a new way, to me it is just the same as all other functions. I just literally translated the implementation of arg.mjs, which is an implementation for Complex<number> and not a generic implementation. The interface just matches the function. If we want to introduce a generic interface, we can, just like the other functions:
```
// we should move Arg and ArgComplex into /interfaces/ 
export type Arg<T, U> = (z: T) => U
export type ArgComplex<U> = Arg<Complex<U>, U>

// now, we can write out ArgComplexNumber and ArgComplexBigint. 
// this is probably redundant though
export type ArgComplexNumber = ArgComplex<number>
export type ArgComplexBigInt = ArgComplex<bigint>
```
Now, arg may be a bad example, but I was thinking about a function that has a very specific interface that is only implemented by a single function, like say polynomialRoot(constant, linearCoeff, quadraticCoeff, cubicCoeff). In that case it makes sense to me to put the interface alongside the implementation.

So the general pattern that I have in mind here:
- we can just use bare bone TypeScript types without any magic. We can reuse them everywhere, we can put them wherever we want. We have all freedom.
- When there is a (generic) interface that has multiple implementations, we can put the interface in a shared place and reuse it everywhere.
- when we have a very specific interface that has only one implementation, we can put it alongside the implementation itself.
Yes agree, it will be nice to omit dependencies if empty. I have to say, they do not add too much clutter, but it would be nice I think. In the pocomath approach, I can remove the {infer} wrapper object, then you get:
```
// with {infer} object and empty dependencies 
export const add = {
    infer: (): AddNumber => (a, b) => a + b
}

// without 
export const add: AddNumber = (a, b) => a + b
```
This pattern is very similar to typocomath:
```
export const add: ImpType<'add', [number, number]> = (a, b) => a + b
```
However, in both cases, we still have to somehow make clear to mathjs that this is a function that should be inferred from the TypeScript types. How do we do that? We could wrap the function in a call infer(...):
```
export const add: AddNumber = infer((a, b) => a + b)
```
Ow, there was indeed an issue with the type of ComplexFn<T>, you could not use it with one parameter. I changed it to type ComplexFn<T> = ComplexFn1<T> & ComplexFn2<T>, now you can use it both as complex(re) and complex(re, im), for example in quaternion.ts.
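As a sketch of what the intersection type allows, assuming ComplexFn1<T> and ComplexFn2<T> are the one-argument and two-argument constructor types (their exact definitions in complex.ts are not shown here, so these are guesses):

```typescript
type Complex<T> = { re: T; im: T }

// Assumed shapes of the unary and binary constructor types.
type ComplexFn1<T> = (re: T) => Complex<T>
type ComplexFn2<T> = (re: T, im: T) => Complex<T>

// The intersection behaves like an overloaded function: both call forms type-check.
type ComplexFn<T> = ComplexFn1<T> & ComplexFn2<T>

// One implementation with an optional second parameter satisfies both members.
const complexNumber: ComplexFn<number> = (re: number, im?: number) =>
  ({ re, im: im === undefined ? 0 : im })

const fromOne = complexNumber(3)     // uses the ComplexFn1 signature
const fromTwo = complexNumber(1, 2)  // uses the ComplexFn2 signature
console.log(fromOne, fromTwo)
```

Note that a dependency typed as ComplexFn<T> accepts both arities, so the type alone does not say which one a dependent function actually calls.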

Your other point is an important one: how to figure out what dependencies are available? In typocomath there is a clear unified way to write things: you use ImpType<'add', [T, T]> and Dependency<'add', [T, T]>. In my pocomath+ts experiment, you define an Add<T> = (x: T, y: T) => T and use Add<T> at both implementation and dependency, but you could use different function namings like ComplexFn<T>, so a clear naming convention will be useful. In both cases you have to know the name of the function. In both cases you need to know whether the function actually exists. In typocomath you can enter any nonexistent dependency like Dependency<'foobar', [T, T]>. In pocomath+ts you can only use defined interfaces. Still, in both cases it could be that there is no runtime implementation available.

I do not understand what you mean with "this introduces a fourth mechanism for finding the function type of an implementation", can you explain that? To me there is only one, which works like:
- I want to use a new dependency add in my function
- In my IDE, I start typing "add", I press Ctrl+Space to open autocompletion
- The autocompletion of the IDE presents me with two interfaces: Add<T> and AddNumber
- I decide I want to use the generic one, and press Enter to insert the interface
- Since the interface exists, I know for sure that the function exists and the name and parameters are correct. It could still be that a runtime implementation is missing though.
- That's it...
How will actual usage work in typocomath? Right now I do not see a way to verify whether some function can actually exist. There is no autocompletion or anything when I'm typing Dependency<'add' ..., I have no idea about whether the function name exists at all or what arguments it would expect. I have the feeling that pocomath+ts gives more compile time guarantees and help than typocomath at this point.
Ahh, yeah good point. I'll change zero to how it was in pocomath, with one argument
I do not fully understand how quaternion should work ideally, and I just translated your implementation of quaternion.mjs straight into quaternion.ts. Why would the former work and the latter not? (Also: maybe we do not need to spend too much time on quaternion and can see it as an edge case).

gwhitney commented 1 year ago

Excellent, your responses really clarify some of the differences between our approaches.

A) I type code in a plain text editor and rely on the compiler. So for me, it's very important to have one syntax for grabbing an implementation: give the name of the operation and the types you want to call it on. So the four ways in your approach that I mention in my points are all different syntax variants for how you would actually do that grabbing, that I would have to remember or look up every time I wanted to get an implementation. For you, all that variation doesn't seem as problematic because you feel the IDE will do the lookup for you and present you with alternatives and it will be easy to choose. But I think that for folks trying to read the code, the variation in syntaxes is still confusing, and even using the IDE the fact that for some ops foo, Foo<T> is the one that takes a single Complex<T> argument and for other ops it is the one that takes two arguments of type T is really confusing. Compare that with looking at the latest version of the complex implementation of sqrt in typocomath, to pick one with lots of dependencies: it is crystal clear which other operations it uses and what types of arguments they are called on. So to me having many different naming conventions for the type of an implementation feels ad hoc and unscalable, while typocomath is systematic and will stay that way up to mathjs-size.

Since this is long I will mention other differences in another comment.

gwhitney commented 1 year ago

B) I feel the syntax TypeScript chose for typing destructuring parameter lists is simply broken as it always forces redundant writing of property names, and hence it should just not be used. And I feel we should strive very hard to avoid boilerplate repetition needed in the code; that's why I am pretty upset that right now typocomath is mentioning the operation name twice in each implementation definition, I do hope to fix that someday, but at first it was more important just to get it all compiling. And I don't see any difference in "power" between writing the Dep types or the implementation types: the dependency type, the implementation type, and the return type of an implementation are all intertranslatable with simple type operators. That's why typocomath provides ImpReturns<Name, Params>, ImpType<Name, Params> and Dependency<Name, Params> so you can obtain any of the three that you want, in a completely uniform way. But it feels to me like these type operators are uncomfortable to you, even though your pocomath+ts poc is using lots of generics, which is all type operators are...
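To illustrate how three such operators can be derived from one another, here is a minimal sketch. It is not typocomath's actual definition: OpRegistry, ReturnOf and the way signatures are stored as [parameter tuple, return type] pairs are all assumptions made for the example.

```typescript
type Complex<T> = { re: T; im: T }

// Hypothetical registry: each operation name maps to a union of
// [parameter tuple, return type] pairs, one per implemented signature.
interface OpRegistry {
  add: [[number, number], number] | [[Complex<number>, Complex<number>], Complex<number>]
  absquare: [[number], number] | [[Complex<number>], number]
}

// Distributes over the union and keeps the return type of the matching entry.
type ReturnOf<Entry, Params> = Entry extends [Params, infer R] ? R : never

// The three views of one signature, all keyed by (name, parameter tuple).
type ImpReturns<Name extends keyof OpRegistry, Params extends unknown[]> =
  ReturnOf<OpRegistry[Name], Params>
type ImpType<Name extends keyof OpRegistry, Params extends unknown[]> =
  (...args: Params) => ImpReturns<Name, Params>
type Dependency<Name extends keyof OpRegistry, Params extends unknown[]> =
  { [N in Name]: ImpType<N, Params> }

// Implementations: parameter and return types come from the registry.
const addNumber: ImpType<'add', [number, number]> = (a, b) => a + b
const absquareNumber: ImpType<'absquare', [number]> = x => x * x

// A dependent implementation names each dependency once, with its argument types.
const absquareComplex = (
  dep: Dependency<'add', [number, number]> & Dependency<'absquare', [number]>
): ImpType<'absquare', [Complex<number>]> =>
  z => dep.add(dep.absquare(z.re), dep.absquare(z.im))

const absquareOf3plus4i = absquareComplex({ add: addNumber, absquare: absquareNumber })({ re: 3, im: 4 })
console.log(absquareOf3plus4i)
```

Because Name is bounded by keyof OpRegistry in this sketch, a misspelled operation name is rejected by the compiler.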

gwhitney commented 1 year ago

C) You are currently thinking we need to tag implementations where Dispatcher/typed-function needs to figure out the call signature from the typescript typing. I am assuming that it will be the only way. So no need for any tagging with something like 'infer'. But on the other hand, typing a dependency with something like ComplexFn that could work for either unary or binary (even with your revision) will make it impossible for the Dispatcher to tell whether to supply the unary or binary version to fulfill the dependency. A uniform syntax like Dependency<'complex', [T,T]> vs Dependency<'complex', [T]> will make it straightforward for the Dispatcher. Another thing to know here is that typescript-rtti supplies the local, "syntactic" types of entities, not what you might think of as the "deep" or "fully resolved" or "semantic" types of entities. In other words, with a variety of syntaxes for the dependencies, we will be forcing ourselves to write more complicated code in the Dispatcher to parse all of the possibilities, like it will have to decode the type name ArgComplexNumber and ComplexFn, etc.... and how it will tell that say Add<T> takes two arguments of type T, while Conj<T> takes one argument of type Complex<T>, I really don't know. Maybe there is a way in typescript-rtti to look up the definition of a generic (which may be totally elsewhere in the code) but I am really unsure about that...

gwhitney commented 1 year ago

D) It is too bad (some?) IDEs don't show you valid operation names in the first parameter position to the Dependency<> operator. I guess it's because syntactically that parameter is only bounded by string. (But the compiler definitely knows, it won't accept a typo in the name...) I guess I could try writing a narrower type bound on Name in the definition of that operator, to restrict it to just the valid keys, and see if that helps the IDE. But personally I think that should be icing that comes after, and IDE behaviors should not influence our choice of scheme, since people use all sorts of IDEs or none at all, and more importantly they do lots of reading of code without an IDE. So the more the code can stand on its own without IDE features, the better.

gwhitney commented 1 year ago

E) I am more worried about uses of implementations than definitions of them, since there's a lot more of the uses. That's why I am willing to accept the admittedly unfortunately complicated typing of implementations in the current typocomath scheme, although I have eased it somewhat in the latest version, because it supports totally uniform usage. I definitely would have preferred for TypeScript to deduce all the type info it needs from the definitions themselves; unfortunately I just couldn't get that to work, I think because of TypeScript's limitations in dealing with generic function types, but maybe just because I wasn't clever enough. I did verify on stack exchange that it's impossible in typescript to get the return type of the specialization of a generic function that matches a given call signature and that others have asked for that same feature. Also the current type scheme in typocomath is extremely close to the "workaround" that typescript type expert jcalz presented. So that's why I am not putting more effort into the first nicer-looking attempt, at least not at the moment -- I worry it simply can't be made to work. And anyway, even in your pocomath+ts you are defining types for implementations separately from the definitions of those implementations, so the current typocomath isn't worse in that basic way. As far as I see it, the only really substantive difference between the current typocomath and pocomath+ts is that the former trades off more intricate typings of the implementations for more uniform dependency specifications.

gwhitney commented 1 year ago

F) I guess there is one more difference: your solution to my concern about isReal was to create one overarching generic that should work for all types. But I think that should be avoided: in typocomath I have been careful that the typing of each implementation only types the instances of that operation that that implementation handles. Since isReal has different implementations for number and for Complex<T>, it should not assume a signature for all types T. The number implementation should provide just the typing for number parameters and the Complex implementation should provide just the typing for Complex<T> parameters, etc.

For example in typocomath nowhere is there any assumption or assertion that for all types T, multiply of a T and a T yields a T. Thus it should work if we wanted multiply on a Vector type to be the dot product so that multiply of two Vectors returned a number, everything should still work... we might or might not use that option but I bet there are places where the desired return types of operations do vary more than isReal so the "escape" of just making one general template won't be available. And I have tried to make the case that it's ill-advised even when available, as it's constraining future behavior in a way that may or may not turn out to be desirable.

gwhitney commented 1 year ago

Ok, that's all the thoughts I have at this round. So are you willing to try to type the latest version of typocomath (branch signature_scheme) completely according to how you envision so that the whole thing compiles and runs the dummy registration? Then we will really have two directly comparable versions for evaluating which way to go. Sorry that's some repeated work for you, sorry I didn't get more done on typocomath before you had the opportunity to jump in on pocomath. But it has been quite a struggle to support my goal of one clear syntax for specifying a dependency. Let me know if you're OK with doing that process again on typocomath. Thanks.

josdejong commented 1 year ago

👍 I'll read up tomorrow and see if I can create a typocomath variant based on the second approach.

josdejong commented 1 year ago

I think it will be helpful to schedule another video meeting to talk things through after that.

josdejong commented 1 year ago

@gwhitney OK I've pushed two new branches to typocomath. I think it is time to try to compare the different approaches, see what their pros and cons are, and understand how important these features are for both of us (that differs :) ).

I think it will work best to plan another video call to discuss the pros and cons, instead of typing more, we'll understand each other more quickly then I think.

I've named the approaches as follows:

approach1: pocomath (JS)
approach2: ImpType/Dependency (TS), branch signature_scheme
approach3: plain types (TS), branch approach3_plain
approach4: typealias (TS), branch approach4_typealias

I've started an Excel sheet where we can list features and compare how they work out:

https://docs.google.com/spreadsheets/d/1RWIOddTsaUpakJ0W0cQfFyr8WaVMphiTwXWkijzlwvc/edit#gid=0

I'm sure some of the "features" I list are not clear to you, and I'm sure the features are currently biased towards my viewpoint. Can you please add missing features that you think are relevant, fill in what features you find important, and if features are unclear, just ask me?

gwhitney commented 1 year ago

OK, I checked, and true, both approach3 and approach4 branches do compile and run all the dummy registrations. But they are not actually yet ready/appropriate for feature comparison with approach 2 as represented by branch signature_scheme or pocomath: in just the one test case I tried, they do not correctly type absquare (of course that's the one I tried first because I know from experience that it is a real typing challenge). To see this, add the following code to index.ts; it is a mockup of instantiating absquare for quaternions. (Please hold off a moment on your reaction that quaternions are an edge case that we don't "really" need to support; I believe they are actually an excellent probe as to whether we have gotten the typing "right", and all of the problems that show up there will be reflected lower in the hierarchy like at just Complex<Fraction> say and with all other templates that we want to do like Matrix<T> especially nested instances like Matrix<Complex<number>> which we definitely do want. So please grant for the moment that they are actually a very good laboratory/window into whether the typing is working. Then yes it is just icing on the cake that when the typing is all correct, we get quaternions with no additional code via Complex<Complex<number>>.)

```
import {Complex} from './Complex/type.js'
import {absquare as absquare_complex} from './Complex/arithmetic.js'

const mockRealAdd = (a: number, b: number) => a+b
const mockComplexAbsquare = (z: Complex<number>) => z.re*z.re + z.im*z.im

const quatAbsquare = absquare_complex({
   add: mockRealAdd,
   absquare: mockComplexAbsquare
})

const myabs = quatAbsquare({re: {re: 0, im: 1}, im: {re:2, im: 3}})
const typeTest: typeof myabs = 7 // check myabs is just a number

console.log('Result is', myabs)
```

Indubitably mockComplexAbsquare produces the square of the absolute value of a Complex<number>, and indubitably mockRealAdd adds up two results of mockComplexAbsquare. So I should be able to pass them as the dependencies of the absquare defined in the Complex subdirectory to get an absquare function that works on quaternions. And indeed, when I add the above lines to index.ts in branch signature_scheme, it compiles and console.logs the correct result of 14 = 0² + 1² + 2² + 3². And we know the analogous thing works in pocomath because of the unit tests. But the above code doesn't compile in branches approach3_plain or approach4_typealias.

So pardon me for being blunt but I am feeling like the only reason that you feel the typing can be done so "easily" as it appears in approach 3 or approach 4 is that you haven't yet had the opportunity to really grapple with getting a consistent core of a revised mathjs all working actually the way it needs to, in order to build up to a large-scale, consistent system. There are real typing challenges that neither approach3 nor approach4 yet addresses. And once again, please don't dismiss quaternions as irrelevant; even if we don't actually care about providing quaternion functions, they are an analogy for any nested generic -- just the easiest analogy accessible in a core of number and Complex. We will definitely want nested generics, Matrix<Complex<number>> at the very least, but I assume once generics have really been integrated into mathjs there will end up being plenty of other examples.

So how do you want to proceed? Do you want to try to get one approach or the other or both to actually type this core of functions accurately? Or would you like me to try to get something like approach4 working plausibly well?

The difficulty is that for me, approach 3 is a complete non-starter for an actual direction for mathjs; as far as I can see, it means reiterating the entire type of every implementation every place it is used, which as far as I can see will be a complete non-DRY disaster as we scale up to mathjs-size, not to mention super-tedious. So as long as I am in blunt mode, I'm personally not really interested in putting any effort into approach 3. Approach 4 is more plausible, but it still appears to have one flaw I see as key, which I will mention in the next comment.

gwhitney commented 1 year ago

My worries concerning approach 4: As far as I can see, it attempts to provide universal generic typings for all operations. I am anxious about this in two ways:

1) I am not convinced, based on experience, that TypeScript's type operations are suitable for all of the actual general relationships between input and output types of the mathematical functions we want mathjs to comprise, if we go about it this way.

2) It represents a real shift in the typing philosophy of mathjs. So far, if you look at the implicit typings of current mathjs, or at the pocomath prototype, there is nowhere an assertion that the type of multiply on two entities of the same type is again the same type. There is just the conjunction of the assertions that a number times a number is a number, a BigNumber times a BigNumber is a BigNumber, a Fraction times a Fraction is a Fraction, etc. I am very worried that abandoning this approach for the sweeping generic approach of enforcing a T times a T to be a T will put a straitjacket on us that will not actually allow us to model the mathematical realities properly. For example, it certainly means we could never have a NegativeNumber type, since a negative times a negative is a positive. Of course, maybe we would never have a Negative type, but making this choice means we couldn't. And we couldn't make the product of two vectors be a number. Again, maybe we'd always have a different operator like dotProduct, but with a decision like this we are tying our hands. And there are dozens and dozens of operators. Any one of them might run into typing issues where they don't "fit" the generic. For just one more example that comes to mind, take floor. From floor of a number being a number, and it being the trivial identity operation on bigint, we might conclude that type FnFloor = <T>(x: T) => T. But then maybe we actually want floor(BigNumber) and/or floor(Fraction) to be a bigint, to capture at the type level that the result really is an integer. I think from a typing perspective, those would be good typings. But we can't just make type FnFloor = <T>(x: T) => bigint because I think we want floor(Matrix<number>) to be either Matrix<number> or Matrix<bigint>. You've talked about having implementation freedom; I think we want the freedom to just type only the specific return types of specific implementations, and not have to worry about conforming to generic typings that are going to constrain lots of implementations. Writing (and typing) just one implementation at a time seems much more in the spirit of mathjs as it stands.

And so that is what has led me to something like approach 2: it seems to me there needs to be a way to collect up individual "pieces" of the overall typing of a multi-way, run-time-dispatched operation 'foo' and combine them into some assembly from which TypeScript is then able to extract the typing for a specific instance of 'foo' and check it is being used consistently. And the only mechanism I could find in TypeScript for "collecting up" types like that is to extend an interface using the "declare module" syntax. So that's how I've ended up at approach 2. I agree that we/I should continue to try to simplify/polish how the types of implementations are specified, but that's the only hitch I see. Using a properly typed implementation of an operation is super easy and clear.
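The "collecting up" idea can be sketched with TypeScript's interface declaration merging. This is only an illustration of the mechanism, not the code in branch signature_scheme: MultiplySignatures, MultiplyReturns and the Vector stand-in are hypothetical names, and in a real multi-file layout each piece would sit inside a declare module block rather than in one file.

```typescript
// Hypothetical stand-in for a vector type.
type Vector = number[]

// Piece contributed by a hypothetical "number" module.
interface MultiplySignatures {
  number: (a: number, b: number) => number
}

// Piece contributed by a hypothetical "vector" module. Same interface name,
// so TypeScript merges the two declarations into one interface.
interface MultiplySignatures {
  vector: (a: Vector, b: Vector) => number // dot product: Vector x Vector -> number
}

// Extract the return type of whichever collected piece accepts the given parameters.
type MultiplyReturns<Params extends unknown[]> = {
  [K in keyof MultiplySignatures]:
    MultiplySignatures[K] extends (...args: Params) => infer R ? R : never
}[keyof MultiplySignatures]

// Each implementation is typed on its own; nothing asserts "T times T is T".
const multiplyNumber = (a: number, b: number): MultiplyReturns<[number, number]> => a * b
const multiplyVector = (a: Vector, b: Vector): MultiplyReturns<[Vector, Vector]> =>
  a.reduce((sum, ai, i) => sum + ai * (b[i] ?? 0), 0)

const product = multiplyNumber(6, 7)
const dot = multiplyVector([1, 2, 3], [4, 5, 6])
console.log(product, dot)
```

Here the return type for each parameter tuple is looked up from the assembled pieces, so a vector product returning a number coexists with a number product returning a number.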

(I guess I can think of one other possible approach, namely to decide on a name mangling of operations and input signatures, and try to supply types for all of them, like add_number_number and absquare_complex_number. And then we will need some scheme that will give generic type operations for things like add_T_T or absquare_complex_T. I haven't yet seriously tried to get something like this to work because it seems to me like it needs a compile-time translation from types to literal string names, i.e. deriving 'number' from number. The other direction from strings to types is not too bad but types to strings, I don't really know if that can be done. But if we are absolutely at loggerheads on other approaches I could maybe give something like this a try.)

Anyhow, I want to make it absolutely clear that I am not being obstructionist here or trying to insist that approach2 is the only way. Remember, I previously had a scheme that extracted types directly from the implementations, and it worked at first, but I couldn't extend it to more complicated functions either because of inherent limitations in TypeScript or my lack of cleverness. And I would love it if you or me or anyone is able to come up with a DRY "direct" typing of all of the implementations of all of the operations. I just haven't seen or thought of any simpler scheme to actually do it than approach 2.

josdejong commented 1 year ago

Ok I've pushed type fixes to both branches to make the test code pass, good point. There are a couple more similar cases, we can fix that in the same way. They output 14 though, not 7, but that is an implementation issue and not related to the types.

The thing was that for absquare (and similar functions), the output is not necessarily the same type as the input (as is the case for, say, type FnAdd<T> = (a: T, b: T) => T). That can be solved by adding a second generic: type FnAbsSquare<T, U> = (a: T) => U. When having an implementation, you can use that like FnAbsSquare<number, number> and FnAbsSquare<Complex<number>, number>, etc.
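A minimal, self-contained sketch of this two-parameter typing, applied to the quaternion mockup from earlier in the thread. The absquareComplex factory here is an assumption written for the example, not the code pushed to either branch.

```typescript
type Complex<T> = { re: T; im: T }
type FnAdd<T> = (a: T, b: T) => T
type FnAbsSquare<T, U> = (a: T) => U

// Hypothetical factory: T is the component type, U is the type of the result.
const absquareComplex = <T, U>(
  dep: { add: FnAdd<U>; absquare: FnAbsSquare<T, U> }
): FnAbsSquare<Complex<T>, U> =>
  z => dep.add(dep.absquare(z.re), dep.absquare(z.im))

const addNumber: FnAdd<number> = (a, b) => a + b
const absquareNumber: FnAbsSquare<number, number> = x => x * x

// First level: Complex<number> -> number.
const absquareComplexNumber = absquareComplex({ add: addNumber, absquare: absquareNumber })

// Second level: Complex<Complex<number>> (a quaternion) -> number.
const absquareQuaternion = absquareComplex({ add: addNumber, absquare: absquareComplexNumber })

// The annotation checks that the result is typed as a plain number.
const quatResult: number = absquareQuaternion({ re: { re: 0, im: 1 }, im: { re: 2, im: 3 } })
console.log('Result is', quatResult) // 0*0 + 1*1 + 2*2 + 3*3 = 14
```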

gwhitney commented 1 year ago

OK, I will take a look. 14 is the correct answer. Setting the variable to 7 was just a type check.

gwhitney commented 1 year ago

you must not have pushed approach4, still doesn't compile for me.

josdejong commented 1 year ago

> you must not have pushed approach4, still doesn't compile for me.

Sorry, done now

josdejong commented 1 year ago

At this moment I (still) do not see the added value of approach2 over a simple approach like 3, whilst it introduces a lot of complexity. The main difference I see is that return types are automatically inferred, but in practice I found that to be a drawback too: when refactoring to approach 3 and 4, it was difficult to figure out what a dependency was supposed to return, you can't easily see that. So I doubt if that really is a positive thing.

I like approach 4 quite a lot: it is simple and straightforward, and in the spirit of TypeScript. It prevents entering dependencies wrongly (you can't enter parameters wrongly, you can't enter a non existing function, you immediately see the types of parameter and return type, at least in an IDE 😁). And it is much more DRY than approach 2 and 3, where you have to enter full signatures again and again for every dependency (like (a: T, b:T) =>T or Dependency<'add', [number, number]>).

I hope my Excel sheet can clarify some of the pros/cons.

josdejong commented 1 year ago

Shall we plan for a video call to discuss further? I think we'll understand each other faster that way.

josdejong commented 1 year ago

It represents a real shift in the typing philosophy of mathjs

That is a good point, I'll give that some thought. (the alternative can be for example approach3, or a mix of 3 and 4)

gwhitney commented 1 year ago

We can certainly do a video call next week sometime if you like. Send me some options.

But we can forget about approach 3, right? We are certainly not going to write out the full signature for each dependency every time it is used, right?

There is nothing non-DRY about the selection of a dependency in approach 2: you specify only the name of the operation you want to depend on, and the tuple of types that you want to be able to invoke that operation on. The Dispatcher needs the latter information to be able to supply your implementation with dependencies as direct, non-dispatching functions -- that is exactly where the ~9x speedup of pocomath over current mathjs comes from. And let's be honest, the only reason that you can currently get away with the FnAdd<T> in current approach 2 is that we are still doing a toy example. Actually add should be able to take just one argument, which it returns directly, and also a whole sequence of arguments, which it adds up and returns the best common type that they can all be added up/converted into. Approach 2 at least has a fighting chance to capture all that, whereas I literally have no idea how you would couch that in approach 4 (although as I said, I am willing to give it a try).

Moreover, current approach 4 with the latest update captures no relationship whatsoever between the input and output types of absquare. It will be just as happy being given an add that takes two strings to a string and an underlying absquare that takes the real type T to a string, and it will give you an absquare that takes a Complex<T> to a string, which is nonsense. And as I said, I just looked at the one example of absquare itself. The typing of the dependencies of sqrt is still wrong. It says absquare takes a Complex<T> to a T, but that isn't right; absquare takes a Complex<Complex<number>> to a number, not to a Complex<number>, and the code won't work if the underlying absolute value is complex, the formulas rely on the absolute value being real.

In other words, the current typing of absquare, with an arbitrary type U in the return position, is just about equivalent to saying that it returns any. We're not getting any additional type safety from it. And pretty much as soon as we make return types be any, then all of the functions will have to take any, and we won't be getting any advantage from TypeScript internally at all, and so we might as well revert to approach 1, a slightly revised version of pocomath. (Which frankly would be just fine with me; I feel like TypeScript is broken in a lot of ways and the pain it is causing us is not worth whatever internal safety it may eventually give us; I am happy to create a system that provides correct, reliable external TypeScript typings for mathjs for others' use, but I am really not interested in coding in TypeScript, I have only learned as much as I have about TypeScript solely for the sake of trying to move the mathjs project forward and get to the point where we can comfortably extend it to bigint, which it sorely needs I think.)

In other words, I think a big part of the reason approach 4 still feels "easy" to you is that now it is not really much constraining the implementations, so we are not really getting much type safety, the TypeScript is mostly just added baggage.
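To make the concern concrete, here is a minimal sketch of a factory typed in the style described above, with a free return type U. The factory name makeAbsSquare, the dependency object shape and the Complex<T> definition are assumptions for illustration; the point is only that the compiler accepts string-valued dependencies.

```typescript
type Complex<T> = { re: T; im: T }
type FnAdd<T> = (a: T, b: T) => T
type FnAbsSquare<T, U> = (a: T) => U

// Hypothetical factory: U is a free type parameter, unrelated to T.
const makeAbsSquare = <T, U>(dep: {
  add: FnAdd<U>
  absquare: FnAbsSquare<T, U>
}): FnAbsSquare<Complex<T>, U> =>
  z => dep.add(dep.absquare(z.re), dep.absquare(z.im))

// This type-checks, although a string-valued absquare is meaningless.
const nonsense = makeAbsSquare<number, string>({
  add: (a, b) => a + b,           // string concatenation
  absquare: x => String(x * x)    // "real" absquare returning a string
})

// Concatenates "9" and "16" instead of adding 9 and 16.
const weird: string = nonsense({ re: 3, im: 4 })
```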

josdejong commented 1 year ago

I sense a bit of frustration 😉 . Maybe good to let this rest for the weekend

I know you would be happy with just a JS pocomath implementation. And I hate fighting TS myself. Because of that, I am surprised that you're trying so hard to come up with a "perfect" TypeScript solution. I have the feeling we're over-engineering this. I would love to see a simple, pragmatic TS solution that handles like 80% of the cases. In the end, the JS pocomath core is what matters, and this will throw a runtime exception when a function signature cannot be resolved, both for JS functions and for functions inferred from TypeScript. All extra type safety we get on top of that is a bonus to me. From that point of view, I would already be happy with the "dumb" approach 3, that API is more verbose than the JS pocomath API, but not that bad in my opinion.

gwhitney commented 1 year ago

As far as I can see, the only added complexity in approach 2 over 1, 3 or 4 is the intricacy of typing the implementations so TypeScript can deal with them. But that complexity is because getting those typings right is actually hard, as I am hoping you will eventually see in trying to get approach 4 to be both correct and usefully constraining (which it is neither of at this exact moment). It is sufficiently hard that TypeScript wasn't able to do it directly from the implementations themselves (even though all of the information really is there, and TypeScript is a reasonably "mature" language). So we have to spell out manually how to figure out the return type of an operation from its input types, and it should be no surprise that for some operations that relationship is intricate, and hence tricky to specify. Let's be clear that for the bulk of "ordinary" operations that take specific concrete types, like 'combinations', the specifications of the type will just be Signature<Params, [number, number], number> for the number implementation and Signature<Params, [bigint, bigint], bigint> for the bigint implementation. The tricky specifications are just for the complicated ones, and also (at least right now) for many generics like lsolve, because TypeScript generics can't take generic parameters. So unless a better way of typing is found, lsolve will have to be something like the following (where I am currently assuming we just have a single matrix type that can have any number of dimensions, like mathjs does at the moment):

type MatrixAndElement<T> = [Matrix<T>, T]
declare module "./type" {
  interface MatrixReturn<Params> {
    lsolve: Params extends MatrixAndElement<infer T> ? Matrix<T> : never
  }
}

I totally understand and admit that's pretty intricate. I just don't at the moment know how else to capture the relationship between the input parameter types and the output types of a generic function. I mean, it should be something like

[implementation of lsolve goes here, let's say it's the function lsolve_imp for clarity]

declare module "./type" {
  interface MatrixReturn<Params> {
    lsolve: typeof lsolve_imp<infer T> extends ((...args: Params) => any) ? typeof lsolve_imp<T> : never
  }
}

but TypeScript can't infer in that position, and the current "main" branch in typocomath tried something automatic more or less along those lines and I couldn't get it to work; it was coming up with "unknown" types in places where it really seemed like it should be able to infer a more specific type. I think I tracked it down to places where TypeScript was deliberately punting on type inference by design for "pragmatic" reasons, but maybe I should revisit that scheme and try harder?
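The conditional type in the first MatrixReturn entry above can be tried in isolation. The following is a minimal sketch, assuming a simplified Matrix<T> interface (mathjs's real matrix type is richer) and a standalone alias LsolveReturn in place of the declare module augmentation; both names are hypothetical.

```typescript
// Assumed stand-in for a matrix type, only for this illustration.
interface Matrix<T> { data: T[]; size: number[] }

type MatrixAndElement<T> = [Matrix<T>, T]

// Return type of lsolve computed from its parameter tuple via `infer`.
type LsolveReturn<Params> =
  Params extends MatrixAndElement<infer T> ? Matrix<T> : never

// Resolves to Matrix<number>, so this assignment type-checks.
const solved: LsolveReturn<[Matrix<number>, number]> = { data: [1, 2], size: [2] }

// A parameter tuple that does not match the pattern resolves to never.
type Mismatch = LsolveReturn<[string, number]>
const mismatchIsNever: [Mismatch] extends [never] ? true : false = true
```

Here `infer T` sits inside a type pattern that the parameter tuple is matched against, which is a position where TypeScript does allow inference.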

gwhitney commented 1 year ago

I sense a bit of frustration

Absolutely. The frustration is entirely from my inability to communicate to you the real difficulties with something like approach 4. I keep hoping that as you grapple with trying to get approach 4 actually correct, you will reach that aha moment that getting correct typings that neither over- nor underconstrain the implementation really requires a mechanism at least roughly along the lines of approach 2. But as I said, I am willing to branch from approach 4 and make a shot at "Suppose we say that for each operation, there is going to be one overarching generic typing that all implementations will have to comply with. Then what could the typing look like?" I remain worried about that typing philosophy as I expressed above, but I am willing to give it a shot. Should I try that?

gwhitney commented 1 year ago

I would love to see a simple, pragmatic TS solution that handles like 80% of the cases.

I don't think I really know what you mean by that. As soon as you TypeScript type an implementation so that it usefully constrains that implementation, then all of its dependencies have to be typed to that same level of constraint, and all of the implementations that use it have to accept that level of constraint (or else discard the constraints with 'as' typings, but then why in the world use TypeScript if you're going to do that?) So in the "tree" (it's not quite a tree) of all implementations you need to fully type any connected component.

And if we don't TypeScript type all implementations, then we are back to also needing a pocomath-style signature typing in the names of the implementations we don't TypeScript type, and if we are going to use that, why not just use it everywhere and avoid the trouble of two systems? After all, we could enhance pocomath to have a mode in which types are continuously checked throughout, and use that for unit testing, and just have it compile away in the published bundle to avoid the overhead. That would provide every bit as much type safety internally as using TypeScript -- it would just act as a type-checking processor for our JavaScript code, which is all TypeScript is anyway. So I don't understand what an 80% solution here is. I literally am struggling my absolute hardest to get anything that compiles and correctly and usefully constrains implementations, and so far approach 2 is the only thing I have been able to come up with.
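As a rough illustration of what a runtime "checked mode" could look like, here is a minimal sketch. Nothing in this thread says pocomath has such a mode today; the helper name checked, its parameters and the TypeName list are all hypothetical, and real pocomath types go well beyond what typeof can distinguish.

```typescript
type TypeName = 'number' | 'bigint' | 'string' | 'boolean'

// Hypothetical helper: wraps fn with runtime type checks when `enabled` is true,
// and returns fn untouched otherwise (e.g. for a published bundle).
function checked<A extends unknown[], R>(
  opName: string,
  params: TypeName[],
  returns: TypeName,
  fn: (...args: A) => R,
  enabled: boolean = true
): (...args: A) => R {
  if (!enabled) return fn
  return (...args: A): R => {
    args.forEach((arg, i) => {
      if (typeof arg !== params[i]) {
        throw new TypeError(`${opName}: argument ${i} is not a ${params[i]}`)
      }
    })
    const result = fn(...args)
    if (typeof result !== returns) {
      throw new TypeError(`${opName}: result is not a ${returns}`)
    }
    return result
  }
}

const addChecked = checked('add', ['number', 'number'], 'number',
  (a: number, b: number) => a + b)
const sum = addChecked(1, 2)

// A deliberately wrong implementation: declared to return number, returns a string.
const badAdd = checked('add', ['number', 'number'], 'number',
  (a: number, b: number) => (String(a + b) as unknown) as number)

let caughtTypeError = false
try {
  badAdd(1, 2)
} catch (e) {
  caughtTypeError = e instanceof TypeError
}
```

Passing `false` for `enabled` returns the original function, which is the sense in which the checks would "compile away" outside of unit tests.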

I would already be happy with the "dumb" approach 3, that API is more verbose than the JS pocomath API, but not that bad in my opinion.

All right, I will be the bluntest yet: I am very happy to

- give approach 4 my best shot
- try to continue to hone and simplify approach 2
- give the predecessor of approach 2 one more try (where the type information would come directly from the implementations)
- do a slight revision of pocomath to make the return type specification more "natural" in most cases and go JS-only except for emitting .d.ts files
- try to brainstorm an "approach 5" like the name-mangling one I mentioned in a comment way above

but I am not interested in putting in significant hours coding in a scheme where every time you want to have a dependency on 'parse' called on a string and a ParseOptions object, returning a Node, you need to write something like (dep: {parse: (expr: string, options: ParseOptions) => Node}) => ...blah blah... That would feel totally like having made no progress from the redundancy inherent in the current mathjs, and I just won't be able to do it. Really sorry. The exact information you should need to specify is the name of the operation you want to call (parse) and the list of types you want to call it on (string, ParseOptions). I don't care what syntax that's specified in: parse(string, ParseOptions) in an object key, or Dependency<parse, [string, ParseOptions]>, or ParseImp<string, ParseOptions> or parse_string_ParseOptions_type etc. etc. But specifying the whole function type over and over again redundantly, in a way that has to be consistent in every copy, is unfortunately unacceptable to me as a place I am going to devote the kind of time I have been investing into mathjs. (Note this doesn't mean you shouldn't adopt such a scheme if in the end that's the way you think mathjs needs to go; I can of course move on to other projects/options, and you need to judge what's the best for your package.)

Thanks for understanding.
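A minimal sketch of the "name plus parameter tuple" idea, where the return type is looked up rather than written out at each use. The registry interface ReturnTypes, the alias OpName, and the stand-in types ParseOptions and ParseNode are assumptions for illustration; this is not the actual typocomath code.

```typescript
// Hypothetical stand-ins for the parser types.
interface ParseOptions { nodes?: Record<string, unknown> }
interface ParseNode { type: string; source: string }

// Hypothetical registry: each operation's return type as a function of its parameters.
interface ReturnTypes<Params> {
  add: Params extends [number, number] ? number
     : Params extends [bigint, bigint] ? bigint
     : never
  parse: Params extends [string, ParseOptions] ? ParseNode : never
}

type OpName = keyof ReturnTypes<unknown[]>

// A use site writes only the name and the parameter tuple; the return type is looked up.
type Dependency<Name extends OpName, Params extends unknown[]> = {
  [N in Name]: (...args: Params) => ReturnTypes<Params>[N]
}

const parseAndDouble =
  (dep: Dependency<'parse', [string, ParseOptions]>
      & Dependency<'add', [number, number]>) =>
    (expr: string, x: number) => ({
      node: dep.parse(expr, {}),  // typed as ParseNode
      doubled: dep.add(x, x)      // typed as number
    })

const run = parseAndDouble({
  parse: (expr, _options) => ({ type: 'SymbolNode', source: expr }),
  add: (a, b) => a + b
})
const out = run('x', 7)
```

The full signature of each operation appears once, in the registry, and every dependency mentions the operation name a single time.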

josdejong commented 1 year ago

@gwhitney I'm still thinking about whether we can come up with an approach5.

The complexity of approach2 comes from automatically inferring the return type. Suppose that we would be willing to enter the return type too in the dependency, then the solution can be as follows (without need for declare module sections):

export type Implementation<
    Name extends string,
    Params extends unknown[],
    Return extends unknown
> = (...args: Params) => Return

export type Dependency<
    Name extends string,
    Params extends unknown[],
    Return extends unknown
> = {
    [N in Name]: (...args: Params) => Return
}

export const sqrt =
   (dep: configDependency
     & Dependency<'complex', [number, number], Complex<number>>
   ): Implementation<'sqrt', [number], Complex<number> | number> => {
      // sqrt takes a single number, so its Params tuple is [number]
      if (dep.config.predictable || !dep.complex) return conservativeSqrt
      return a => {
         if (isNaN(a)) return NaN
         if (a >= 0) return Math.sqrt(a)
         return dep.complex(0, Math.sqrt(unaryMinus(a)))
      }
   }

export const multiply =
   <T>(dep: Dependency<'add', [T, T], T>
       & Dependency<'multiply', [T, T], T>
       & Dependency<'subtract', [T, T], T>
       & Dependency<'conj', [T], T>
   ) : Implementation<'multiply', [Complex<T>, Complex<T>], Complex<T>> =>
      (w, z) => {
         const mult = dep.multiply
         const realpart = dep.subtract(mult(w.re, z.re), mult(dep.conj(w.im), z.im))
         const imagpart = dep.add(mult(dep.conj(w.re), z.im), mult(w.im, z.re))
         return complex_binary(realpart, imagpart)
      }

Would that be worth thinking through in more detail?

EDIT: so that is basically just using Signature<CandidateParams, ActualParams, Returns>
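For reference, a self-contained sketch of how the multiply factory above could be instantiated for plain numbers. The Complex<T> shape and the inline object literal replacing complex_binary are assumptions; the Implementation and Dependency types are copied from the proposal.

```typescript
// Assumed shape of a complex number.
type Complex<T> = { re: T; im: T }

type Implementation<
  Name extends string,
  Params extends unknown[],
  Return extends unknown
> = (...args: Params) => Return

type Dependency<
  Name extends string,
  Params extends unknown[],
  Return extends unknown
> = {
  [N in Name]: (...args: Params) => Return
}

const multiply =
  <T>(dep: Dependency<'add', [T, T], T>
      & Dependency<'multiply', [T, T], T>
      & Dependency<'subtract', [T, T], T>
      & Dependency<'conj', [T], T>
  ): Implementation<'multiply', [Complex<T>, Complex<T>], Complex<T>> =>
    (w, z) => {
      const mult = dep.multiply
      const realpart = dep.subtract(mult(w.re, z.re), mult(dep.conj(w.im), z.im))
      const imagpart = dep.add(mult(dep.conj(w.re), z.im), mult(w.im, z.re))
      return { re: realpart, im: imagpart }
    }

// Supply the dependencies for T = number; conj is the identity on real numbers.
const multiplyComplexNumber = multiply<number>({
  add: (a, b) => a + b,
  multiply: (a, b) => a * b,
  subtract: (a, b) => a - b,
  conj: a => a
})

// (1 + 2i)(3 + 4i) = -5 + 10i
const product = multiplyComplexNumber({ re: 1, im: 2 }, { re: 3, im: 4 })
```

Each dependency here repeats the return type T explicitly, which is the trade-off the proposal makes to avoid inferring it.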

gwhitney commented 1 year ago

thanks for continuing to try to brainstorm! Not sure why we are discussing the current state here, though. I just opened https://code.studioinfinity.org/glen/typocomath/issues/6 with how I see the status of things, including a new possibility I would like to raise.