Expose Special Members as Library functions

crusso commented 4 years ago

Special members like a.keys(), blob.bytes(), text.char() are not very discoverable unless you RTFM. Shall we add eponymous functions to the various libs?

If so, how do we distinguish a.keys() for mutable and immutable arrays. I'm tempted to separate Vector and and Array libs instead of naming each overloaded operation apart...

@dfinity-lab/languages thoughts?

nomeata commented 4 years ago

How useful are these special members in general, compared to only exposing then via library functions?

crusso commented 4 years ago

Notationally they are useful, but since the carriers don't implement the actual corresponding object types, you can't actually abstract over them so it's just skin deep, isn't it?

I'd prefer a more general mechanism that maps type constructors to canonical modules and lets you supply the receiver as a first argument to any function (generic or not), eg. l.map(succ) for l:List and module List, but that means hard-wiring some libraries or adding some declaration to make a library canonical.

rossberg commented 4 years ago

Yes, I think they should be duplicated in the library. Also agreeing with @crusso's that a general syntax mechanism would be nice, but it's not obvious how to design that in a modular fashion.

nomeata commented 4 years ago

If they are in the library, and we don’t have a general mechanism, then why even have the special case? Why should just some very few library function have the special status of the convenient syntax? Seems oddly non-uniform to me. And I find it frustrating as a developer when some things have special status without me being able to also define things that have the same status.

Hence my suggestion to just get rid of them (and thus all the odd overloading of DotE on non-objects etc.) completely. And only allow them back as part of a general, developer-accessible mechanism.

rossberg commented 4 years ago

I don't particularly like the special status either. I'd be fine removing it if we had a nice, non-hacky design for the sugar that @crusso suggested. Suggestions?

nomeata commented 4 years ago

Just to clarify: You do not want to remove them without sugar, and have developers use Array.len(xs)?

rossberg commented 4 years ago

Yeah, I don't know. It's just too damn convenient.

Okay, here is a half-arsed half-baked idea:

Any e.x where e has primitive type T is reinterpreted as (func (y1,...,yN) { T.x(e,y1,...,yN) }), if a module named T is in scope and hosts a function x : (T, T1,...,TN) -> U.

(In most cases this can immediately be fused with the following application, avoiding the closure.)

That at least would generalise what we have now to all primitive types and to all functions provided for them. WDYT?

nomeata commented 4 years ago

Yes, that’s what I was thinking over lunch. I’d extend it to

Any e.x where e has primitive type T<t1,…,tn> is reinterpreted as (func <s1,…,sm>(y1,...,yN) { T.x<t1,…tn,s1,…,sm>(e,y1,...,yN) }), if a module named T is in scope and hosts a function x : <t1…tn,s1…sm>(T, T1,...,TN) -> U.

to allow for composite types, and

For the purpose of this rule, ?t is called Opt<t>, [t] is called Array<t> and [var t] is called Vector<t>

(or some other names for the mutable variant.)

rossberg commented 4 years ago

Right, but shouldn't that be

Any e.x where e has primitive type T<t1,…,tn> is reinterpreted as (func <s1,…,sm>(y1,...,yN) { T.x<s1,…,sm>(e,y1,...,yN) }), if a module named T is in scope and hosts a function x : <s1…sm>(T, T1,...,TN) -> U.

Or is there any benefit in "pre-inferring" t1,...,tn?

For the purpose of this rule, ?t is called Opt, [t] is called Array and [var t] is called Vector

Opt -> Option per the current lib. Vector is immutable in other langs, so perhaps VarArray?

nomeata commented 4 years ago

Or is there any benefit in "pre-inferring" t1,...,tn?

I would expect to be able to write

xs.map<B>(f) for `xs :: [A]` and `f : A -> B`

so that needs something like the pre-inferring, right?

crusso commented 4 years ago

I would call the (mutable) array libry Array, and the immutable one Vector, which is fairly natural - most folk expect Array to be mutable.

I'm not sure about pre-inferring. What if you have a non-fully parametric. specialized function like List.assoc : <K,V>(List<(K,V)>, K) -> ?V, then x.assoc(k) wouldn't type-check correctly, would it?

I think we should just require the full instantiation or omit it and rely on inference. C#'s extension methods are liberal in this way and let you extend a generic class with specialised method instance methods, which can be useful.

So:

Any e.x where e has primitive type T<t1,…,tn> is reinterpreted as (func <s1,…,sm>(y1,...,yN) { T.x<s1,…,sm>(e,y1,...,yN) }), if a module named T is in scope and hosts a function x : <s1…sm>(T<U1,...,UK>, T1,...,TN) -> U.

nomeata commented 4 years ago

That means that if I have T.x : <T1,T2>(T<T1>, …) and I have e : T<ComplexType>, then I can't write e.x<Int>(…) (i.e. only instantiate T2), and (annoyingly) have to specify e.x<ComplexType, Int>(…), althought that is quite redundant?

Compare that with what happens with classes (which we are trying to mimic here, in a way)

class Foo<T1>() {
  public func map<T2>(f : T1 -> T2) : … = …
}

then when I have a e : Foo<ComplexType>, I can call e.map<Int>(f). Because the T1 is already fixed and determined by e.

rossberg commented 4 years ago

I suspect that currying type parameters this way would be very surprising and confusing in practice. Users see the function type and assume that it describes the type arguments it accepts. Sure, we are also currying the receiver arg, but at least that is still explicitly provided in the syntax, just in an odd way. For type arguments, they would be coming out of nowhere in a much more magic way with a much less obvious relation to the written program.

It's also not clear what it would mean if the remaining type parameters are omitted. The way our inference currently works is that you either have to provide all type arguments or none. The rule you propose would create a new case that's neither.

So I think I agree with @crusso that we should avoid those hazards and keep inference in one place. As he points out, that's also more flexible in practice.

nomeata commented 4 years ago

Ok, I rest my case. Practice will show what's more natural.

I was hoping we could explain the syntax and typing of these things by making it look as if there is a (partially magic) Option class (after all, that's how developers would create types whose values have members), but that analogy doesn't carry weight, it seems.

rossberg commented 4 years ago

I see what you mean, but that analogy wouldn't work for any function where the function's polymorphism doesn't match the type's. There might even be functions on polymorphic types that actually are monomorphic.

Maybe this is more like a type class than an OO class. :)

nomeata commented 4 years ago

I see what you mean, but that analogy wouldn't work for any function where the function's polymorphism doesn't match the type's. There might even be functions on polymorphic types that actually are monomorphic.

I see. I was (naively) assuming that functions will always be Type.foo <t1,…>(x : T<t1>,…), but if that’s not the case then yes, my idea is broken.

More general: Do we really want this kind of overloading? Will it not be confusing that xs : [Int] can be used like an object, until I try to pass it to some higher-order function that expects an actual object (with the members that seem to be there)?

If we do: Note that xs behaves a bit like an object, but only in second-class uses, where it is directly deconstructed. By that reasoning it seems justifiable that xs.len also just behave a bit like a function, but only in second-class uses, and only support this in a manifest call (xs.len()), so that we don't have to worry about generating and optimizing away the eta-expanded functions?

rossberg commented 4 years ago

Yes, I'm wondering whether this shorthand should use different syntax. But there aren't any obvious contenders (I considered -> like in C. :) ), and dot is just so damn standard and convenient, and probably avoids more surprises than the lack of real object coercion would add.

I'd be fine with restricting this to applications, but it's less compositional and hence more awkward to specify and implement (in the type checker anyway).

Btw, we could go more crazy, and move further in the direction of type classes. Imagine we are saying that e.x searches for a module that has a type component Self that is a supertype of the type of e, and looks up x there. The usual problem is what to do about overlapping matches.

nomeata commented 4 years ago

but it's less compositional and hence more awkward to specify and implement (in the type checker anyway).

True. But much easier in the desugarer, because you can just replace it with the full call to Type.x(e,…). So in that sense it’s easier to specify, as it is very syntactical (no need to explain it via an anonymous function with type parameters out of thin air).

The crazy stuff sounds fun :-)

kritzcreek commented 4 years ago

Sounds kind of related to Scala implicits... https://docs.scala-lang.org/tutorials/FAQ/finding-implicits.html

The usual problem is what to do about overlapping matches.

Also how to minimize the "scary action" at a distance effect. Reordering imports in Scala can cause your program behaviour to change.

rossberg commented 4 years ago

@kritzcreek, yes, certainly. Also, Rust's impls. One difference with my suggestion is that there is no implicit application, which is the more powerful (and much more complicated) aspect of the other mechanisms.

The reordering problem is just a consequence of the overlap problem. No overlap => no dependency on ordering.

matthewhammer commented 4 years ago

The usual problem is what to do about overlapping matches.

What if the dictionaries were partially explicit, with a delimited scope?

This isn't a solution yet, but I'm imaging some construct like defaults e1 e2 where e1 is a module with some prescribed structure (?) that defines the default implementations used within the dynamic scope of e2.

Outside of such a block (lexically), there are no implicit definitions. That means this is somewhat non-compositional across distinct lexical scopes, but paying a price like this seems inherent to the problem of resolving potential overlaps.

I imagine such solutions have been considered before in the literature, but I'm new to the past type class work.

dfinity / motoko-base

Expose Special Members as Library functions #95