keean / zenscript

A trait based language that compiles to JavaScript
MIT License
42 stars 7 forks source link

Language Discussion #1

Open keean opened 7 years ago

keean commented 7 years ago

The idea is to create a simple, compact, symmetrical, typed language for generic programming that compiles to JavaScript. The initial reference is "Elements of Programming" by Alexander Stepanov and Paul McJones.

The core concept is to have parametric types, type-classes and type-families, along with maybe higher-ranked-types, higher-kinded-types, to enable type-directed generic programming.

Inspiration is from C, C++ Concepts (type safe templates), Haskell, Rust, Eff, and others.

This is different from TypeScript, because it is based on type-classes, not classical inheritance, different from PureScript because it allows imperitivity and state.

This issue is for the initial design of the language, with other issues being forked as necessary for specific topics like syntax, semantics, type system etc.

SimonMeskens commented 7 years ago

I'd like to chime in that Elements of Programming is one of my favorite tech books, it's got a very prominent space on my book shelf, next to Luca Cardelli's A Theory of Objects and Bird and Wadler's Introduction to Functional Programming (1st Edition). If a language could express the ideas in those three books combined, it might very well be my favorite language.

What will your objects look like? Have you studied Self? I highly recommend giving this short paper a read: http://sauerbraten.org/lee/ecoop.pdf

What will the solution for this problem look like in your language? (I think it's a great litmus test for languages) It shows why just type classes has some problems solved by delegation with differential inheritance (a more powerful superset of subclassing).

keean commented 7 years ago

@SimonMeskens I'm not familiar with "A Theory of Objects" but a while back I contributed to this paper https://arxiv.org/pdf/cs/0509027.pdf which shows how object types fit into a formal type system.

With objects there is a problem with dispatch in that "a.f(b, c, d)" gives special treatment to the receiving object 'a', and only does dispatch on this. With type-classes we start with a language that has no overloading. If we want to allow overloading we use a type class, and a type class can dispatch on multiple type parameters.

I will take a look at the examples in that paper, thanks.

keean commented 7 years ago

@SimonMeskens I have implemented an example from the Shark paper in Rust, as it is a similar language (imperative and has traits):

trait Animal {
    fn swim_away(&self) {
        println!("swim away");
    }
}

trait Encounter<A> : Animal where A : Animal {
    fn encounter(&mut self, other : &A);
}

struct Fish {}
struct Shark {
    healthy : bool
}

impl Shark {
    fn swallow<A>(&self, _ : &A) where A : Animal {
        println!("swallow");
    }
    fn fight<A>(&mut self, _ : &A) where A : Animal {
        self.healthy = false;
    }
}

impl Animal for Fish {}
impl Animal for Shark {}

impl Encounter<Fish> for Fish {
    fn encounter(&mut self, _ : &Fish) {}
}

impl Encounter<Shark> for Fish {
    fn encounter(&mut self, other : &Shark) where Self : Animal {
        if other.healthy {
             self.swim_away()
        }
    }
}

impl Encounter<Shark> for Shark {
    fn encounter(&mut self, other : &Shark) where Self : Animal {
        if self.healthy {
            self.fight(other);
        } else {
            self.swim_away();
        }
    }
}

impl Encounter<Fish> for Shark {
    fn encounter(&mut self, other : &Fish) where Self : Animal {
        if self.healthy {
            self.swallow(other)
        } else {
            self.swim_away();
        }
    }
}

I don't intend to use the same syntax as Rust, but it gives an idea of the structure in a language with type-classes. One key difference with JS as the target is that we have garbage collection, so we don't need to have any lifetimes, which simplifies things (although Rust correctly infers all the lifetimes in this example without any annotation), and also because of this we don't need affine types, and JS always passes by value (objects are values which are references to the object).

keean commented 7 years ago

First design questions are how much type inference should there be, and what are the consequences for syntax. Considering the simplest polymorphic function, in a Rust-like style:

fn id<A>(x : A) -> A { x }

Or Haskell style:

id :: A -> A
id x = x

The Haskell style has the advantage of working with type inference, and the Rust style has the advantage of automatically introducing scoped type variables.

Obviously there will need to be type signatures in type-class definitions, and records / module definitions.

SimonMeskens commented 7 years ago

I think that if the goal of the language is to provide a very simple, but very expressive language, you should consider going for a very simple, but expressive syntax. In that case, I think I like the Haskell syntax more than the Rust syntax. I've never been a fan of how dense the Haskell syntax gets though, so I'd aim to sit somewhere in between. I think Haskell syntax, with words instead of symbols, would work best. If you want, I could give some samples of syntax that might be fun to play around with. I think I'm going to get out Elements of Programming and look up a few good examples and try to create a syntax that is made explicitly to express those problems.

Thanks for providing the example in Rust. I think there's a possibility to introduce some concepts to make it more powerful and concise. I assume you'd prefer me to open an issue with a proposal of any specific features? Btw, in the paper I linked, multiple dispatch is more or less equal to type classes, so, as the paper shows, introducing a concept of delegation improves upon the code semantics. I've been playing around with the idea of delegation in Haskell and I think I know where to go with it. I'll move that talk to the other issue then.

shelby3 commented 7 years ago

@keean the type signatures of all exported interfaces must be explicitly declared and not inferred, else modularity (separate compilation) is impossible which is critically needed if targeting JavaScript and its dynamic runtime module importing. This also makes the interaction of certain higher-order first-class type features decidable when moving away from the origin on the Lambda Cube, as well enables to implement typeclasses in more than one way (which Haskell can't do due to its global inference).

It also enables decidability with first-class anonymous (structural) unions (disjunctions), which is a mandatory feature of the language else I won't contribute. TypeScript, N4JS, and Ceylon (which emits JavaScript) have this feature. Scala 3 (Dotty) also has it (and there is a Scala.JS compiler for Scala 2). Ceylon also doesn't support typeclasses, but Scala 2 (and presumably Scala 3) does. Inferred structural unions are useful for many paradigms and is necessary to support the JavaScript || logical operator which doesn't always return a boolean.

Declaration of type signatures within functions (also methods) should be optional and inferred if not specified. Declaration of non-exported function type signatures could perhaps also be optional and inferred.

shelby3 commented 7 years ago

@keean to add to your opening comment of this issue thread, TypeScript is intentionally unsound typing and PureScript lacks first-class anonymous structural unions. N4JS appears to be sound typing, but also lacks typeclasses.

keean commented 7 years ago

@SimonMeskens I am interested to see what the code looks like with delegates. I provided the sample in Rust because I can syntax check and run the example, there is always the problem of unchecked errors when posting hypothetical code samples.

keean commented 7 years ago

@shelby3 I agree all exported interfaces must have type signatures. For higher-order features I quite like first-class-polymorphism. I think a reasonable assumption is that everything inferred is monomorphic. If non-exported function signatures can be inferred, then a syntax with optional type annotations would seem better.

keean commented 7 years ago

@shelby3 regarding first class unions, I think one of the key features is supporting the kind of programming you want to do. I do want to suggest an idea though, and see what the consequences are. The idea is to have no concrete types at all, so all types are expressed by type variables constrained by type classes. Consider the identity function using a cross between Haskell and TypeScript notation:

id(x : a) : a = x

'A' can be any type, so we could constrain it to be a type-class using a Prolog like notation:

id(x : a) : a :- Integer a = x

This is already the union of all types that implement Integer, but we can go further and have a type-level or operator:

id(x : a) : a :- Integer a | String a = x

This has a nice synergy with JavaScript and duck-typing as we are saying that we don't care what type is passed, as long as it provides the correct interface. We can type check this nominally in TraitScript, but it reflects the underlying JavaScript that allows any object to be passed as long as it provides the right methods and properties.

The question is, does that give all the functionality that you get with union-types?

keean commented 7 years ago

@shelby3 I find myself re-thinking in the morning and finding I don't agree with what I posted yesterday. An or operator for traits does not make any sense, as it is saying the object may have either method A or method B. What we want to know inside a function is if we can safely call method A for example.

I don't think there is any need for an 'or' operator, or first class unions outside of dynamic types (aka trait-objects). I think I need to revisit the original expression problem with type-classes.

Starting with a collection of trait-objects (existential types):

c : [a] :- Integer(a) = [new A(), new B()]

Requires all objects inserted into 'c' to implement the Integer trait. Now lets say we want to print this as well, we somehow need to inject the 'Show(a)' trait into 'c'. This is actually an 'and' operation, but it occurs in an 'open' way, somewhere else in the source code, something like:

show(c : Coll) {
    foreach(c, a => {
        print a;
    })
}

negate(c : Coll) {
    foreach(c, a => {
        a = -a
    })
}

This requires anything calling 'show' has to be able to prove all elements in collection 'c' provide show. This to me suggests that what is needed is some kind of constraint inference. We can easily infer the set of constraints on each element of 'c' as being (Show a, Negate a) or something like that. The question is whether 'c' is heterogeneous or homogeneous, there are two possible type signatures:

data Coll = Coll [a]
data Coll = forall a . (Show a, Negate a) => Coll [a]

In the former case 'show' and 'negate' would need to each have the constraints on '[a]' in their type signatures directly (although they can be inferred). One option would be to allow:

data Coll = forall a => Coll [a]

To automatically accumulate the inferred type classes. But what to do on module boundaries?

shelby3 commented 7 years ago

@keean I explained to N4JS's @yoshiba why Joose and Object Algebras do not solve Wadler's Expression Problem and are not as orthogonally extensible as typeclasses.

shelby3 commented 7 years ago

@keean wrote:

@SimonMeskens wrote:

What will your objects look like? Have you studied Self? I highly recommend giving this short paper a read: http://sauerbraten.org/lee/ecoop.pdf

With objects there is a problem with dispatch in that "a.f(b, c, d)" gives special treatment to the receiving object 'a', and only does dispatch on this. With type-classes we start with a language that has no overloading. If we want to allow overloading we use a type class, and a type class can dispatch on multiple type parameters.

Also remember we discussed in May, that typeclasses (not virtualized typeclasses, i.e. Rust's trait objects) can be monomorphic at the function application use-site because the data type is not lost (as in subclassing where it is assigned to a supertype), thus avoiding the virtual method jump table lookup which otherwise stalls the CPU pipeline reducing performance by as much as to 5 - 6 times slower.

keean commented 7 years ago

@shelby3 I have created a second issue here https://github.com/keean/traitscript/issues/2 to discuss union types.

I am fully aware of monomorphisation, which is when the type is statically determined. There is also runtime polymorphism (which it what haskell uses existential types for, and Rust uses Trait objects. I prefer the existential approach for the type system, as it is actually sound as a logic).

keean commented 7 years ago

Also note the mult-method dispatch version of the Sharks code makes some other changes, introducing new types for HealthyShark and DyingShark, I will update the Rust code to reflect this way of implementing it.

shelby3 commented 7 years ago

@keean wrote:

JS always passes by value (objects are values which are references to the object).

ASM.js passes by value unboxed. For functions operating only on numbers, afaics we could support this seamlessly. Emulating other types (e.g. strings) in ArrayBuffers would require more complexity to compile and interopt, and might require more C-like syntax. We could keep in mind these sort of optimizations and whether we want and could reasonably provide this without forcing the programmer to resort to a FFI.

No need to discuss this ASM.js tangent right now.

SimonMeskens commented 7 years ago

I think you can write delegation as a monadic structure, which might be the real ticket. I also look forward to your new implementation. I think it's a cool example as it covers such a broad range of language capabilities.

shelby3 commented 7 years ago

@SimonMeskens wrote:

Thanks for providing the example in Rust. I think there's a possibility to introduce some concepts to make it more powerful and concise. I assume you'd prefer me to open an issue with a proposal of any specific features? Btw, in the paper I linked, multiple dispatch is more or less equal to type classes, so, as the paper shows,

The huge salient difference is that multi-methods is dynamically dispatching to the implementation of the subclasses of Animal at run-time, thus stalling the CPU pipeline and adversely impacting performance.

Wheres, the Rust type-class example shows that type-classes can be resolved monomorphically at compile-time, so there is no jump table lookup.

Edit: another huge difference is that type-classes don't bind the multiple interfaces in the class inheritance and instead keep interfaces and classes orthogonal.

introducing a concept of delegation improves upon the code semantics. I've been playing around with the idea of delegation in Haskell and I think I know where to go with it. I'll move that talk to the other issue then.

I think you can write delegation as a monadic structure, which might be the real ticket. I also look forward to your new implementation. I think it's a cool example as it covers such a broad range of language capabilities.

@keean wrote:

Also note the mult-method dispatch version of the Sharks code makes some other changes, introducing new types for HealthyShark and DyingShark, I will update the Rust code to reflect this way of implementing it.

Changing the interface based on the state of data type, conflates the instance of the data type with the interface. This makes orthogonal extension with for example collections of instances inefficient and boilerplate prone.

I don't think this is a paradigm that should be used.

P.S. if we do our job exceedingly well (as in demonstrate that type-classes and sound typing kick ass on subclassing), there is a remote possibility we could be designing the future replacement for EMCAScript. So performance should be high on our list of priorities, because performance is very difficult to achieve in a dynamically typed language and virtual inheritance of subclassing messes up performance. Google is looking for inspiration for their SoundScript concept of a soundly typed JS-like language in the browser. The guy leading that effort (Andreas Rossberg) is familiar with Scala, ML, etc.. Andreas commented that N4JS is "very interesting".

But I don't know if fast and incremental compilation will be achieved in the language we are contemplating. We can think about this in our design decisions if we want to.

keean commented 7 years ago

SoundScript is described here: http://www.2ality.com/2015/02/soundscript.html which sounds like JavaScript plus enforced classical objects and inheritance, and simple function typing. There's nothing about generics or code generation.

SimonMeskens commented 7 years ago

SoundScript's goal is to actually be somewhat compatible with TypeScript, I assume Google intends to make it so we don't have to compile TypeScript for Chrome anymore.

shelby3 commented 7 years ago

@keean wrote:

which sounds like JavaScript plus enforced classical objects and inheritance

Until they figure out that virtual inheritance stalls the CPU pipeline and type-classes don't. I think we have a very tiny (almost nil) chance to influence the future. Any way, we should design for performance assuming one day our language will have its own VM and not just compiled to JS, if that doesn't incur any huge trade-offs in our design constraints. Also we can think about optimizing some parts with ASM.js. I am not sure how this thought process will play out. I am trying to keep my mind open and see where the design constraints lead us. I agree the priority is on having a simple, clean language without proliferation of duplicated paradigms.

Edit: battery life (i.e. performance) on mobile is important. I could see our language potentially becoming very popular for mobile apps.

There's nothing about generics or code generation.

In the link I provided, they mentioned TypeScript as a possible starting point, but aiming to make it sound. I think they will (or maybe already have) discovered that retaining interoperability with TypeScript code while also making TypeScript sound is probably not practical. Of course I could be wrong about this.

shelby3 commented 7 years ago

From now on, in all discussion where I mean to use type-classes, I will use the word trait to indicate the interface of the type-class, since that is the name Rust chose. We might choose a different keyword or not, yet until then I will use the term trait. And not the same thing as trait objects.

@keean wrote:

I agree all exported interfaces must have type signatures. For higher-order features I quite like first-class-polymorphism.

So I assume by "first-class-polymorphism", you mean we can declare a (function, method, or class) type parameter constrain to a trait (or union of traits) bound, and then I can call functions and methods without knowing the actual type of the data which implements the trait(s)?

I presume by "first-class-polymorphism", you don't mean higher-ranked types.

I think a reasonable assumption is that everything inferred is monomorphic.

Unpacking this, I presume you mean that inference will never subsume to a subclassing subtype (with virtual inheritance) nor to a virtualized trait object (which also incurs dynamic dispatch and stalls the CPU pipeline)? And this is why we need anonymous structural union types, e.g.:

if (x) true
else 3

The inferred type is Boolean | Number.

I am presuming we will choose an "everything is an expression" grammar. Note that the following expression (not the return type of the containing function) has a type of undefined (or the bottom type):

if (true) return 1

If non-exported function signatures can be inferred, then a syntax with optional type annotations would seem better.

Agreed. Thus we need the suffixed : type and not the C/Java prefixed type declaration.

shelby3 commented 7 years ago

@SimonMeskens wrote:

I think that if the goal of the language is to provide a very simple, but very expressive language, you should consider going for a very simple, but expressive syntax. In that case, I think I like the Haskell syntax more than the Rust syntax. I've never been a fan of how dense the Haskell syntax gets though, so I'd aim to sit somewhere in between. I think Haskell syntax, with words instead of symbols, would work best.

I think I'd prefer to stay as close to JavaScript as possible in expression syntax while everything becomes an expression (optionally a statement as well). The point is to produce a language that is very easy for everyone to learn.

However, I hate noisy code (unless necessary to provide reasonable cognitive load), so I am open to adopting some function level grammar ideas from Haskell, such as:

I prefer consistency of coding style because we are in the open source reality now, where code should be readable the same to everyone. So I would prefer to enforce a chosen style.

I realize we might not agree on all of these. Let's discuss. But syntax may not be the most important discussion to have now and first.

SimonMeskens commented 7 years ago

I agree with all of those list items :+1:

shelby3 commented 7 years ago

@SimonMeskens cool we finally agree on something :heart_eyes:

SimonMeskens commented 7 years ago

Yeah, sorry if I acted like an ass on the TypeScript place, not sure why I get defensive there, I'm a pretty nice guy usually. I also don't have personal beef with you, life is far too short for that :smile:

keean commented 7 years ago

I agree with most of that, with some restrictions.

Couple of neat examples.

Pattern matching in arguments:

append x Nil = x
append x (Cons h t) = Cons h (append x t)

Guards on arguments:

find_zero Nil = false
find_zero (Cons h t) | h == 0 = true
find_zero (Cons h t) | h /= 0 = find_zero t
shelby3 commented 7 years ago

@keean regarding patterns and guards on arguments, if we are targeting JavaScript, then I am thinking we should try to have a close correspondence between our source code and the emitted JavaScript. I think this is a reasonably high priority goal, given one is testing in the browser in the wild and doesn't have source maps.

I agree your examples do provide a slight reduction in verbosity, but...

Mucking with the way functions are declared will cause cognitive dissonance for programmers who only know C/C++/Java/JavaScript languages. The first time I saw pattern matching on function argument declarations, it looked like gobbledygook. You want to make this a popular language if possible? If I go adopting this language for a mobile apps platform ecosystem I have in mind, then I may receive complaints about the programming language being too weird/obtuse. Also the pattern matching syntax looks even noisier if we match the way functions are declared with the way they are applied with mandatory parenthesis:

append<A>(x: A, list: Nil): A = x
append<A>(x: A, List<A>(head, tail)): List<A> = new List(head, append(x, tail))

Alternative:

append<A>(x: A, List(head: A, tail)): List<A> = new List(head, append(x, tail))

Compare to:

append<A>(x: A, list: List<A>): List<A> = new List(list.head, append(x, list.tail))

Is it really worth complicating it for newbies and adding another paradigm?

Alternatively we could add where but it adds another keyword (and paradigm):

append<A>(x: A, list: List<A>): List<A> = new List(head, append(x, tail))
  where head = list.head, tail = list.tail

But that is unnecessary because if the function will be multiple LOC, then we can just reuse val head = list.head instead (val being same as JavaScript const):

append<A>(x: A, list: List<A>): List<A> =
  val head = list.head, tail = list.tail
  new List(head, append(x, tail))

I think I prefer let instead of val and var instead of let (but the emitted transposed JavaScript may be confusing). JavaScript really mucked this up badly because they needed to alter the semantics of mutable var and so needed a new keyword for mutables and they choose let to represent a variable but let should be immutable. And these keywords are referring to mutability of reassignment and not to whether what they reference is immutable.

Edit: also patterned guards are going to create name mangling in the emitted JS code if you are resolving these statically.

P.S. note I prefer to write that with an inferred return (result) type:

append<A>(x: A, list: List<A>) = new List(list.head, append(x, list.tail))
shelby3 commented 7 years ago

@keean wrote:

function name overloading, I would want all overloads to have the same type, so the overloads are on data constructors.

function overloading on constructors implies pattern matching in function arguments, so I think these two are mutually exclusive. You cannot have both. I would rather have pattern matching and overloading in arguments as it leads to less indentation.

Can you unpack that? I don't understand.

c : [a] :- Integer(a)

Gobbledygook to my eyes. What does that mean?

Looks like a pattern similar to c > [a] >= Integer(a).

Please no symbol soup :-. Is that a smiley face without the smile.

keean commented 7 years ago

In the append case both arguments are lists.

It's not just the reduction in verbosity, but the lack of indenting, keeping everything lined up to the left makes for more readable code. I am not concerned about keeping close to JS, especially if native compilation or web-assembly are future targets. However I do think you have a point about people not used to the style.

If you don't have pattern match in arguments, then you can only have one function declaration for each function, no ad-hoc overloading, because you need to use a type-class to overload different argument types.

I think let for immutable assignment, and I wonder is mutable assignment is necessary, as objects are a reference you can mutate the object referenced, even if the reference itself is immutable. Perhaps we could start with only single immutable assignment and see how it goes?

keean commented 7 years ago

Overloading on function type requires a type class, so what other kind of overloading is there? You can overload on constructor patterns (like my example with append), but it sounds like we don't want that.

As for:

c : [a] :- Integer(a)

It says 'c' is a list of elements of type 'a' where 'a' is constrained to implement the type-class 'Integer'.

You would write this in Haskell:

c :: Integer a => [a]

I am not sure you could write this in Rust, except as part of a function declaration:

fn f<a>(c : List<a>) where a : Integer

However I have realised that this is using a concrete Lust type, which is not generic. We should really use a generic collection interface like this:

 fn f<a, b>(c : a) where a : Collection<b>, b : Integer

But we want to be able to type variables generically not just functions. My syntax for this would be:

c : a :- Collection(a, b), Integer(b) = ...

':-' is the Prolog inference operator.

Obviously we need a syntax that is easy to understand.

shelby3 commented 7 years ago

@keean wrote:

In the append case both arguments are lists.

The cognitive load is too high. Now I see it (below), because x is result of append so x is inferred to be a list. But it wasn't explicitly declared. Sorry most programmers don't think this way (yet). It will be a massive cognitive dissonance (even though it is second nature for you and you don't even have to think about it).

append x Nil = x
append x (Cons h t) = Cons h (append x t)

I will quote from the author of Learn You A Haskell:

I failed to learn Haskell approximately 2 times before finally grasping it because it all just seemed too weird to me and I didn't get it.

My experience was the same. It wasn't until I understood Scala and some other functional languages that Haskell clicked for me. And Monads, Applicatives, and Kleisi Arrows still give me headaches.


@keean wrote:

It's not just the reduction in verbosity, but the lack of indenting, keeping everything lined up to the left makes for more readable code.

The examples I provided are also left-justified. I don't understand.

However I do think you have a point about people not used to the style.

Thanks. If anyone can convince me that is not a priority, then please do.

I am not concerned about keeping close to JS, especially if native compilation or web-assembly are future targets.

I am! It will be our main target for a long time probably. Other targets need time to mature. JS target is free, batteries included¹. I quote ESR and his 166 IQ:

Too clever by half lands you in incident pits.

How do you avoid these? By designing to avoid failure modes. This why “KISS” – Keep It Simple, Stupid” is an engineering maxim. Buy the RTC to foreclose the failure modes of not having one. Choose a small-form-factor system your buddy Phil the expert hardware troubleshooter is already using rather than novel hardware neither of you knows the ins and outs of.

Don’t get cute. Well, not unless your actual objective is to get cute – if I didn’t know that playfulness and deliberately pushing the envelope has its place I’d be a piss-poor hacker. But if you’re trying to bring up a production mailserver, or a production anything, cute is not the goal and you shouldn’t let your ego suck you into trying for the cleverest possible maneuver.

¹ Meaning it is a widely distributed VM with a huge installed user base.


@keean wrote:

If you don't have pattern match in arguments, then you can only have one function declaration for each function, no ad-hoc overloading, because you need to use a type-class to overload different argument types.

Okay I see your point. You are assuming we will never allow functions that input concrete data types, only trait bounded type parameters (per that idea you mentioned upthread). Then a data type might be implemented for traits declared by two or more of the overloaded variants, so at the use-site it would require a cast in order to select which overloaded function in that case of ambiguity. But this doesn't mean we couldn't have function overloads, rather that ambiguities create verbosity of casts. And I don't see how patterns and guards changes that since they share the same function name. Are you proposing they won't be resolved statically?

Also functions can be overloaded by arity of arguments, not just type.

I think let for immutable assignment, and I wonder is mutable assignment is necessary, as objects are a reference you can mutate the object referenced, even if the reference itself is immutable. Perhaps we could start with only single immutable assignment and see how it goes?

How do I implement a for loop if i is not mutable?

keean commented 7 years ago

I don't think you have quite understood the overloading. A function has a type... it can only have one type, that is fundamental, otherwise you cannot know the type of the function :-)

So how do we allow function overloading, with type classes for example (in Haskell):

class Show a where
    show :: a -> IO ()

instance Show Int where
    show x = int_to_string x

instance Show Float where
    show x = float_to_string x

So show is overloaded on the type of its argument. That's what type-classes do, they allow type-safe overloading.

None of this allows functions with different arity, because that just cannot be typed. Neither Haskell nor Rust allow functions different arity even using type-classes or traits.

However you can simulate this (and Pythons named optional arguments) by passing a Record as the argument.

With for loops, we can see the loop variable as immutable single assignment inside the loop. We can see the loop body as a function, and 'i' is its single argument.

One idea I like is that the notation for an anonymous function is simply '{' ... '}'. Now consider:

for x in range(1, 4) {print x}

This is really just syntactic sugar for:

range(1, 4).foreach(function(x) {print x})
shelby3 commented 7 years ago

@keean wrote:

A function has a type... it can only have one type, that is fundamental, otherwise you cannot know the type of the function :-)

You are I guess only thinking of the way it is in Haskell (and Rust), but there is another way which I've seen in other languages such as method overloading in Java. Each duplicate function name is a different function.

Some people claim that form of overloading interacts very badly with inference. And other corner cases.

Let's not get too verbose here. If there is any disagreement on this point, please go to Rust's forum and private message me. We can resolve there instead of cluttering this thread with many back and forth.

shelby3 commented 7 years ago

@keean I am not 100% closed-minded to patterns in arguments. They are a little bit weird but not catastrophically so if put them in the more familiar C-style function declaration that I showed by example.

IMO, guards on arguments are far too noisy and make the function declaration unreadable for many of us. It is better to use a switch or match expression within the function to break up the orthogonal concepts of typing and specifying the arity and names of the arguments, versus run-time guarding the code paths. Guards aren't statically resolved (dispatched) correct?

keean commented 7 years ago

@shelby3 wrote:

You are only thinking of the way it is in Haskell, but there is another way which I've seen in other languages such as Java. Each duplicate function name is a different function.

It certainly messes with type inference if you do this, and breaks local reasoning, because new versions of a function may change program behaviour, especially combined with generics.

shelby3 commented 7 years ago

@keean wrote:

@shelby3 wrote:

You are I guess only thinking of the way it is in Haskell (and Rust), but there is another way which I've seen in other languages such as method overloading in Java. Each duplicate function name is a different function.

Some people claim that form of overloading interacts very badly with inference. And other corner cases.

It certainly messes with type inference if you do this, and breaks local reasoning, because new versions of a function may change program behaviour, especially combined with generics.

I agree, it is a can-of-worms. I hadn't thought it through entirely, especially on the point of changing program behavior far from the declaration site. Yikes! :fearful: TypeScript allows this apparently.

Hmmm. But how does it differ from a function which inputs a union of traits, e.g. Write | Show? Isn't adding another function overload equivalent to adding another item to the union for an argument of a single function? Are you claiming we can't write a function that inputs a union of trait interfaces? It seems incorrect to force the programmer to choose another boilerplate name (e.g. outputWrite and outputShow) for a function that is doing the same operation, just to differentiate the implementation of that function for two different input types. However, I agree that when multiple arguments per function are interacting in this matrix then it gets quite complex to determine which function will be chosen by in(ter)ference.

So I lean towards disallowing Java-style function overloading, but allowing unions and intersections for argument types.

Multiple arity can usually be handled with default arguments. And out-of-order defaults can be handled with named arguments on application, e.g. apply(arg6 = value).

The above is what N4JS has chosen (except I don't know if they implemented named arguments).

keean commented 7 years ago

With traits you would force the user to write a trait (type class) for a function that accepts different input types. You don't necessarily get rid of the behaviour changes, but you limit where it can occur to a trait, and it's only a problem if you allow specialisation. You can use some heuristic rules to make sure trait specialisation is localised to one file like Rust does.

To get named arguments and variable arguments, you can use a record (like a struct) with default values. You create an instance of the struct, and have to provide values for all elements of the struct that don't have defaults. You can then have a function that accepts the struct as its argument. With a bit of syntax sugar this can be made to look like Python named (optional) arguments.

shelby3 commented 7 years ago

I guess whether guards on arguments (including any guard to access the instance of union) are statically or run-time dispatched is a compiler implementation detail? In other words if the guard is the first operation in the function, then in some cases (i.e. it is statically type matched) it can be replaced with static calls directly to the leaves of the guard, rather than run-time jumps which stall the CPU pipeline (twice, once for the call on function application and again to jump to the code for leaf of the matched guard).

keean commented 7 years ago

I think guards are just 'nice' syntax. I don't think there is any difference between:

find_zero Nil = false
find_zero (Cons h t) | h == 0 = true
find_zero (Cons h t) | h /= 0 = find_zero t

is really the same as (with Python like syntax):

find_zero(x):
    if x == Nil:
        false
    else if x.head == 0:
        true
    else:
        find_zero(x.tail)

Note: there are problems with this approach, what happens if you call x.head when x == Nil ? With pattern matching this cannot happen.

shelby3 commented 7 years ago

@keean wrote:

With traits you would force the user to write a trait (type class) for a function that accepts different input types.

I don't like this. Unnecessary boilerplate to create a dummy nominal trait which is the union of two other traits. Programmer would prefer to write the anonymous structural union instead, e.g. Write | Show. You seem to be trying to find a way to not have structural unions. But the alternative is lots of boilerplate and silly names such as trait WriteOrShow. How would you even make such a trait?

shelby3 commented 7 years ago

@keean wrote:

To get named arguments and variable arguments, you can use a record (like a struct) with default values. You create an instance of the stuct, and have to provide values for all elements of the struct that don't have defaults. You can then have a function that accepts the struct as its argument. With a bit of syntax sugar this can be made to look like Python named (optional) arguments.

Why add all this boilerplate and run-time overhead? We have compilers is to remove such inefficiencies.

Because you want to unifying concepts?

keean commented 7 years ago

@shelby3 wrote:

I don't like this. Unnecessary boilerplate to create a dummy nominal trait which is the union of two other traits. Programmer would prefer to write tthe anonymous structural union instead, e.g. Write | Show. You seem to be trying to find a way to not have structural unions. But the alternative is lots of boilerplate and silly names such as trait WriteOrShow.

I think you are missing the point, this has nothing to do with WriteOrShow (and I don't think such traits make any sense, you never want 'WriteOrShow' because you don't know what methods are valid.). This was about allowing ad-hoc multi-methods. To allow 'show' to be applied to multiple different types.

If the Write trait implements 'write' and the Show trait implements 'show', if I have an 'x' such that I know it implements Write or Show, which function can I call on it, can I do "x.write" or can I do "x.show"... actually either of these can fail, which makes the program unsound. If we want a sound type system we cannot allow this.

keean commented 7 years ago

@shelby3 wrote:

Why add all this boilerplate and run-time overhead? Wwe have compilers is to remove such inefficiencies. I don't understand, what boilerplate, and what run-time overhead. All this is going to get removed by the compiler because JS does not support named arguments.

I am talking about how to 'type' such things, what the syntax is, and what the emitted JS code is not affected by how the type-system works.

shelby3 commented 7 years ago

@keean wrote:

If the Write trait implements write and the Show trait implements Show, if I have an x such that I know it implements Write or Show, which function can I call on it, can if do x.write or can I do x.show... actually either of these can fail, which makes the program unsound. If we want a sound type system we cannot allow this.

That is why we are forced by the compiler to guard the two cases with a match (or if-else) within the function. We can't call any method on x until we guard it.

Please use backticks ` as above.

shelby3 commented 7 years ago

@keean wrote:

@shelby3 wrote:

Why add all this boilerplate and run-time overhead? We have compilers is to remove such inefficiencies.

Because you want to unifying concepts?

I don't understand, what boilerplate, and what run-time overhead. All this is going to get removed by the compiler because JS does not support named arguments.

I am talking about how to 'type' such things, what the syntax is, and what the emitted JS code is not affected by how the type-system works.

Okay I understand now that you just want to unify concepts in the type system.

You can check the edit times on my comments to see I completed my edits before you posted your reply. You are picking up my comments a bit too fast, before I finish editing them.

keean commented 7 years ago

@shelby3 wrote:

That is why we are forced by the compiler to guard the two cases with a match within the function.

This is a disjoint or named (tagged) union, where you can match on the constructors. Like this:

data X = S String | I Int

match x:
    S x => // x is a String
    I x =>  // x is an Int
shelby3 commented 7 years ago

@keean wrote:

This is a disjoint or named (tagged) union, where you can match on the constructors.

It can be tag-less if it is resolved at compile-time.

keean commented 7 years ago

That's what traits are for :-), they provide compile time type-case functionality. Disjoint (tagged) sum types provide runtime type-case.

shelby3 commented 7 years ago

@keean there is one case where I want to cast my input to Write or Show to select the trait that is selected at compile-time. It is just a way of naming the function with a cast instead of putting it in the function name. And it allows one function to support many casted variants on many arguments, so don't have permutation blowup in boilerplate function names, e.g. doitShowCursorGroup, doitWriteSegmentSet, etc.. For example, without my idea then 3 arguments with 3 items per union type each, would require 27 function names and declarations instead of just one.