mrakgr / The-Spiral-Language

Functional language with intensional polymorphism and first-class staging.
Mozilla Public License 2.0
916 stars 27 forks source link

Definitions for key terms #13

Closed typesanitizer closed 5 years ago

typesanitizer commented 5 years ago

It would be nice if the Readme could link to (externally or in an appendix) definitions of the key terms being used.

Intensional polymorphism

You mentioned on Reddit recently that Spiral's is quite different from proposals in existing papers on this topic. This makes it harder to understand what you're actually talking about, as the reader cannot look up a concrete definition. A small, focused example along the lines of - type A is intensionally something, type B is intensionally something else, and here we wrote a function that is polymorphic (works for either A and B) and here is why you can't write a similar function in your typical FP language.

If I look at this paper, the very first line says

Intensional polymorphism, the ability to dispatch to different routines based on types at run time [..]

OTOH, in the Readme

If statements in Spiral default to evaluating only one branch if their conditional is known at compile time meaning they are static by default. This is the bedrock of its intensional (structural) polymorphism

There seems to be a significant difference here in run time vs compile time.

First-class staging

I understand a little bit about what staging is from reading about MetaOCaml.

  1. What does it mean for staging to be first-class? Can you pass around stages to functions like ordinary values?

  2. Furthermore, the way I understand staging is -- you have code being generated in several stages, where some of the stages may be during runtime. OTOH, it doesn't seem like Spiral does code generation at runtime, instead it tries to inline almost everything and then does code gen. Is that a fair characterization or am I missing something?

More generally

You wrote on Reddit that it would be much easier for other languages to borrow ideas from Spiral given the amount of time and effort you have put in developing them. I feel that one of the problems with that is -- when I'm going through your Readme, I barely understand what you're talking about, despite trying my best to look up papers/definitions/examples elsewhere when I don't understand something. Having a small glossary of definitions + small isolated examples (which address exactly 1 language feature at a time) would be immensely helpful.

mrakgr commented 5 years ago

I'll make a bigger reply tomorrow covering the rest of the points, right now I am tired after a day of programming so I need my rest.

Let me just make a comment on intensional polymorphism which does dispacting on types on runtime as per paper you've linked to. I saw such papers while I was doing my research and basically the reason why you've never seen a real language with intensional polymorphism is because the kind of typing this spearheads leads to great inefficiencies at runtime. Having to pass around type information at runtime is essentially what dynamic languages do and misses the point of why one would want an intensional type system in the first place.

In my view intensional type system only start to make sense when you combine them with partial evaluation/inlining/specialization - the 3 terms are similar enough that they could be fused into a single one without loss of understanding.

Spiral intensional typing comprises of 3 main features of Spiral which are:

Without any of those 3 the language would break down and be useless.

Hence, I would go a bit further and claim that whatever the definition of intensional polymorphism is in that paper should be phased out of usage as it is useless. In that sense the current state of things is really convenient as nobody at this point in time knows what it is anyway so potential users and curious onlookers of Spiral will not have to waste time unlearning whatever definition is in that paper.

Strictly speaking, intensional polymorphism only relates to typing so you could have IP without joint points and inlineables, but a lot of good would come out if the association to the those two features gets made. They are two fundamental building blocks that barely anyone is aware even exist.

typesanitizer commented 5 years ago

partial evaluation/inlining/specialization - the 3 terms are similar enough that they could be fused into a single one without loss of understanding

IMO, the three terms are fairly distinct. You can have a system which just allows inlining (e.g. most Haskell code), both inlining and specialization (e.g. Rust) or something with all three.

Spiral intensional typing comprises of 3 main features of Spiral which are:

  • precise tracking of types at compile time
  • first class functions with inlining guarantees
  • join points
  1. Arguably any statically typed language worth its salt provides "precise tracking of types at compile time". Of course, one can argue of how precise it is (e.g. most languages don't track effects). What I'm trying to point out here is that the work "precise" is insufficient to convey what you're trying to convey :stuck_out_tongue:.
  2. I don't understand how you can provide inlining guarantees for arbitrary recursive functions which could take an input at run time? For most languages, you have an INLINE pragma or equivalent which works whenever it makes sense (i.e. you don't have a cycle in your call graph where all functions are marked INLINE).

More generally, I fail to see why inlining should be at all related to propagation of type information. Your partial evaluator can still make function calls -- no need to inline the body and cause code bloat.

Hence, I would go a bit further and claim that whatever the definition of intensional polymorphism is in that paper should be phased out of usage as it is useless. In that sense the current state of things is really convenient as nobody at this point in time knows what it is anyway so potential users and curious onlookers of Spiral will not have to waste time unlearning whatever definition is in that paper.

I think you're misunderstanding the paper. The definition is in the very first line of the abstract, which most likely means that it is the accepted definition in the PL community. For example, Sixten has something similar to (but not exactly) intensional polymorphism, hence it uses a distinct term "type representation polymorphism". Perhaps it would be less confusing if you used an alternate term?

Furthermore, the paper actually discusses intensional polymorphism in the presence of type erasure. From page 2 (emphasis mine)

In this paper we propose a typed calculus, called lambda_R, that ameliorates the problems of type passing without sacrificing intensional type analysis. If run-time type dispatch is to be supported, then clearly on some level types must be passed. The fundamental idea behind our approach is to construct and pass terms that represent types instead of the types themselves. The connection between a type tau and its term representation e is made in the static semantics by assigning e the special type R(tau). Semantically, we may interpret R(tau) as a singleton type that contains only the representation of tau.

More generally, I fail to see how you can avoid passing types (or terms that represent types) at run time in Spiral if you truly have first class types. Consider a function like the following (I don't properly know your syntax, consider this an approximation):

// foo : int -> [type]
let foo 0 = []
let foo n = Cons (if n % 2 == 0 then int else string) (foo (n - 1))
inl stringify = function | _: int -> "int" | _ : string -> "string"
dyn 10 |> foo |> map stringify |> print

Overall, I still don't understand what your definition of "intensional typing" or "intensional polymorphism" looks like. It would be nice if you could write one down when you have some time to spare. :smile:

mrakgr commented 5 years ago

Arguably any statically typed language worth its salt provides "precise tracking of types at compile time". Of course, one can argue of how precise it is (e.g. most languages don't track effects). What I'm trying to point out here is that the work "precise" is insufficient to convey what you're trying to convey.

Consider something like let f g x = g x in F#. When passing function g as an argument that function will have to be converted into closure which is a heap allocated object. This is unless f is inlined. Alternatively f could be specialized to g meaning that g itself is inlined. Spiral just has that as part of is language semantics. If you want to specialize f to g you would use a join point. Alternatively it is easy to inline f. Inlining f would be easy to do even in F# by writing let inline f g x = g x.

The difference between Spiral and F# is that Spiral would take that as a promise even if f is passed as an argument into a different function. This is what is meant precise tracking of types. Whether f is inlined depends on f itself.

// foo : int -> [type]
let foo 0 = []
let foo n = Cons (if n % 2 == 0 then int else string) (foo (n - 1))
inl stringify = function | _: int -> "int" | _ : string -> "string"
dyn 10 |> foo |> map stringify |> print

Here is how this could be rewritten in Spiral.

inl rec foo n = 
    assert (lit_is n) "n must be known at compile time."
    assert (n >= 0) "n must greater or equal to zero."
    if n = 0 then ()
    else (if n % 2 = 0 then int32 else string) :: foo (n - 1)

inl stringify = function _: int32 -> "int" | _ : string -> "string"
foo 10 |> Tuple.map stringify |> Console.writeline

When compiled this results in the following F# code (plus a tiny bit of boilerplate which I've ommitted).

System.Console.WriteLine("[int, string, int, string, int, string, int, string, int, string]")

You can't actually push n to runtime (using dyn) and have this work like you've done in your own example. For dealing with types at runtime one would use union types. Spiral has them for that reason much like other ML-styled languages.

What I am about to describe is something Spiral cannot do, but it might be possible to specialize at runtime in a Spiral-like language with a JIT. It would just be another kind of join point that calls the compiler and passes it the correct literals. It would be hard to do and I do not see much point in expending effort on this given that there is so much existing languages leave on the table as it is. It would require a platform accomodating to this kind of thing. .NET's JIT is a black box and it is not suited for this kind of thing. .NET is hard to deal with in general. It is a bureaucratic nightmare and I was rather disappointed that I had to resort to macros for .NET interop. I expected that with C++, but not .NET.

I don't understand how you can provide inlining guarantees for arbitrary recursive functions which could take an input at run time? For most languages, you have an INLINE pragma or equivalent which works whenever it makes sense (i.e. you don't have a cycle in your call graph where all functions are marked INLINE.

Typechecking is not done by constraint solving in Spiral which is what languages usually use for that. Rather it is inexorably fused with partial evaluation. Because that operates by pushing information forward there is no reason to throw away information known at compile time like regular typecheckers do. It is a different paradigm and a way of thinking how typing should be done.

More generally, I fail to see why inlining should be at all related to propagation of type information. Your partial evaluator can still make function calls -- no need to inline the body and cause code bloat.

As you can see with the example I've shown you there are some restrictions on that. In order to avoid inlining everything and support recursion is what join points are for. There are examples of them in the Spiral's manual if you are interested.

I'll be happy to answer any questions you might have about them.

Let me get to the other points regarding staging and definitions of IP.

mrakgr commented 5 years ago

What does it mean for staging to be first-class? Can you pass around stages to functions like ordinary values?

'staging' means 'deferred for later'. Personally, I hate the macro-styled brand of staging that Ocaml uses. The way Spiral does it is that it replaces all those staging annotations with vanilla functions and function applications. This has the notable effect of making code much more readable.

Once those stages are there in code, Spiral's main partial evaluation pass then collapses them from compile time to a single stage of runtime.

Furthermore, the way I understand staging is -- you have code being generated in several stages, where some of the stages may be during runtime. OTOH, it doesn't seem like Spiral does code generation at runtime, instead it tries to inline almost everything and then does code gen. Is that a fair characterization or am I missing something?

Spiral does all of its staging operations at compile time. Let me just add that it does not try to inline everything. It inlines what it is instructed to. It is up to the user to be responsible with join points and union types.

You wrote on Reddit that it would be much easier for other languages to borrow ideas from Spiral given the amount of time and effort you have put in developing them. I feel that one of the problems with that is -- when I'm going through your Readme, I barely understand what you're talking about, despite trying my best to look up papers/definitions/examples elsewhere when I don't understand something. Having a small glossary of definitions + small isolated examples (which address exactly 1 language feature at a time) would be immensely helpful.

It is pointless to look up the stuff you would find in Spiral's manual elsewhere as the concepts it introduces are completely fresh. If you want to learn to skate then reading a book on skating will only do you so much good. You would do much better to install Spiral and try it out for yourself for a bit. That should clear up a lot of confusion.

As I said, I am always here to help if you get stuck somewhere. For me, helping others use Spiral would give me valuable feedback in how to improve the documentation.

When I wrote the documentation it was more from the point of what would have been helpful to the me of one year ago. I do not know how other people would approach learning the language. I do think I haven't done enough in the documentation and it feels more like a salesman's pitch at times. Nonetheless I spent a whole month writing it and got very little feedback to motivate me for my effort. There is definitely a lot of it, so it is not like users will be flying blind into the language.

I think the ideal way to try and learn it would be to clone the repo and then try out the examples from the manual one by one until you come to something you do not understand. Then you should just ask me about it. I'll respond and hopefully have something to add to the documentation.

Overall, I still don't understand what your definition of "intensional typing" or "intensional polymorphism" looks like. It would be nice if you could write one down when you have some time to spare.

My philosophy is to align my views enough in the correct direction and let code take care of the rest. I am satisfied with the current state of things...or rather I am most interested in machine learning at the moment than refining terminology. The terms are there more as priming mechanisms to help the user while he goes through the examples.

If you deeply care about what intensional polymorphism precisely means in the context of Spiral I invite you to come up with your own view and elaborate on it yourself. If it fits well we can adopt it.

I think you're misunderstanding the paper. The definition is in the very first line of the abstract, which most likely means that it is the accepted definition in the PL community. For example, Sixten has something similar to (but not exactly) intensional polymorphism, hence it uses a distinct term "type representation polymorphism". Perhaps it would be less confusing if you used an alternate term?

Sixten seems interesting, I've never heard about it before. I'll take a look at the language and the paper later.