evincarofautumn / kitten

A statically typed concatenative systems programming language.
http://kittenlang.org/

Parallel concatenation #174

Open suhr opened 7 years ago

suhr commented 7 years ago

Let foo: a b -> x, bar: c d e -> y. Then foo ; bar is the function of type a b c d e -> x y, such that x = a b foo and y = c d e bar.
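
For intuition, a rough Haskell model of ; on unary functions, treating the two stack fragments as the two halves of a pair (par is just an illustrative name):

-- foo ; bar runs foo and bar side by side on disjoint parts of the stack.
par :: (a -> x) -> (b -> y) -> (a, b) -> (x, y)
par f g (a, b) = (f a, g b)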

Infix notation foo `F` bar can be defined as (foo ; bar) F. This allows you, for example, to define the mean value function as dup (sum) / (len). For now, infix notation just reorders f `op` g into f g op.

This is basically a saner version of forks and hooks in J.

Unresolved question: can we avoid using ndup when the number of arguments divides the arity of (f ; g ; ...)?

PS: I want to create my own little concatenative array language, so discussion is highly welcome.

evincarofautumn commented 7 years ago

This seems like it would be useful, but I don’t see a good way to make it work. It makes infix desugaring more complicated—in particular, it would depend on type information, which in turn would lock us in to requiring type signatures for definitions.

One thing I would consider is something like #166 for ordinary expressions, e.g., making dup (_ sum / _ length) equivalent to dup -> a, b; (a sum / b length). Another option is to introduce notation for “lifting” operators over functions, or simply adding a set of lifted operators, such as dup (sum ./. length).

Haskell would use (/) <$> sum <*> genericLength or liftA2 (/) sum genericLength in the (a ->) applicative, and I think the Kitten equivalent of \sum \length both_to (/) is good enough and benefits from being more explicit.
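
For reference, the applicative version as a self-contained snippet:

import Data.List (genericLength)

-- mean over the (a ->) reader applicative: feed the same list to both
-- sum and genericLength, then divide the results.
mean :: Fractional a => [a] -> a
mean = (/) <$> sum <*> genericLength

-- mean [1, 2, 3, 4] == 2.5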

suhr commented 7 years ago

The Haskell approach is very ad-hoc; it becomes complicated when functions take more than one argument. Concatenative languages have a huge advantage here: they don't require things like (.) . (.). The only thing missing is a handy way to make a composition tree. Lifting is definitely a kludge.

dup (_ sum / _ length)

So for (*) / (+) it will be 2dup (_ * _) / (_ + _)? Too much von Neumann style, to be honest.

It makes infix desugaring more complicated—in particular, it would depend on type information, which in turn would lock us in to requiring type signatures for definitions.

It requires only the arity; everything else should be straightforward. So the type signature can be just _ _ -> _, for example. Deriving the arity is indeed an interesting problem, though.

suhr commented 7 years ago

Btw, a somewhat more realistic example:

define rot: _ _ _ -> _ _
    dup cos ; sin -> cos sin    // phi
    2dup x' ; y' where
        x' = (* cos) - (* sin)
        y' = (* sin) + (* cos)

(here cos and sin are shadowed, as in OCaml)

This is not a fully point-free, non-von Neumann style definition, but it doesn't have to be. I'm not a fan of APL/J-style one-liners; I prefer simple expressions with lots of where. The thing I like in these languages is the ability to code (and to think about code) without being distracted by variables (I also like powerful array operations, but that's another story).

PS: it turns out ndup is more meaningful than I thought.

UPDATE: it can be cos (*) instead of * cos. Defining (`f` g) as nid `f` g, where n = arity_in(f) - arity_out(g), may be useful but isn't necessary.

UPDATE 2: a full point-free definition is actually nice too:

define rot: _ _ _ -> _ _
    3dup x' ; y' where
        x' = 3dup (left * cos) - (right * sin)
        y' = 3dup (left * sin) + (right * cos)
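
For reference, in conventional notation this is just the usual 2D rotation:

-- rotate the point (x, y) by the angle phi
rot :: Double -> Double -> Double -> (Double, Double)
rot x y phi = (x * c - y * s, x * s + y * c)
  where
    c = cos phi
    s = sin phi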

evincarofautumn commented 7 years ago

The only thing missing is a handy way to make a composition tree.

I agree with you there. I just don’t think lifting is necessarily awkward. As long as there’s a concise notation for it, it’s nice to have some indication that operators and functions are being used in lifted form.

We could make some progress with traits—for example, defining an instance of / for functions enables dup (\sum / \length) and dup2 (\* / \+). Instances for higher arities could be added if desired. Likewise, even though Kitten isn’t an array language, instances for combinations of lists and scalars would enable implicit mapping and zipping, e.g., list - function could mean list { dup function call (-) } map. So I could probably make this example work with only the machinery we have now:
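
In Haskell, those function instances would look roughly like this (a sketch, not a concrete proposal for Kitten):

import Data.List (genericLength)

-- Pointwise lifting of arithmetic to functions, so that f / g = \x -> f x / g x.
instance Num b => Num (a -> b) where
  f + g       = \x -> f x + g x
  f - g       = \x -> f x - g x
  f * g       = \x -> f x * g x
  abs f       = abs . f
  signum f    = signum . f
  fromInteger = const . fromInteger

instance Fractional b => Fractional (a -> b) where
  f / g        = \x -> f x / g x
  fromRational = const . fromRational

-- With these instances in scope, the lifted operators come for free:
mean :: [Double] -> Double
mean = sum / genericLength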

define std_dev (…):
  dup (dup ({ (- { dup (\sum / \length) }) \square map sum } / \length)) sqrt

(Although in reality I’d probably factor out variance, mean, and squares.)

But the many dups point to a problem: for some operator @, the expression \f @ \g could reasonably denote either \f \g both_to (@) (apply both functions to one argument and combine the results with @), or \f \g both (@) (apply f to the first argument and g to the second, then combine). You seem to prefer the latter, recovering the former with dup, but the former may be more common, and would make this example much nicer:

({ (- { \sum / \length }) \square map sum } / \length) sqrt

However, your rot example is probably better with the both interpretation.
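
In Haskell terms, the two readings are roughly (argument order here is illustrative):

-- "both_to": fan one input out to both functions, then combine.
bothTo :: (x -> y -> r) -> (a -> x) -> (a -> y) -> a -> r
bothTo op f g a = f a `op` g a

-- "both": give each function its own input, then combine.
both :: (x -> y -> r) -> (a -> x) -> (b -> y) -> a -> b -> r
both op f g a b = f a `op` g b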

suhr commented 7 years ago

I just don’t think lifting is necessarily awkward.

It is necessarily ad-hoc.

We could make some progress with traits—for example, defining an instance of / for functions enables dup (\sum / \length) and dup2 (\* / \+).

Again ad-hoc. ; works on any functions of any arity.

You seem to prefer the latter, recovering the former with dup, but the former may be more common,

The reason I prefer the latter is that ; is much more powerful than ordinary combinators. In fact, ; is not a combinator at all: it is a core operation on functions, just like concatenation/composition.


Meanwhile, I realized that ; lets you build some nice theory. I even started to write some kind of paper. I'm not good at writing papers though (or at naming programming languages), so any help is highly welcome.

evincarofautumn commented 7 years ago

I like where you’re going with that article. One thing to note is that defining id as dup drop only works for copyable types.

Your reference to the visual “generalised composition” notation makes me realise that ordinary composition and parallel composition form two distinct monoids on dataflow graphs—vertical and horizontal concatenation, respectively. These strongly remind me of both Functional Geometry and the connect and overlay combinators in An Algebra of Graphs.

As for the theory side, it seems you’ve reinvented arrows: parallel composition f ; g is Haskell’s f *** g, and sequential composition f g is f >>> g. For example, your dup (sum / length) notation is basically a nicer version of this arrow code:

{-# LANGUAGE Arrows #-}
import Control.Arrow
import Data.List (genericLength)

-- Sugared
mean = proc x -> do
  s <- sum -< x
  l <- genericLength -< x
  returnA -< s / l

-- Desugared
mean' = sum &&& genericLength >>> arr (uncurry (/))

-- In terms of (***):
dup = returnA &&& returnA
mean'' = dup >>> sum *** genericLength >>> arr (uncurry (/))

Parallel composition can also be implemented in terms of a retain stack. For example, if f : m -> n means that f consumes m inputs and produces n outputs, then:

f : a -> b
g : c -> d
f ; g : a + c -> b + d
f ; g = retain_c f restore_c g
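
A toy Haskell model of this scheme, with the stack as a list whose head is the top (all names illustrative):

-- Ops are whole-stack functions that touch only their topmost inputs.
type Stack = [Int]
type Op = Stack -> Stack

-- f ; g, where g consumes c items from the top of the stack:
-- retain g's inputs, run f on the rest, restore, then run g.
par :: Int -> Op -> Op -> Op
par c f g dat =
  let (gIns, rest) = splitAt c dat   -- retain_c
      rest'        = f rest          -- f sees only its own inputs
  in g (gIns ++ rest')               -- restore_c, then g

add, sub :: Op
add (y : x : s) = (x + y) : s
sub (y : x : s) = (x - y) : s

-- par 2 add sub [4, 3, 2, 1] == [-1, 3]: the stack "1 2 3 4"
-- becomes "3 -1", i.e. x = 1 2 add and y = 3 4 sub.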

suhr commented 7 years ago

Your reference to the visual “generalised composition” notation makes me realise that ordinary composition and parallel composition form two distinct monoids on dataflow graphs—vertical and horizontal concatenation, respectively.

That's exactly what I had in mind when talking about a composition tree. So, is it the same for graphs?

As for the theory side, it seems you’ve reinvented arrows

I'm not even surprised; arrows look like a poor man's version of a concatenative language. But concatenative languages keep everything amazingly simple.

I'm still impressed by how little you have to do to convert concatenative code into ANF. It's as if ANF were actually concatenative programming.

suhr commented 7 years ago

I implemented a toy language with this feature: https://github.com/suhr/esobsc

Also, I realized that I'm extremely bad at writing: the article is unreadable and needs rewriting. I don't know if I'll ever finish it.

; allows for a concatenative alternative to pattern matching in applicative languages. The idea is simple: you can make a function that takes primitives and returns a value of an algebraic type. For example, 2 3 (Some;id Pair) evaluates to (2 Some) 3 Pair. And there's right cancellation: you can undo Pair to get 2 Some 3, or undo Some;id Pair to get 2 3.

With cancellation, you could write something like this:

map: a Maybe (a -> b) -> b Maybe
Some;id map = apply Some
None _  map = None
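
For comparison, a rough Haskell counterpart of that map (essentially fmap on Maybe with the arguments flipped):

-- "Some;id map = apply Some": unwrap, apply the function, rewrap.
-- "None _ map = None": nothing to unwrap.
mapMaybe :: Maybe a -> (a -> b) -> Maybe b
mapMaybe (Just x) f = Just (f x)
mapMaybe Nothing  _ = Nothing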

or

(some expression) cancel id:
    Left -> print_error
    Right -> show print

I'll write more about it later.

I'm thinking about implementing a more useful language, so I want to understand recursion better. How does Kitten handle recursion, mutual recursion, and TCO? Do you have any advice about the implementation?

evincarofautumn commented 7 years ago

Yeah, pattern matching as function inversion seems to play nicely with this notation.

Kitten’s implementation is pretty simple and obvious. It uses two stacks, one for data (the Kitten stack) and one for return addresses & locals (the C stack). Calling a function pushes its return address and jumps (x86 call). If the call is in tail position (last in a definition, or last in a match/if in tail position) then the return address simply isn’t pushed (x86 jmp). When calling a quotation, its closure is unpacked onto the stack as extra arguments to the function (no-op for unboxed closures) and the call proceeds as normal (call or jmp). Nothing special needs to be done for mutual recursion. Special flow-control words such as loop (jump to start of current definition), return (jump to end of current definition), and jump (jump to quotation) can all be implemented as jmp, although I haven’t done this yet.
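
A toy model of this two-stack scheme in Haskell (illustrative only, not Kitten's actual implementation):

import Data.Maybe (fromJust)

data Instr = Push Int | Add | Call String | TailCall String

type Defs = [(String, [Instr])]

-- dat is the data stack; rets is the return stack of saved continuations.
run :: Defs -> [Instr] -> [Int] -> [[Instr]] -> [Int]
run defs code dat rets = case (code, dat) of
  (Push n : k, _)      -> run defs k (n : dat) rets
  (Add : k, y : x : d) -> run defs k (x + y : d) rets
  (Call f : k, _)      -> run defs (body f) dat (k : rets)  -- push return address ("call")
  (TailCall f : _, _)  -> run defs (body f) dat rets        -- no push ("jmp")
  ([], _)              -> case rets of
      k : rs -> run defs k dat rs                           -- implicit return
      []     -> dat
  _                    -> error "stack underflow"
  where
    body f = fromJust (lookup f defs)

-- run [("add3", [Add, Add])] [Push 1, Push 2, Push 3, Call "add3"] [] [] == [6]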

As a debugging aid, I’ve also considered keeping a “stack log” (like a stack trace) to recover information lost (optimised out) by TCO. Also, destructors don’t work very well with TCO (#156)—a call that’s syntactically in tail position might not be semantically in tail position if there’s a value in scope that has a destructor.

I believe it’s possible to get away with a single stack—and thus further optimisation opportunities such as converting to SSA form—if you impose one restriction: higher-order stack-polymorphic functions such as this cannot be recursive.

define cond<R..., S...> (R..., (R... -> S...), (R... -> S...), Bool -> S...):
  -> f, g, x;
  if (x) { f } else { g } call

This allows you to specialise them for a certain number of stack values, or always inline them, which means that the arity of every call is known statically and you don’t need to reify the stack—it can live in registers and spill to the same stack as return addresses and locals. Unfortunately this means that you can’t have something like a fold where the stack is used as an accumulator, so I’ve chosen not to impose this restriction, even though it would probably improve performance.