leonoel / missionary

A functional effect and streaming system for Clojure/Script
Eclipse Public License 2.0
630 stars 26 forks source link

code flow deduction #42

Open xificurC opened 2 years ago

xificurC commented 2 years ago

Some pseudocode:

(let [x (m/? (http-x)), y (m/? (http-y)]
  (m/? (http-z x y))

So all http-* calls are tasks and there might be some interdependency between them. Is there a simple way to run this optimally? In this example the results of x and y do not depend on each other so they could be run concurrently.

xificurC commented 2 years ago

I'm not looking for a solution where I manually rewrite it to

(let [[x y] (m/join ...)] ...)

but to a macro that rewrites the code optimally for me. Another example:

(m/? (m/sp (time (let [x (m/? (m/sleep 1000 1)) y (m/? (m/sleep 1000 2))] (+ x y)))))

takes 2 seconds but could run in 1 without any code changes if m/sp knew it

leonoel commented 2 years ago

While technically possible, I think this is out of scope. The main challenge is, you cannot be sure the optimization is safe because there can be side effects in parallel branches. Of course you can assume the code is pure, but then it's not clojure anymore, it's haskell. I find it valuable to allow arbitrary side-effects in sp blocks, to achieve that we need to inherit clojure's deterministic evaluation order and therefore everything must be sequential.

IMO the right way to tackle this problem is to write a separate macro working on a restricted DSL explicitly forbidding side-effects, and emit a combination of lets and joins matching the topological sort of the input DAG.

xificurC commented 2 years ago

I now realized I was probably a bit inspired by let-flow from manifold. Since a task is an arbitrary 2-arity function there's no simple way to reimplement this trick in missionary, an explicit m/? is required. I'm not sure if I'd be able to write a code rewriter that tackles this.

Of course you can assume the code is pure, but then it's not clojure anymore, it's haskell.

Wouldn't ?= have the same issue? As long as one is explicit about the pitfalls I don't see anything non-clojury in that. Subjectively, of course, this is your library ;)

xificurC commented 2 years ago

Re-reading it sounds like I requested sp to do this but a separate macro would actually be better.

leonoel commented 2 years ago

manifold's let-flow does two things : implicit type conversion and parallel branch detection. Both concerns are largely orthogonal.

I'm not really interested in implicit type conversion, as I don't think it's a good pattern. The task protocol doesn't define a concrete type (on purpose), so to make it work you would have to introduce a wrapper type and rewrite combinators to make them aware of this type.

Parallel branch detection basically translates each binding into a zip on the dependencies of the bound expression (this is the hardest part, you need to macroexpand the expression and detect free variables) followed by a chain to actually compute the value. The same strategy could be applied to tasks using join, but you would also need to assign each intermediate result to a dfv to allow sharing (tasks are not memoized, unlike deferreds aka futures aka promises).

TBH I find let-flow rather boring. There's no reason to restrict such a tool to only let expressions, in fact you could leverage the entire language to describe the computation graph. Function application can be used to describe parallel combination, so you really just need to provide extra operators to move one layer up or down. That's pretty much what we're currently doing at Hyperfiddle to implement Photon, here we are targeting continuous flows but in theory you could target any type as long as you're able to do flatten, parallelize and share. Here's what your example could look like :

(task (+ ~(m/sleep 1000 1) ~(m/sleep 1000 2)))  ;; returns 3 in 1s

Wouldn't ?= have the same issue? Subjectively, of course

Here's an objective argument : ?= is not redefining an existing behavior, as there's no such thing as ambiguous expressions in standard clojure. Therefore, the rule of least surprise is not broken.

xificurC commented 2 years ago

Function application can be used to describe parallel combination, so you really just need to provide extra operators to move one layer up or down. [...] in theory you could target any type as long as you're able to do flatten, parallelize and share

I don't understand this part of your explanation. But before we dig deeper I'd like to know if your out-of-scope reply meant it is out of scope for sp blocks or for the library in general. I see code that runs multiple async blocks, or doesn't because it's hard to do right. I believe missionary was born to address some of the issues of other solutions. But writing joins feels a bit low-level and messes up the code structure when the graph gets more complex.

  (let [a (sync-a)
        b (sync-b a)
        c (sync-c)
        d (sync-d b c)]
    d)
  (let [[a c] (m/? (m/join vector (async-a) (async-c)))
        b (m/? (async-b a))
        d (m/? (async-d b c))]
    d)

Even in this simple graph it takes time to untangle the dependencies and the resulting code is less readable. I know side-effect-wise the code is not the same, but often that is not an issue.

Am I moving in the wrong direction here? I don't know what are you guys up to in photon, but I'm having a hard time wrapping my head around the coding style I should be using, the patterns that should arise when using missionary.

leonoel commented 2 years ago

I'd like to know if your out-of-scope reply meant it is out of scope for sp blocks or for the library in general.

I think it's out of scope for sp blocks, and I'm still not sure if it has a place in the library. I would prefer the library to remain a restricted set of primitive operators, and I would rather keep it small. The feature we're talking about is comparatively high-level and there's more than one way to solve the problem, so my gut feeling is that it would be a better fit in a third-party toolkit.

I see code that runs multiple async blocks, or doesn't because it's hard to do right. I believe missionary was born to address some of the issues of other solutions. But writing joins feels a bit low-level and messes up the code structure when the graph gets more complex.

You're right, what missionary provides is probably too low-level to deal with complex graphs. However, I would argue it's a sound foundation to solve a large range of problems, especially this one. The missing part is a DSL able to efficiently represent the computation graph, and a mechanism to turn this representation into a combination of parallel and sequential compositions, which can leverage join and sp. I think it's a good idea to make the DSL a superset of clojure itself, but that's not the only way (and not the easiest).

dustingetz commented 2 years ago

Function application can be used to describe parallel combination

(h (f x) (g x))

The s-expression (h ) is a function call h that takes two parameters, and the parameters — if the functions are pure – cannot possibly depend on each other, and therefore (f x) and (g x) can be computed in parallel (by a sufficiently smart compiler).

Now imagine a macro:

(p/defn foo [x] (h (f x) (g x)))

p/defn is a macro that contains a custom clojure analyzer that compiles this function body to a DAG, a diamond with one input x, one output (unnamed), and x is reused. Most any Lisp expression can be compiled to a DAG in this way.

Now that the AST is a DAG, evaluate the DAG as an async stream by compiling or interpreting to missionary operations. Voilà, Reactive Clojure/Script.

From this perspective, Missionary is a reactive virtual machine with reactive assembly instructions. It has concurrency operators you need to write low-level concurrent programs that are referentially transparent. Since it's RT it is a suitable target for new kinds of abstractions that compose with higher abstraction ceiling.

The cost of this approach/philosophy is that missionary is not really a "solution" for application developers, it's for making high fidelity reactive infrastructure of the 2030s.

I think it eventually may be possible to express all this in fewer layers, but I don't know what the layers are yet.

xificurC commented 2 years ago

I would prefer the library to remain a restricted set of primitive operators, and I would rather keep it small. The feature we're talking about is comparatively high-level

Understood, makes sense.

I would argue it's a sound foundation to solve a large range of problems, especially this one

No argument there, I admire your work

The s-expression (h ) is a function call h that takes two parameters, and the parameters — if the functions are pure – cannot possibly depend on each other, and therefore (f x) and (g x) can be computed in parallel (by a sufficiently smart compiler).

Aren't we back to the "this is not Haskell argument"? Correct me if I'm wrong but I thought clojure evaluates arguments left to right and this change would break the previously mentioned principle of least surprise.

Voilà, Reactive Clojure/Script.

When someone says reactive I typically expect computation relationships a la excel formulas, clj javelin or cells etc.

From this perspective, Missionary is a reactive virtual machine with reactive assembly instructions. It has concurrency operators you need to write low-level concurrent programs that are referentially transparent. Since it's RT it is a suitable target for new kinds of abstractions that compose with higher abstraction ceiling.

The cost of this approach/philosophy is that missionary is not really a "solution" for application developers, it's for making high fidelity reactive infrastructure of the 2030s.

I think it eventually may be possible to express all this in fewer layers, but I don't know what the layers are yet.

I agree and I find this problem space very interesting, that's why I keep coming back to this library. Working on these problems must be very rewarding!

leonoel commented 2 years ago

Aren't we back to the "this is not Haskell argument"? Correct me if I'm wrong but I thought clojure evaluates arguments left to right and this change would break the previously mentioned principle of least surprise.

Yes, we're changing the evaluation order so it's not clojure anymore, that's why it has to be a new macro with new expectations from the user. In contrast, sp and ap are designed to be fully compatible with clojure, so relaxing the evaluation model would break the initial promise.

When someone says reactive I typically expect computation relationships a la excel formulas, clj javelin or cells etc.

Indeed, this is the problem space we are targeting but our syntax is slightly different : we use let bindings instead of cells and s-expressions instead of formulas, and macroexpansion just works because the syntax is the same as clojure. The key insight is, the underlying problem is similar to what you are trying to achieve, and could be solved in the exact same way. If you can describe an generic computation graph, you can evaluate it in any context as long as you can provide the right set of operators.

xificurC commented 2 years ago

that's why it has to be a new macro with new expectations from the user

Understood, thanks.

Indeed, this is the problem space we are targeting but our syntax is slightly different : we use let bindings instead of cells and s-expressions instead of formulas

This sounds similar to let-flow to me, but probably isn't. Is this the photon library we are talking about? Any code samples? Will it be open sourced? Do you need a hand? :)