evincarofautumn opened this issue 4 years ago
Examples where I've needed transformers and arrows would not do:
As a general note to the people participating: following discussions in the GHC repo, I can see that a lot of this work is rooted in mathematical background that I do not have, or simply haven't mastered with enough proficiency. Although it may slow everyone else down (and I really hope I'm not a drag), I'd appreciate it, and think I could be a better asset in this line of work, if I understood the discussion well. A lot of this responsibility rests on me (I have to study and read), but I'd appreciate it if the concepts were explained for an audience that is not necessarily well-versed in category theory, and potentially if we had online meetings to discuss some technical details.
One more thing: I've always found the need for `proc` and `do` pretty annoying. If we are changing syntax, maybe we can change that too?
I can see that a lot of this work is rooted in mathematical background that I do not have, or simply haven't mastered with enough proficiency. Although it may slow everyone else down (and I really hope I'm not a drag), I'd appreciate it, and think I could be a better asset in this line of work, if I understood the discussion well.
Not at all, in fact I think it’s crucial that we take things slowly and make sure everyone is on the same page in terms of both theory and practical implementation within GHC—moving slowly and deliberately, sharing many examples, is how we achieve that. As a happy side effect, we might even get some high-quality documentation out of it, something that’s sadly lacking for arrows generally.
I've always found the need for `proc` and `do` pretty annoying. If we are changing syntax, maybe we can change that too?
Imagine we could magically implement whatever notation you wanted—what would it look like?
The syntax seems to need a few parts:

- `proc` introduces an arrow abstraction
- `->` separates the parameter(s) from the body
- `do` is a layout herald for the sequence of statements in the body

But there are a few possible avenues for improvement:
- Allowing a `proc` to take multiple arguments in the usual curried fashion like a lambda, e.g. `proc x y -> do …` instead of `proc (x, y) -> do …` (even if the former is desugared to the latter internally somehow)
- Replacing `proc` with `\`, generalising lambdas to `Arrow` if they use arrow command notation (`-<` and `-<<`), e.g. `\ x -> do …`
The former feels tractable to me, but probably has some subtleties that @lexi-lambda would understand better than me, being in the thick of it right now.
The latter seems like a benign surface-level change, but I get the sense it could have a lot of follow-on effects, and it’s not clear to me where to stop if we give that mouse that cookie. For example, to disambiguate lambda abstractions from arrow abstractions, we might have to figure out a consistent semantics of `-<`/`-<<` in isolation, then deal with the complicated and fiddly issues of environment scoping and the command analogue of eta-equivalence (`f` ≅ `proc x -> do f -< x`), then maybe want the addition of row types, then maybe move toward deeply integrating the Arrow and Monad hierarchies—certainly an admirable goal, but a daunting one!
I’ve been thinking about how to arrive at a robust implementation of desugaring using something like the above class hierarchy. I think the general structure of `proc` expressions can be handled straightforwardly like so:
1. Construct the dataflow graph corresponding to the `proc` expression
2. Topologically sort the graph into “stages” (`[[Component]]`), where the outer dimension is sequential composition and the inner dimension is parallel composition
3. Between each parallel component, emit `(***)` (or the equivalent with `first` and `second`)
4. Between each sequential component, emit `(>>>)`
5. Where a stage requires a different association of tuple elements, permute them with `assocl`, `assocr`, and `swap` as needed
6. Where a value is copied, insert `dup` (or `(&&&)`)
7. Where a value is dropped, insert `fst` or `snd`
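To make the emitted vocabulary concrete, here is a minimal sketch of the shuffling combinators, specialised to the function arrow `(->)` purely for illustration; in the proposed hierarchy they would be class methods rather than plain functions. Note that `assocl`/`assocr` follow the orientation used in this thread, which is reversed from some libraries.

```haskell
-- Sketch: the structural combinators the desugaring emits, specialised
-- to the function arrow for illustration only.

-- Copying (contraction)
dup :: a -> (a, a)
dup x = (x, x)

-- Reassociation, in this thread's orientation:
-- assocl rotates a left-nested pair rightward, assocr the reverse.
assocl :: ((a, b), c) -> (a, (b, c))
assocl ((x, y), z) = (x, (y, z))

assocr :: (a, (b, c)) -> ((a, b), c)
assocr (x, (y, z)) = ((x, y), z)

-- Exchange
swap :: (a, b) -> (b, a)
swap (x, y) = (y, x)

-- Dropping (weakening) is played by the Prelude's fst and snd.
```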
Here’s a worked example borrowed from the GHC user’s guide:

```haskell
proc x -> do
  y <- f -< x + 1
  g -< 2 * y
  let z = x + y
  t <- h -< x * z
  returnA -< t + z
```
I find the dataflow of such programs easier to visualise with a notation I devised for concatenative programs, a “dataflow table”, in which:

- Each box denotes a “function” (here, either a Haskell function or an arrow)
- Vertical adjacency represents sequential composition (`>>>`)
- Horizontal adjacency represents parallel composition (`***`)
- Empty boxes are the identity
(Unfortunately, this example requires odd overlapping colspans that HTML table renderers don’t support very well, so ASCII art it is…)
Below I’ve also included some additional info:

- Next to each row of the table I’ve placed the translation to arrow combinators
- Next to each edge I’ve placed the state of the “stack” between the application of each combinator
- Along edges between boxes and inside identity boxes, I’ve included variable names if they’re present in the source
- Associativity is left implicit here (i.e., the input and output of `assocl` and `assocr` look the same in the table), partly to avoid even more clutter, and partly because in a concatenative setting the associativity of the stack is ordinarily left implicit: it always associates downward, and all operations take place at the top of the stack (or are explicitly “dipped” under the top elements, analogous to `first`).
┌─x─────────────────────┐ x
│ │
│ dup │ dup
│ │
├─x─────────┼─x─────────┤ (x, x)
│ │ │
│ │ dup │ id *** dup
│ │ │
├─x─────────┼─x─┼─x─────┤ (x, (x, x))
│ │ │ │
│ (+ 1) │ │ │ arr (+ 1) *** (id *** id)
│ │ │ │
├───────────┤ x │ x │ (x + 1, (x, x))
│ │ │ │
│ f │ │ │ f *** (id *** id)
│ │ │ │
├─y─────────┤ x │ x │ (y, (x, x)) = (f (x + 1), (x, x))
│ │ │ │
│ dup │ │ │ dup *** (id *** id)
│ │ │ │
├─y─────┼─y─┼─x─┤ x │ ((y, y), (x, x)) = ( (f (x + 1), f (x + 1)), (x, x) )
│ │ │ │
. . . . assocl
. . . .
. y . y x . x . (y, (y, (x, x))) = ( f (x + 1), (f (x + 1), (x, x)) )
. . . .
. . . . id *** assocr
. . . .
. y . y x . x . (y, ((y, x), x)) = ( f (x + 1), ( (f (x + 1), x), x ) )
│ │ │ │
│ (2 *) │ swap │ │ arr (2 *) *** (first swap *** id)
│ │ │ │
├───────┼─x─┼─y─┤ x │ (_, ((x, y), x)) = ( 2 * f (x + 1), ( (x, f (x + 1)), x ) )
│ │ │ │
│ g │ (+) │ │ g *** (arr (uncurry (+)) *** id)
│ │ │ │
├───────┼─z─────┼─x─────┤ (_, (z, x)) = ( 2 * f (x + 1), ( x + f (x + 1), x ) )
│ . .
│ drop . . snd
│ . .
└───────┤ z x │ (x + f (x + 1), x)
│ │
│ swap │ swap
│ │
├─x─────┼─z─────┤ (x, z) = (x, x + f (x + 1))
│ │ │
│ │ dup │ id *** dup
│ │ │
├─x─────┼─z─┼─z─┤ (x, (z, z)) = ( x, ( x + f (x + 1), x + f (x + 1) ) )
. . .
. . . assocr
. . .
. x z . z . ((x, z), z) = ( ( x, x + f (x + 1) ), x + f (x + 1) )
│ │ │
│ (*) │ │ arr (uncurry (*)) *** id
│ │ │
├───────────┤ z │ ( x * (x + f (x + 1)), x + f (x + 1) )
│ │ │
│ h │ │ h *** id
│ │ │
├─t─────────┼─z─┤ (t, z) = ( h (x * (x + f (x + 1))), x + f (x + 1) )
│ │
│ (+) │ arr (uncurry (+))
│ │
├───────────────┤ h (x * (x + f (x + 1))) + (x + f (x + 1))
│ │
│ returnA │ id
│ │
└───────────────┘
Removing the table and stack state, leaving just the combinators, that becomes:

```haskell
dup
  >>> (id *** dup)
  >>> (arr (+ 1) *** (id *** id))
  >>> (f *** (id *** id))
  >>> (dup *** (id *** id))
  >>> assocl
  >>> (id *** assocr)
  >>> (arr (2 *) *** (swap *** id))
  >>> (g *** (arr (uncurry (+)) *** id))
  >>> snd
  >>> swap
  >>> (id *** dup)
  >>> assocr
  >>> (arr (uncurry (*)) *** id)
  >>> (h *** id)
  >>> arr (uncurry (+))
  >>> id
```
This can be simplified by using `first` and `second` instead of `(***)` and applying the arrow laws, particularly those that extract more parallelism, e.g. `(a *** b) >>> (c *** d) = (a >>> c) *** (b >>> d)`:
```haskell
dup
  >>> ((arr (+ 1) >>> f >>> dup) *** dup)
  >>> assocl
  >>> second assocr
  >>> (arr (2 *) *** first (swap >>> arr (uncurry (+))))
  >>> first g
  >>> snd
  >>> swap
  >>> second dup
  >>> assocr
  >>> first (arr (uncurry (*)) >>> h)
  >>> arr (uncurry (+))
```
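As a sanity check (mine, not from the original post), the simplified chain can be run in the plain function arrow, with placeholder choices for `f`, `g`, and `h`, and compared against a direct transcription of the original `proc` block:

```haskell
import Control.Arrow (arr, first, second, (***), (>>>))

-- Placeholder arrows standing in for f, g, h (arbitrary choices):
f, g, h :: Int -> Int
f = (* 10)
g = subtract 1  -- the dead end: its output is discarded by snd
h = (+ 100)

-- Shuffling combinators, specialised to (->), using this thread's
-- assocl/assocr orientation:
dup :: a -> (a, a)
dup x = (x, x)

assocl :: ((a, b), c) -> (a, (b, c))
assocl ((x, y), z) = (x, (y, z))

assocr :: (a, (b, c)) -> ((a, b), c)
assocr (x, (y, z)) = ((x, y), z)

swap :: (a, b) -> (b, a)
swap (x, y) = (y, x)

-- The simplified combinator chain:
pipeline :: Int -> Int
pipeline =
  dup
    >>> ((arr (+ 1) >>> f >>> dup) *** dup)
    >>> assocl
    >>> second assocr
    >>> (arr (2 *) *** first (swap >>> arr (uncurry (+))))
    >>> first g
    >>> snd
    >>> swap
    >>> second dup
    >>> assocr
    >>> first (arr (uncurry (*)) >>> h)
    >>> arr (uncurry (+))

-- Direct transcription of the proc block, for comparison:
reference :: Int -> Int
reference x = let y = f (x + 1); z = x + y; t = h (x * z) in t + z
```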
Unfortunately, I believe the path leading to the dead end `g` can’t be erased entirely, because we can’t discard its effects; but if we could, it would open further simplifications:
```haskell
dup
  >>> ((arr (+ 1) >>> f) *** dup)
  >>> assocr
  >>> first (swap >>> arr (uncurry (+)) >>> dup)
  >>> swap
  >>> assocr
  >>> first (arr (uncurry (*)) >>> h)
  >>> arr (uncurry (+))
```
Critically, through this entire process, we never had to reason “under” `arr`. The goal is that `arr` is always a black box that only appears when lifting Haskell functions into an arrow, so a `proc` expression that only connects other arrows together, with no intervening computations, won’t require `PreArrow`.

All of the “shuffling” operations that previously required `arr` are now expressed in terms of the generalised `assocl`/`assocr`, `swap`, `dup`, and `fst`/`snd`, which work on any arrow with the appropriate instances for `Associative`, `Symmetric`, `Relevance`, and `Semicartesian`, respectively. Any expression that does not use such operations will also not require the corresponding classes.
Next up, I’m still trying to work out how to desugar the more complex components of `proc` notation.

The first order of business is desugaring `if` and `case` without the needless use of `arr`. Ideally we would only need `arr` to lift an expression into the arrow for the `if` condition or `case` scrutinee, but I haven’t yet found a general method of doing so. Maybe we can draw some inspiration from selective applicative functors? In any case, I think we’ll need to extend `Cocartesian` slightly to account for it.

The second order (ha) is the desugaring of `CartesianClosed` and command abstractions.
I think the existing https://hackage.haskell.org/package/profunctor and https://hackage.haskell.org/package/category classes are all that we need.
@ivanperez-keera The way I think about it is that anyone can, and should, cargo cult the category theory textbooks on this stuff. Expertise is required to justify deviating from those definitions. The problem with arrows today, essentially, is that they deviate without a mathematically-rigorous argument why.

(e.g. a lot of the command stack stuff I don't think deserves the sugar and ad-hoc rules it already has, and would be better served with just a `Reader` arrow transformer.)
@Ericson2314: Could you expand a bit on how those classes would provide the features we need? In particular, it seems like `Profunctor` has the same problem as `arr`. Also, can you sketch out what a `Reader` arrow transformer might look like?
I think some finer-grained structure is warranted because `arr` isn’t the only reason I’ve been unable to implement `Arrow` for types where `proc` notation would’ve been valuable. Breaking up `Cartesian` in particular lets you make DSLs with different substructural rules, which is currently exceedingly painful in Haskell; linear types will help somewhat with that.

Each of the classes I proposed has a direct categorical analogue, while the existing extension either doesn’t call this out explicitly or deviates from it. We should certainly depend on existing classes like `Profunctor` where applicable, but also maybe retain existing names like `ArrowLoop` instead of `Traced` for backward compatibility. What I care about here is the simplest and most flexible structure that gets us something appreciably nicer to use.
If there’s a simpler way of handling command abstractions, I’m all for it, because the whole feature seems extremely delicate, and in fact I’m still learning how it all comes together—I figured I would just sit with it for a while and see if my right brain can find a nice pattern.
As @evincarofautumn notes, `arr` is easily definable given both `Category` and `Profunctor`. In fact, you can get away with just `Category` and `Functor`:

```haskell
arr f = fmap f id
dimap f g h = arr g . h . arr f
```
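To see this concretely, the one-line definition of `arr` can be checked at the function arrow, where `fmap` for `((->) b)` is post-composition, so `fmap f id = f . id = f`. The name `arrCF` is invented for this sketch:

```haskell
import Prelude hiding (id)
import Control.Category (Category, id)

-- arr recovered from Category and Functor alone, as described above.
arrCF :: (Category a, Functor (a b)) => (b -> c) -> a b c
arrCF f = fmap f id
```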
It would definitely be nice to get `Profunctor` and `Choice` in `base` so they can be made superclasses of `Arrow`. Then `arr` could be given a default implementation. But if you want to get rid of `arr`, you need to do something far more radical.
Also, can you sketch out what a `Reader` arrow transformer might look like?
For what it’s worth, here’s the definition we use:
```haskell
class Arrow p => ArrowReader r p | p -> r where
  askA :: p a r
  localA :: p a b -> p (a, r) b

newtype ReaderA r p a b = ReaderA { runReaderA :: p (a, r) b }

instance Arrow p => ArrowReader r (ReaderA r p) where
  askA = ReaderA (arr snd)
  localA (ReaderA f) = ReaderA (f . arr fst)
```
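As a small check that these definitions hang together, here is a self-contained version (repeating them so the snippet compiles on its own, with `(.)` taken from `Control.Category`), plus a toy arrow in `(->)` that reads a greeting from the environment; `greet` and its types are invented for illustration:

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
import Prelude hiding ((.))
import Control.Category ((.))
import Control.Arrow (Arrow, arr)

-- The ArrowReader definitions quoted above, repeated verbatim:
class Arrow p => ArrowReader r p | p -> r where
  askA   :: p a r
  localA :: p a b -> p (a, r) b

newtype ReaderA r p a b = ReaderA { runReaderA :: p (a, r) b }

instance Arrow p => ArrowReader r (ReaderA r p) where
  askA = ReaderA (arr snd)
  localA (ReaderA f) = ReaderA (f . arr fst)

-- A tiny usage in the function arrow: the environment is a greeting.
greet :: ReaderA String (->) String String
greet = ReaderA (\(name, greeting) -> greeting ++ ", " ++ name)
```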
It’s useful, though I can’t say I know what @Ericson2314 has in mind when he says

(e.g. a lot of the command stack stuff I don't think deserves the sugar and ad-hoc rules it already has, and would be better served with just a Reader arrow transformer.)

as you generally want to use control operator syntax when calling `runReaderA`, i.e. `(| runReaderA cmd |) env`. And that needs the command stack.
More generally, when you write

The way I think about it is that anyone can, and should, cargo cult the category theory textbooks on this stuff. Expertise is required to justify deviating from those definitions. The problem with arrows today, essentially, is that they deviate without a mathematically-rigorous argument why.

I am not sure what you are alluding to. I suppose it’s true that all the environment/stack-threading business is bespoke, but that’s just part of arrow notation, not arrows proper.
@evincarofautumn
Yes, as @lexi-lambda says, the Category classes avoid `arr`. `Profunctor` doesn't, but it gives one something more restrictive than `arr` and `id`, which could be useful. She also has the definition of `ArrowReader` I had in mind.
@lexi-lambda
I am not sure what you are alluding to. I suppose it’s true that all the environment/stack-threading business is bespoke, but that’s just part of arrow notation, not arrows proper.
Sorry, I was conflating the arrow class and arrow notation. (The arrow class as a grab-bag of axioms is also suspect, but that's already been discussed.)
I still think Arrow/Arrow notation today is an overly-complex dead end and a new arrows-like thing should start from scratch (though I do think your proposal improves arrows as they exist today).
as you generally want to use control operator syntax when calling
runReaderA
, i.e.(| runReaderA cmd |) env
. And that needs the command stack.
So, from your tutorial (and that's the first I've really heard of these banana brackets!), it seems like their original use case was "smuggling" data into the arguments of functions that transform arrows, since closures are not available. But one can just use `ReaderA` to do that.

Perhaps there are other use cases, developed by those who have actually used banana brackets, that I am ignorant of.
It seems like their original use-case was "smuggling" data into the arguments of functions that transform arrows since closures are not available. But one can just use `ReaderA` to do that.
Yes, and one can also do it by passing more arguments to functions. Why have abstractions at all?
The usefulness of arrow notation is that it provides a pointful, lexically-scoped embedded language that compiles to the arrow operators, the latter of which are user-specified. You can get lots of useful behaviors by swapping the arrow operators with different ones, without completely giving up on a familiar, intelligible programming model.
But arrows are not always cartesian closed, so arrow notation is distressingly first-order. The illusion breaks down as soon as you need to apply a transformation to an arrow itself: if you exit the notation and re-enter it, none of the “bindings” are in scope. So banana brackets are a syntax to specifically accommodate this “quick exiting and re-entering” so the arrow notation compiler knows to treat the sub-computation as an extension of the enclosing computation.
You can’t really get around this. You can argue the merits of the particular details of the design that was eventually decided upon, but any notation for arrows (or arrow-like things) will face this challenge. You can’t solve this inadequacy of the embedded language with any amount of cleverness in the host language because you aren’t in the host language. Banana brackets are the escape hatch back into the host language.
Yes, and one can also do it by passing more arguments to functions. Why have abstractions at all?
`ReaderA` allows:

```haskell
do a  <- foo -< ()
   xs <- bar -< ()
   runReaderA (mapA (ReaderA $ proc (a, x) -> do
     b <- baz -< x
     qux -< (a, b))) -< (xs, a)
```

Yeah, it's not the same `a`, so that's less slick, but crucially it is the same `mapA`.

I think the arrow notation's stack is not unlike a stack of reader effects. I bet somebody could make an `HReaderA` taking a type-level list and get something closer still.
I’ve been a bit stuck on the desugaring of `case` expressions with as few constraints as possible, and especially without `arr`. Just `if`/`Bool` is easy to handle: we can desugar to `x -> x -> a Bool x` or `a (Bool, (x, x)) x` (cf. Conal Elliott’s `IfCat`), avoiding the current `Arrow` requirement of `arr` in `arr (\ … -> if condition then Left … else Right …) >>> (trueBranch ||| falseBranch)`.
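Specialised to the function arrow, the second of those forms is just the uncurried conditional; a throwaway sketch, not a proposed class method:

```haskell
-- Sketch: the arr-free conditional a (Bool, (x, x)) x, specialised to
-- the function arrow. In a real hierarchy this would be a class method
-- (cf. Conal Elliott's IfCat).
ifA :: (Bool, (x, x)) -> x
ifA (c, (t, e)) = if c then t else e
```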
It seems like `case` necessarily depends on the structure of the type, though. Given a data type:

```haskell
data T = Ctor_1 Field_1_1 … Field_1_n | … | Ctor_m Field_m_1 … Field_m_n
```

a `case` expression over that type:

```haskell
case x :: T of { Ctor_1 … -> …; …; Ctor_m … -> … }
```

needs to desugar to something that selects one of a series of continuations to apply:

```haskell
choose_T
  :: a (Field_1_1, …, Field_1_n) r
  -> …
  -> a (Field_m_1, …, Field_m_n) r
  -> a T r
```

(Note that this takes the continuations as arguments so that it doesn’t require cartesian closure, which is needlessly strong for this.)
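To make the shape of `choose_T` concrete, here is a hypothetical instance of the pattern for an invented two-constructor type, specialised to the function arrow:

```haskell
-- Invented example type; choose_Shape follows the choose_T pattern:
-- one continuation per constructor, each taking that constructor's
-- fields as a tuple. It only selects, never constructs.
data Shape = Circle Double | Rect Double Double

choose_Shape
  :: (Double -> r)            -- continuation for Circle
  -> ((Double, Double) -> r)  -- continuation for Rect
  -> Shape -> r
choose_Shape onCircle onRect s = case s of
  Circle c -> onCircle c
  Rect w h -> onRect (w, h)
```

A `case` over `Shape` in `proc` notation would then presumably desugar to `choose_Shape` applied to the desugared alternatives.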
Maybe the data type itself could be desugared into sums and products (à la `Generic`), and we could use some uniform combinator to handle it, but that might not account for GADTs properly. I think the above combinator could support GADTs relatively easily: add a `forall` for any existentially quantified variables, and constraints for packaged-up dictionaries:

```haskell
data T t where
  Ctor_1 :: forall a_1_1 … a_1_x. (C_1_1, …, C_1_k) => Field_1_1 -> … -> Field_1_n -> T t_1
  …
  Ctor_m :: forall a_m_1 … a_m_x. (C_m_1, …, C_m_k) => Field_m_1 -> … -> Field_m_n -> T t_m

choose_T
  :: (forall a_1_1 … a_1_x. (C_1_1, …, C_1_k) => a (Field_1_1, …, Field_1_n) r)
  -> …
  -> (forall a_m_1 … a_m_x. (C_m_1, …, C_m_k) => a (Field_m_1, …, Field_m_n) r)
  -> a (T t) r
```

(Of course, I’m sure there are subtleties I’m completely missing.)
I’m also assuming that each continuation takes all of the fields of the corresponding constructor, and then explicitly drops those that are unused, rather than having the `case` desugar to something that only passes the matched fields, in order to be explicit about substructural rules.

Also, this only handles matching each constructor once, with no guards; it’s possible to desugar more complicated `case` expressions to this form, but that may have problems with efficiency or other things I haven’t considered yet.
This is just to get a conversation going—I plan to add examples and detail as I have time.
I often run into types that are almost, but not quite, an `Arrow`—usually because they have no implementation of `arr`, but also sometimes because they lack notions of copying, dropping, or swapping (that is, contraction, weakening, and exchange), or most recently because `Arrow` isn’t polykinded (as a result of containing `arr`). Sometimes I can work around this with a free arrow over a bifunctor—essentially building a graph with the free arrow where the nodes are annotated with my effects. But often I can’t get away with this formulation. That’s frustrating, because I really like `proc` notation and the structural guarantees I can get from it.

So I’d like to start a discussion about how we want to refactor the class hierarchy going forward, with an eye to increasing flexibility and preserving backward compatibility as much as possible, with a suitable deprecation plan—e.g. adding a language extension to desugar `proc` notation in terms of the new hierarchy. I think a good starting point is the monoidal category formulation that @cgibbard brought up in the Constraint-Based Arrow Notation proposal (rephrased slightly).
In particular, I’m interested in figuring out how to divide that up into a hierarchy that’s within a surmountable distance from the current state of affairs: for example, without abstracting over the tensor product and sum types, and just using `(,)` (as above) and `Either`.

If anyone has examples of cases that are not `Arrow`s but where `proc` notation would be really useful, please share them here, as that’s exactly the underserved market I’m hoping to address—and with luck, popularise arrow notation more broadly by making it more useful.