jameshaydon / lawvere

A categorical programming language with effects
MIT License

Feature Request: Lambda Calculus Notation #9

Open svenkeidel opened 3 years ago

svenkeidel commented 3 years ago

Hi James, thanks for creating this language, this is a great idea!

I would like to propose a new feature. Currently, functions written in point-free style are quite hard to read:

ar squarePlusOne : Int --> Int = identity * identity + 1
ar squarePlusOne' : Int --> Int = * + 1

The code would be much easier to read, if you could use lambda calculus notation:

ar squarePlusOne : Int --> Int = λx. x * x + 1

There is a systematic translation from lambda terms to arrows in a CCC (e.g. https://github.com/svenkeidel/sturdy/blob/backwards-analysis/categories/src/Plugin/Categories.hs#L97-L150). For the example of squarePlusOne, the result of the translation is something equivalent to the original point-free version, although, probably less optimized.
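
For illustration, here is roughly what such a translation produces for squarePlusOne, written as a sketch with plain Haskell functions and Control.Arrow combinators rather than Lawvere syntax:

import Control.Arrow ((&&&))

-- point-free result of translating λx. x * x + 1 (before any optimization)
squarePlusOne :: Int -> Int
squarePlusOne = uncurry (+) . ((uncurry (*) . (id &&& id)) &&& const 1)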

Let me know what you think.

mstewartgallus commented 3 years ago

Edit: don't mistake my enthusiasm for saying you must do this. I just really, really like/hate this particular subject. Also, a link back to where I came from: https://twitter.com/evincarofautumn/status/1362089922153259016

Going to put down my thoughts from prototyping a bunch of categorical compiler IRs.

The straightforward compiling to categories translation bloats the code a lot. What you want to do is iteratively remove variables, something like:

data Vars k a b = Vars { out :: k a b, removeVar :: Var a -> Maybe (Vars k a b) }

This way you don't plumb variables through places you don't need.

Also for a cleaner language you'd want to base things off http://www.kurims.kyoto-u.ac.jp/~hassei/papers/ctcs95.html "Decomposing typed lambda calculus into a couple of categorical programming languages".

Straightforwardly, in a tagless-final style, you'd have:

class (Category k, HasUnit k) => Kappa k where
  kappa :: (k () a -> k b c) -> k (a, b) c
  lift :: k () a -> k b (a, b)

class (Category k, HasUnit k) => Zeta k where
  zeta :: (k () a -> k b c) -> k b (a -> c)
  pass :: k () a -> k (a -> b) b 
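
Just as a sanity check on the shapes, here's a degenerate instance at plain Haskell functions (only a sketch: it assumes the two classes above are in scope, and that HasUnit has no methods):

import Control.Category (Category)

class Category k => HasUnit k   -- assumed: no methods needed for this sketch

instance HasUnit (->)

instance Kappa (->) where
  kappa f (x, b) = f (const x) b   -- feed the first component in as a "point"
  lift x b       = (x (), b)

instance Zeta (->) where
  zeta f b = \x -> f (const x) b   -- abstract over the "point"
  pass x g = g (x ())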

Kappa calculus is almost cartesian; you need to add a point "k (a, ()) a" for some reason. If you do a two-category, call-by-push-value style, you have something like:

type F x = (Initial, x)
class (Category eff, Category pure) => Cbpv eff pure where
  thunk :: eff (F a) b -> pure a (U b)
  force :: pure a (U b) -> eff (F a) b

  unit :: pure a ()
  pop :: (pure () a -> eff b c) -> eff (a, b) c
  push :: pure () a -> eff b (a, b)

  zeta :: (pure () a -> eff b c) -> eff b (a -> c)
  pass :: pure () a -> eff (a -> b) b

See also https://ncatlab.org/nlab/show/call-by-push-value

Thunks seem to be positive functions but I'm still working that out.

You would have a type of closed terms.

newtype Closed k a b = Closed (Kappa k => k a b)

and/or a type of free instances

data Free v a b where
  Var :: v a b -> Free v a b

  Pop :: (v () a -> Free v b c) -> Free v (a, b) c
  -- etc..

that implements the type classes.

The dependently typed generalization is something like:

x : 1 -> a, env |- f : b x -> c
-------------
env |- kappa f : (exists x : 1 -> a, b x) -> c

env |- e0 : (exists x : a, b x) -> c e1, env |- e1 : a
--------
env |- e0 (lift e1) : c e1

Having the lift/pass take two arguments instead of just using the unit/counit is cleaner for dependent types. Otherwise the notation gets really messy.

x : 1 -> a, env |- f : b -> c x
-------------
env |- zeta f : b -> forall x : 1 -> a, c x

Also note that sum types and cocartesian categories are straightforwardly just the opposite-category version of all this.

x : 0 -> a, env |- f : c -> b
-------
env |- cokappa f : c -> a + b

env |- x : 0 -> a
------
env |- colift x : a + b -> b 

You can kinda interpret these as labels but tbh they're kind of opaque. I'm trying to figure out the difference between positive/negative representations to see if they might be cleaner. You can probably compile the cokappa/cozeta stuff to CPS/Prolog-like stuff but I haven't figured out the details. You might also look into "ML is not not ML" if you're interested in cocartesian categories, but I couldn't really understand it.

Positive tuples/negative functions are basically based on an adjunction:

Hom (a, b) c <-> (Hom 1 a -> Hom b c)
Hom b (a -> c) <-> (Hom 1 a -> Hom b c)
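
To make those concrete, here is a plain-Haskell sketch with 1 taken to be (); the names are just made up for illustration:

tupleTo :: ((a, b) -> c) -> ((() -> a) -> b -> c)
tupleTo f x b = f (x (), b)

tupleFrom :: ((() -> a) -> b -> c) -> (a, b) -> c
tupleFrom g (a, b) = g (const a) b

expTo :: (b -> (a -> c)) -> ((() -> a) -> b -> c)
expTo f x b = f b (x ())

expFrom :: ((() -> a) -> b -> c) -> b -> (a -> c)
expFrom g b a = g (const a) b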

But these need to be parametric functions/an indexed category because otherwise you get badness.

Negative tuples/positive functions are basically:

Hom a b <-> Hom (1, a) b
Hom a b <-> Hom 1 (a -> b)

I think.
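
For plain functions with 1 as (), those two are witnessed by the following (again, made-up names, just a sketch):

pairTo :: (a -> b) -> ((), a) -> b
pairTo f (_, a) = f a

pairFrom :: (((), a) -> b) -> a -> b
pairFrom g a = g ((), a)

funTo :: (a -> b) -> () -> (a -> b)
funTo f _ = f

funFrom :: (() -> (a -> b)) -> a -> b
funFrom g = g ()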

It gets kinda messy in some of the details.

DM me more if you decide to pursue something like this

I have a lot of personal projects on something like this for a compiler IR that you might want to look at.

My GitHub inbox is flooded with spam so it's probably best to contact me on Twitter or Reddit.

jameshaydon commented 3 years ago

@svenkeidel Thanks for the suggestion!

There is a systematic translation from lambda terms to arrows in a CCC [...] although, probably less optimized

Yes I am aware of this, and I must admit there probably isn't a good reason to not have lambdas. Part of me just wanted to see how far one could get with readable, point-free code without lambdas. I think that allowing lambda syntax, for when one wants to program in a category with enough structure (a CCC), is a good idea, though as you point out the result might be less optimised. One of the design goals of Lawvere is to not assume anything about e.g. concurrency and sequencing, apart from what has been specified by the programmer. For example (when compiling to Hask), all the components of a cone can be executed in parallel, whereas if one were to compile pointful code, it might not preserve this in quite the same way?

In any case, the effectful part of the language does not assume the category is a CCC, so part of the design is to allow nice syntax for programming in this case too.

jameshaydon commented 3 years ago

@sstewartgallus Thanks for all these thoughts; they'll take some time to digest! The first paper you linked looks very interesting; I was not aware of it.

The straightforward compiling to categories translation bloats the code a lot.

Related to my answer to @svenkeidel above, this is one of the motivations of making the source language as "close to category theory" as possible.

mstewartgallus commented 3 years ago

@jameshaydon Grats. Glad you found it interesting even if you choose not to use pointful notation. I have my own personal reasons for investigating this stuff but it is a massive waste of time if you just want to get something done.

One thing that is important is that you kind of need points for certain negative/positive representations of data.

Hom c (a, b) <-> (Hom c a, Hom c b) is a negative representation well suited to call by name evaluation strategies.
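
For plain functions, the product side is witnessed by the usual pairing/projections (a small sketch, made-up names):

prodTo :: (c -> (a, b)) -> (c -> a, c -> b)
prodTo f = (fst . f, snd . f)

prodFrom :: (c -> a, c -> b) -> c -> (a, b)
prodFrom (f, g) c = (f c, g c)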

But Hom c (a + b) <-> (Hom c a, Hom c b) is a positive representation of sums. Not sure there is a way, without using points like in (Hom a 0 -> Hom c b) <-> Hom c (a + b), to have negative sum types which are better behaved under lazy evaluation. Likewise negative tuples require pattern matching/points to deconstruct them let (a, b) = x in y or with category theory the kappa calculus stuff.

Positive exponentials are Hom 1 (a -> b) <-> Hom a b, I think, and are "weird".

It's a common misunderstanding that some types are inherently positive/negative. That difference only emerges in linear logic.

So I would say pointful notation might be more concurrent for some applications.

Definitely not encouraging you to use points though cause they're a pain.

One thing that would be fun to play with, and that I would find very interesting, is coexponential types. I just don't understand them and need to play around with them at a REPL.

jameshaydon commented 3 years ago

but it is a massive waste of time if you just want to get something done.

Sounds like we have been down similar paths! I actually spent quite a long time designing everything (with pointful notation) without a single line of code, and never actually started because I wasn't sure everything "fit" quite perfectly. I'm now just trying to have fun making something that's probably a mess. I was actually quite pleasantly surprised that something usable materialised before I added any pointful notation.

But Hom c (a + b) <-> (Hom c a, Hom c b) is a positive representation of sums.

I'm not really sure what you mean here, is this meant to be an isomorphism?

Likewise negative tuples require pattern matching/points to deconstruct them let (a, b) = x in y or with category theory the kappa calculus stuff.

Do you have an example for when not having this construct would be particularly frustrating?

mstewartgallus commented 3 years ago

Huh cool.

Yeah, Hom c (a + b) <-> (Hom c a, Hom c b) is an isomorphism. You'd write it in pseudo-Haskell as:

class Category k => HasSums k where
  void :: k 0 a
  -- or more directly inl :: k a (a + b)
  inl :: k a c -> k (a + b) c
  inr :: k b c -> k (a + b) c
  (|||) :: k c a -> k c b -> k c (a + b)

The trouble with this representation is that it doesn't work well with effects in some directions.

For monads you have:

instance HasSums MyMonadKleisliArrow where
  void :: 0 -> m a
  -- or more directly inl :: a -> m (a + b)
  inl :: (a -> m c) -> (a + b) -> m c
  inr :: (b -> m c) -> (a + b) -> m c
  (|||) :: (c -> m a) -> (c -> m b) -> c -> m (a + b)

For the cokappa representation:

instance HasSums MyMonadKleisliArrow where
  void :: 0 -> m a

  cokappa :: ((0 -> m a) -> c -> m b) -> c -> m (a + b)
  colift :: ((a + b) -> m c) -> (0 -> m a) -> c -> m b

Comonads have the opposite issues.

The point isn't the specific encoding (I'm still figuring out the details); it's that different encodings have different behaviour under different systems of effects. I still don't really understand all the details about limits/colimits so I'm sorry if I'm a bit vague.

Call by push value uses a sort of adjoint encoding of effects to deal with these issues. I don't know much about Freyd categories, but it appears they're actually kind of similar in that way, dividing things into values and computations? My own higher-order reinterpretation of the categorical view of cbpv https://github.com/sstewartgallus/compiler-2/blob/master/src/Cbpv.hs seems to kind of match up. It's likely you'll need multiple sum/other types: some for effectful sums, some for value sums, and so on.

clayrat commented 3 years ago

Speaking of the kappa/zeta decomposition, there was a somewhat similar idea of using "generalized arrows" as a sort of generalized IR for metaprogramming: http://www.megacz.com/berkeley/garrows/.

There was also some preliminary work on adopting this for Scala (under the name of "circuitries"), but I don't think it ever crystallized into code: https://www.slideshare.net/akuklev/scala-circuitries

@sstewartgallus I think you get coexponentials (aka subtractions) by using mu-calculus instead of zeta, but I'm speculating here; I've only seen them in a sequent calculus setting.