snowleopard / selective

Selective Applicative Functors: Declare Your Effects Statically, Select Which to Execute Dynamically
MIT License
202 stars 21 forks source link

Do you want a SelectiveDo proposal? #40

Open turion opened 3 years ago

turion commented 3 years ago

It seems to me that Selective functors could in principle desugar more do notations than Applicatives. For example:

main = do
  number <- (readMaybe :: String -> Maybe Int) <$> getLine
  case number of
    Just _dontuseme -> putStrLn "Yes, that's a number"
    Nothing -> putStrLn "Nope, not a number"

I believe this has the same semantics as:

main = maybe (Right (Left ())) Left . (readMaybe :: String -> Maybe Int) <$> getLine
  <*? (putStrLn "Yes, that's a number" *> pure (const (Right ())))
  <*? (putStrLn "Nope, not a number" *> (pure id))

The latter only needs Selective.

In general, all if statements could be desugared into Selective, and case could be desugared iff the variables bound in the branches are not used in a binding way.

I've asked some of this here: https://discourse.haskell.org/t/efficient-selectivedo-and-binary-search/1657/3 but I wanted to know your opinion. I think this would be interesting, especially in situations where no efficient monads exist, e.g. some streaming approaches.

snowleopard commented 3 years ago

Personally, I'd love to have the SelectiveDo extension but I think that this proposal is unlikely to be accepted right now. Selective functors are still too new and too niche. It would be hard to justify the cost of changes to GHC.

Perhaps, it's possible to implement SelectiveDo as a GHC plugin? If yes, then that might be a good starting point.

Even writing a good proposal would be an interesting challenge because there are still a lot of blank spots. For example, using multiway selective functors may be a better approach for SelectiveDo extension, because they allow to efficiently implement branching over any type with a finite number of constructors (without the need for bisection).

If you are interested in trying to write a proposal despite the low chances of success, then I'm happy to help!

turion commented 3 years ago

Perhaps, it's possible to implement SelectiveDo as a GHC plugin? If yes, then that might be a good starting point.

A GHC plugin sounds like a great idea. I've never done that, so I'm not sure, but maybe at some point I'll look into that. If it works and is useful, a proposal might be the next step.

Even writing a good proposal would be an interesting challenge because there are still a lot of blank spots. For example, using multiway selective functors may be a better approach for SelectiveDo extension, because they allow to efficiently implement branching over any type with a finite number of constructors (without the need for bisection).

That's interesting. I would have thought that you can already use multiway selectives with RebindableSyntax. The downside is that every value you bind must satisfy the Enumerable constraint.

SelectiveDo on cases and ifs would work without an additional constraint, that's an advantage. And in practice I think cases aren't usually so big in do notation that you'd feel the slowdown from the linear complexity.

snowleopard commented 3 years ago

The downside is that every value you bind must satisfy the Enumerable constraint.

This is not exactly accurate. The Enumerable constraint is on the type of tags t, not values. This means you can desugar pattern matches with appropriate RHS expressions into an efficient match statement. For example, consider this contrived do snippet:

data T a b = T0 | T1 a | T2 a b 

getT :: Selective f => f (T a b)

x :: Selective f => f (() -> Maybe a)
y :: Selective f => f (a -> Maybe a)
z :: Selective f => f (a -> b -> Maybe a)

example :: Selective f => f (Maybe a)
example = do
    t <- getT
    -- These RHS expressions are OK because they don't inspect values
    case t of
        T0     -> x <*> pure ()
        T1 a   -> y <*> pure a
        T2 a b -> z <*> pure a <*> pure b

With multiway selective functors we can desugar it as follows:

data Tag a b c where
    Tag0 :: Tag a b ()
    Tag1 :: Tag a b a
    Tag2 :: Tag a b (a, b)

instance Enumerable (Tag a b) where
    enumerate = [Some Tag0, Some Tag1, Some Tag2]

encode :: T a b -> Sigma (Tag a b)
encode = \case
    T0     -> Sigma Tag0 ()
    T1 a   -> Sigma Tag1 a
    T2 a b -> Sigma Tag2 (a, b)

example :: forall a f. Selective f => f (Maybe a)
example = match (encode <$> getT) handle
  where
    handle :: forall x b. Tag a b x -> f (x -> Maybe a)
    handle = \case
        Tag0 -> x
        Tag1 -> y
        Tag2 -> uncurry <$> z

Here the type T a b does not need to be enumerable.

turion commented 3 years ago

Ok, fair enough, the situation is more comfortable than I previously thought. So you could e.g. bind on a Maybe a without a being Enumerable, because Maybe already is. Still, at some point you either need an Enumerable instance for some container or an encode function.

kozross commented 2 years ago

I think you might get some mileage out of QualifiedDo?

turion commented 2 years ago

QualifiedDo will get us to approximately the same solution like RebindableSyntax, I think. So the same advantages and disadvantages apply.

snowleopard commented 2 years ago

I think you might get some mileage out of QualifiedDo?

@kozross I do want to experiment with QualifiedDo. I started doing some work on this (a very early draft is in #30) but got distracted by other things. I'll try to revive it soon!

rashadg1030 commented 2 years ago

I'm interested in helping implement this feature. What are the next steps? Has it been decided on whether to use RebindableSyntax, QualifiedDo, TemplateHaskell, or a plugin for this?

turion commented 2 years ago

@rashadg1030 It's still in (before?) the experimentation phase. I think you can help by:

rashadg1030 commented 2 years ago

@turion Thank you for your response. First let me describe my use case. It's related to my comment and Matthew Pickering's response here. This is the problem I'm trying to solve

Problem:

I've been working on a web framework called Okapi which is essentially a monadic EDSL for parsing web requests and returning web responses. I use it for creating servers. A simple example looks like this.

{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeApplications #-}

module Main where

import Control.Applicative ((<|>))
import Data.Aeson (ToJSON)
import Data.Text
import GHC.Generics (Generic)
import Okapi

main :: IO ()
main = runOkapi id 3000 calc

type Okapi a = OkapiT IO a

calc :: Okapi Response
calc = do
  get
  pathSeg "calc"
  addOp <|> subOp <|> mulOp <|> divOp

addOp :: Okapi Response
addOp = do
  pathSeg "add"
  (x, y) <- getArgs
  okJSON [] $ x + y

subOp :: Okapi Response
subOp = do
  pathSeg "sub" <|> pathSeg "minus"
  (x, y) <- getArgs
  okJSON [] $ x - y

mulOp :: Okapi Response
mulOp = do
  pathSeg "mul"
  (x, y) <- getArgs
  okJSON [] $ x * y

data DivResult = DivResult
  { answer :: Int,
    remainder :: Int
  }
  deriving (Eq, Show, Generic, ToJSON)

divOp :: Okapi Response
divOp = do
  pathSeg "div"
  (x, y) <- getArgs
  guard403 [] "Forbidden" (y == 0)
  okJSON [] $ DivResult {answer = x `div` y, remainder = x `mod` y}

getArgs :: Okapi (Int, Int)
getArgs = getArgsPath <|> getArgsQueryParams
  where
    getArgsPath :: Okapi (Int, Int)
    getArgsPath = do
      x <- pathParam
      y <- pathParam
      pure (x, y)

    getArgsQueryParams :: Okapi (Int, Int)
    getArgsQueryParams = do
      x <- queryParam "x"
      y <- queryParam "y"
      pure (x, y)

The calc server/parser defined above can respond to the following URLs:

The issue with this monadic EDSL approach compared to the approaches that other Haskell web frameworks use is the lack of information about the request parsers. This means Okapi can't have typed routes or openAPI doc generation like Servant. Unless there was a way to inspect the functions and generate an "AST" composed of Okapi's primitives like pathParam, pathSeg, guardError403, okJSON, etc. If this was possible, I would know the structure of the parser and be able to generate safe functions that can create URL strings or generate openAPI documentation. For example (related to the example above):

addRoute1 = [route|calc/add/:Int/:Int|]
addRoute2 = [route|calc/add?x:Int?y:Int|]

subRoute1 = [route|calc/sub/:Int/:Int|]
subRoute2 = [route|calc/sub?x:Int?y:Int|]

mulRoute1 = [route|calc/mul/:Int/:Int|]
mulRoute2 = [route|calc/mul?x:Int?y:Int|]

divRoute1 = [route|calc/div/:Int/:Int|]
divRoute2 = [route|calc/div?x:Int?y:Int|]

The idea is that with these quasiquotes, Okapi would be able to check if the routes actually exist within the user's parser definitions and then generate a function that the user can use in their HTML templates or whatever else. addRoute1 would generate a function addRoute1 :: Int -> Int -> URL, for example. Before generating the function though, I could check if the route actually exists by inspecting the definition of calc, then see that it is composed of pathSeg "calc" and the addOp parser, and then inspect that. Then I'd see that it uses pathSeg "add" and getArgs. I'd then inspect the definition of getArgs, etc. and then see that yes addRoute1 is a valid route and generate the URL function. If not, provide an error notifying the user that there is a mismatch, at compile time. The idea is similar to the one mentioned in this paper.

Anyway, this all starts with being able to inspect the body of a monadic parser at compile time. My use case is pretty niche (an unknown web framework), but I could see this also being useful for something like Parsec or Megaparsec. You'd be able to optimize monadic parsers I assume.

I could push all that info to the type-level, but Servant already does this. I could do something like Yesod or Happstack, but that increases the complexity that the end user is faced with, and it's already been done so no use in me doing the same thing. I could also do something like optparse-applicative and retain more static information by using an applicative parser instead of a monadic one, but it's not as ergonomic and doesn't allow data dependent effects (something like if (someQueryParam == 0) then (okJSON [] FooData) else (okJSON [] BarData) isn't possible). I basically want to cheat the system and have an easy to use EDSL, but also have the ability to inspect the available static information to generate useful code.

If only I could inspect the RHS of the definitions at compile time and inspect what primitives they use, I'd be able to do all sorts of cool things! Even if it was Core, I'd be happy with that.

Here was Matthew Pickering's response:

Well I am not sure that you can achieve what you want because what if someone wrote a request handler which was something like:

foo :: Okapi Response
foo = do
  pathSeg "foo"
  x <- getArgs
  pathSeq x
  ...

Then the list of acceptable paths isn't statically known because it depends on the argument of a request. Perhaps you would be interested in designing an interface based on selective functors (ie https://mpickering.github.io/papers/parsley-icfp.pdf). That would allow you to statically inspect what the structure of paths looked like but allow branching based on results.

That was the comment and Matthew Pickering's response. Here's a short summary of what I'm trying to accomplish:

I'm working on an easy to use framework that allows the user to parse web requests and return responses. I'm using Monad because it allows data dependent effects and the end user can use the intuitive do notation to describe their web request parser. The downside is there is no static information to generate routes or generate API documentation. Applicative was considered, something similar to optparse-applicative, but there are no data dependent effects. I thought Template Haskell could be used to go through the parser definition at compile time, but this won't work.

I've been lead to Selective Functors, which seem to be the key to having a web framework which allows the user to define their web requests in a nice notation (if there is some sort of do-notation), while still retaining enough static information to have type safe URLs, and generate OpenAPI documentation. This would be a great use case of Selective in my opinion, if it's possible. Perhaps @turion or @snowleopard can provide some insight into whether or not Selective Functors will actually help get the interface that I want and still have enough static information to analyze the web request parsers. If so, I'm very interested in working on a plugin, or anything else that would allow end users of the framework to use do-notation.

EDIT: I would also need to be able to compose Selective Functors. The core of this framework is the type:

newtype OkapiT m a = OkapiT {unOkapiT :: ExceptT.ExceptT Failure (StateT.StateT State m) a}
  deriving newtype
    ( Except.MonadError Failure,
      State.MonadState State
    )

Judging by the discussion in #12 this will probably be difficult or impossible to do with Selective. Not sure. Any thoughts?

turion commented 2 years ago

I'd guess that this can be done with selectives, and that you'd profit from a ghc plugin that desugars do notation into selectives, but I'm not sure.

You should be able to compose some specific selective functors. #12 is about the most general case, which is not trivial, but if you read the whole thread you'll notice that we found already 2 possibilities of composing selectives, so you can have a look which one is the right one for you.

rashadg1030 commented 2 years ago

I'd guess that this can be done with selectives, and that you'd profit from a ghc plugin that desugars do notation into selectives, but I'm not sure.

You should be able to compose some specific selective functors. #12 is about the most general case, which is not trivial, but if you read the whole thread you'll notice that we found already 2 possibilities of composing selectives, so you can have a look which one is the right one for you.

Awesome. Thank you for your input. I think it's worth it for me to go down this rabbit hole further and see what happens.

Now, about the plugin. I remember reading a random comment on reddit or something that said GHC plugins can only manipulate GHC Core, and not Haskell syntax. Maybe I'm not remembering what I read correctly or I misunderstood. If I were to implement a GHC plugin and I've never done it before, where's a good place to start? Before I get started on this though, I should probably translate my framework's monadic interface into a selective one, and see how it works.

turion commented 2 years ago

If I were to implement a GHC plugin and I've never done it before, where's a good place to start?

Sorry, I've never done that myself and don't know where to start. I'd imagine that Matthew Pickering & friends have a lot of good talks & tutorials on that.

Before I get started on this though, I should probably translate my framework's monadic interface into a selective one, and see how it works.

Yes, definitely. Here's a plan you can follow:

  1. Have a look at https://hackage.haskell.org/package/selective-0.4.2/docs/Control-Selective-Multi.html#t:Selective, which is an alternative definition of Selective functors.
  2. Define instances for Enumerable for all the types you want to branch on.
  3. Try to implement your compile time analysis with Selective (you'll need match I guess)
  4. If this works, have a look at GHC plugins
snowleopard commented 2 years ago

@rashadg1030 I like your application and I hope you will be able to use selective functors.

However, I'm unsure how you plan to handle Matthew's example:

foo :: Okapi Response
foo = do
  pathSeg "foo"
  x <- getArgs
  pathSeg x
  ....

Trying to introspect this computation using selective functors will lead you to the infinite set of possible effect sequences, where the path segment foo is followed by an arbitrary segment. You will probably need to switch from strings to a more structured type for representing segments. Then, I think, it might work.

rashadg1030 commented 2 years ago

@rashadg1030 I like your application and I hope you will be able to use selective functors.

However, I'm unsure how you plan to handle Matthew's example:

foo :: Okapi Response
foo = do
  pathSeg "foo"
  x <- getArgs
  pathSeg x
  ....

Trying to introspect this computation using selective functors will lead you to the infinite set of possible effect sequences, where the path segment foo is followed by an arbitrary segment. You will probably need to switch from strings to a more structured type for representing segments. Then, I think, it might work.

@snowleopard Glad you brought this up since this was one concern I had as well. I thought that if the type could be made an instance of Enumerable it would be fine? And I thought Many was to be used for types with infinitely many values? I probably misunderstood. In this case, I would need it for Text. I assume I could wrap the Text with Many and it would technically work, but the performance would make it impractical?

snowleopard commented 2 years ago

@rashadg1030 My question is partly about performance but also about your intent. I can believe that you can write a function that type checks and gives you this (possibly infinite) list of possible effects. But what would you do with such a list?

rashadg1030 commented 2 years ago

@snowleopard Two things about the example:

foo :: Okapi Response
foo = do
  pathSeg "foo"
  x <- getArgs
  pathSeg x
  ....
  1. This sequence would throw a runtime error. I should've mentioned that. You can't parse path segments after query parameters. Only query parameters after path segments. This is only enforced at runtime. Would I be able to enforce the allowed sequence of effects before execution?

  2. To answer your previous question:

    @rashadg1030 My question is partly about performance but also about your intent. I can believe that you can write a function that type checks and gives you this (possibly infinite) list of possible effects. But what would you do with such a list?

Using the above example, I basically want to be able to know what x is in pathSeg x, at compile time. I don't see how this could be possible though. Or what if I had:

```haskell
foobar :: Okapi Response
foobar = do
  let bar = "bar"
  pathSeg $ "foo" <> bar
  ....
```

I guess it could be known that `pathSeg` takes in `"foobar" :: Text` at compile time, or is this not possible either?

The goal is to generate routes, so I need to know the pathSegs that are a part of the "parser", before I even execute it.

snowleopard commented 2 years ago

@rashadg1030 It's hard for me to give any definitive answers. It seems best to just give it a try and see what happens. I suggest not to jump into implementing support for SelectiveDo or anything like that straight away -- just try using selective functors for your use-case without any syntax sugar and see how it goes.

turion commented 2 years ago

During commenting on https://github.com/snowleopard/selective/issues/57 I found the following brain teaser. Let us assume we have a statement like:

thong :: Monad m => m (t -> Either a b) -> m t -> m (a -> b) -> m b
thong x y z = do
  f <- x
  r <- y
  case f r of
    Left a -> ($ a) <$> z
    Right b -> pure b

How do we desugar it? Maybe like this:

thing :: Selective f => f (a1 -> Either a2 b) -> f a1 -> f (a2 -> b) -> f b
thing x y z = (x <*> y) <*? z

But I don't know a reason why the Applicative should bind stronger than the Selective. We can also define something like this:

thang :: Selective f => f (a1 -> b) -> f (Either a2 a1) -> f (a2 -> a1) -> f b
thang x y z = x <*> (y <*? z)

It has a different type than before, but maybe it is possible to define a desugaring of the do notation where <*? is bound stronger.

snowleopard commented 2 years ago

@turion Aha, cool! I see where you are getting at. To simplify a bit, consider the following snippet:

execAndSelect :: Monad m => m () -> m (Either a b) -> m (a -> b) -> m b
execAndSelect x y z = do
    x
    ab <- y
    case ab of
        Left a -> ($ a) <$> z
        Right b -> return b

This can be desugared into (x *> y) <*? z or x *> (y <*? z) using Selective interface. For rigid selective functors, these two expressions have the same meaning (the interchange law). But which should we do in general? To me, x *> (y <*? z) seems like a more natural way to desugar this do-notation snippet, because it's more compositional (you can desugar independent parts separately and then combine them via *>).

turion commented 2 years ago

Maybe one feels more natural, but reasoning about do notation can be very confusing if rigidity isn't given. We need to be able to reason about each part of the do notation locally. Imagine you have more than three statements. Maybe it isn't obvious immediately in which parts we will need <*> and where we need <*?, so it is not obvious how the whole statement is going to be desugared.

And when you want to analyse the code, it is important to be able to disect the statement locally. This isn't given here anymore: We cannot look at the first two statements in isolation anymore. We might believe that this means anything:

do
  x
  ab <- y

And we might reason something like "Ah, x was a Failure e1, and y is a Failure e2, so up to here we have collected Failure $ e1 <> e2. And this is true if the following statement doesn't use select! Maybe in a previous iteration of the code, we wrote pure z' instead of <$> z, so the case expression is just pure _, so we only need Applicative to desugar it. But now we add an effect in the Left branch further down, and suddenly our reasoning about the first two lines is wrong!

That's why I believe that rigidity is necessary to make SelectiveDo useful.

snowleopard commented 2 years ago

@turion I feel like you are arguing my case that x *> (y <*? z) is preferable to (x *> y) <*? z :)

With x *> (y <*? z) you get the compositionality you want: you can reason about snippets of code in isolation, without thinking about what comes next. With (x *> y) <*? z you can't because your x can get clumped with a subsequent y under select.

turion commented 2 years ago

No, I don't think so: My point is that we might reason about x and y together, but we aren't even allowed to do that because x *> y doesn't even occur at the end. So the reasoning is broken by adding code further down.

ribosomerocker commented 1 year ago

Has any more work been done on this? Personally, I find the Enumerable part of the do notation kind of inconvenient

snowleopard commented 1 year ago

Has any more work been done on this?

Not that I know of. I don't think there is enough interest/motivation for someone to actually do the work.

Personally, I find the Enumerable part of the do notation kind of inconvenient

I don't understand what you mean.