m4rw3r / chomp

A fast monadic-style parser combinator designed to work on stable Rust.
Apache License 2.0
243 stars 19 forks source link

Applicative parsing #26

Open m4rw3r opened 8 years ago

m4rw3r commented 8 years ago

Problem

Some somewhat common constructions using monadic composition are more complex than they have to be. Applicative solves that by being less powerful and lacking context sensitivity (commonly):

do
    a <- digitP
    b <- digitP
    c <- digitP
    d <- digit
    return $ IP a b c d
  where digitP = digit >>= \x -> skip '.' >>= \_ -> return x

-- vs

pure IP <$> digitP <*> digitP <*> digitP <*> digitP <*> digit
  where digitP = digit <* (skip '.')

The default implementation of Applicative for a Monad also works for Chomp:


instance Applicative X where
    pure    = return
    d <*> e = do
      b <- d
      a <- e
      return (b a)
    (*>)    = (>>)
    x <* y  = x >>= \a -> y >> return a

But the issue here is that to be able to write things like

i.ret(IP).apply(digitP).apply(digitP).apply(digitP).apply(digit)

IP must be a function which is partially applied and is invoked like IP(192)(168)(0)(23) which is both very strange to use in Rust (which does not support partial application natively) and very slow since it forces boxed closures to be used to store the intermediate state.

Implementation

Input::ret is kept as is to provide a way to lift values into the Applicative. then is also left as is as a method on ParseResult. skip is added to impl ParseResult since it is very useful even in monadic chaining.

apply is either implemented on a trait in a separate module or directly on the ParseResult.

Variants

There are multiple ways of implementing apply

Following Haskell's default Applicative

i.ret(capitalize).apply(any)
impl ParseResult<T, E> {
    pub fn apply<F, U, V, W>(self, F) -> ParseResult<W, V>
      where T: FnOnce(U) -> W,
            F: FnOnce(Input) -> ParseResult<U, V>,
            V: From<E>;
}

Pros

// Identity:
lhs_m.ret(|x| x).apply(f) === f(rhs_m)
// Composition:
lhs_m.ret(compose).apply(u).apply(v).apply(w) === u(rhs_m).apply(|i| v(i).apply(w))
// Homomorphism:
lhs_m.ret(f).apply(|i| i.ret(x)) === rhs_m.ret(f(x))
// Interchange:
u(lhs_m).apply(|i| i.ret(y)) === rhs_m.ret(apply_arg(y)).apply(u)

Tuple as an argument

fn digitP(i: Input<u8>) -> U8Result<u8> { digit(i).skip(|i| token(i, b'.')) }
i.ret(IP).apply((digitP, digitP, digitP, digit))
trait ApplyArgs<F> {
    type Output;
    type Error;
    fn apply(self, Input, F) -> ParseResult<Self::Output, Self::Error>;
}

impl<A, AT, AE, T, F> ApplyArgs<F> for A
  where F: FnOnce(AT) -> T,
          A: FnOnce(Input) -> ParseResult<AT, AE> {
    type Output = T;
    type Error = AE;

    fn apply(self, i: Input, f: F) -> ParseResult<Self::Output, Self::Error> {
        self(i).map(f)
    }
}

// Impls for tuples, eg. (A, B) where F: FnOnce(A, B) -> Out, A: FnOnce(Input) -> ParseResult<A, AE>, ...

impl ParseResult<T, E> {
    pub fn apply<A>(self, rhs: A) -> ParseResult<A::Output, A::Error>
      where A: ApplyArgs<T>,
             A::Error: From<E> {
        self.bind(|i, f| rhs.apply(i, f))
    }
}

Pros

// Identity:
lhs_m.ret(|x| x).apply(f) === f(rhs_m)
// Composition:
lhs_m.ret(compose).apply((u, v)).apply(w) === u(rhs_m).apply(|i| v(i).apply(w))
// Homomorphism:
lhs_m.ret(f).apply(|i| i.ret(x)) === rhs_m.ret(f(x))
// Interchange:
u(lhs_m).apply(|i| i.ret(y)) === rhs_m.ret(apply_arg(y)).apply(u)

Stacking tuples

The idea here is to stack values into a tuple and then invoke a function with the given values:

i.ap(digitP).ap(digitP).ap(digitP).ap(digit).apply(IP)

Pros

This version reverses the order of the apply operator arguments while still keeping the left to right ordering of the arguments supplied to ap:

// Identity:
lhs_m.ap(f).apply(|x| x) === f(rhs_m)
// Composition:
lhs_m.ap(w).apply(|i| i.ap(u).ap(v).apply(compose)) === rhs_m.ap(w).apply(v).apply(u)
// Homomorphism:
i.ap(|i| i.ret(x)).apply(f) === rhs_m.ret(f(x))
// Interchange:
i.ap(|i| i.ret(y)).apply(u) === i.ap(|i| i.ret(u)).apply(apply_arg(y))

Note that it does not play well with existing values in the Monad/Applicative since they are of type T which is not a tuple while the Applicative functions (ap and apply) only work on tuples in the applicative. IE. Input + ParseResult<T, E> is always a Monad, but only an Applicative if T is a tuple of some kind.