BebeSparkelSparkel / biparsing

Bidirectional Parsing. Work in Progress
31 stars 0 forks source link

biparsing

Biparsing is a bidirectional programming technique that specializes in constructing parsing and printing programs simultaneously.

Until this package there was not a usable, only my opinion, package that allowed for the construction of useful biparsers.

Why you should care!

Getting Started

Start with the package biparsing-mixes which has an easier interface to start with. It currently allows using the monads IO and Either and the types String, ByteString (Strict and Lazy), Text (Strict and Lazy), and [a]. These monads and types can be easily expanded to others; the tests for biparsing-core use Maybe, Vector, Seq.

Most of the combinators you are in, or should be in (please help), Biparse.General, BiparseBiparse.Biparser`,

Examples

See directory examples

A quick and simple example that both defines the parser and printer

product :: IsoEasy IndexContext Text (Text, Int)
product = do
  take '('
  s <- takeWhile (/= ',') `upon` fst
  take ','
  i <- intBaseTen `upon` snd
  take ')'
  return (s, i)

print $ decodeEasy product () "(Some text,123)"
print $ encodeEasy product ("Some text", 123)

Assistance Needed

Required Features for Completeness

Reference Links

Notable Quotes

"IMO it's a mistake (although not an infrequent one) to try and use an optic as a parser or a printer or whatever. It's like trying to light a fire by striking your lighter against a rock. The whole point of polymorphic profunctor optics is that they can adjust arbitrary profunctors, including natural representations of parsers and printers" masaeedu

"I suspect a "clean" way to do this sort of thing will eventually reveal within a profunctor encoding, but I haven't put much time into it yet :)" Chris Penner

"Are there any bidirectional parser libraries ready to use?" TheMatten

"None that are good. Well "good" is not the right word here Basically the best current approach to bidirectional parsing in Haskell does not have the ideal interface" chessai

Other Implementations and their Problems

There are other implementions for biparsing but they did meet my requirements.

Applicative Functor Composition

Currently, a viable solution for constructing biparsers. The drawback is that it is hard to modify since it requires that the parser constructor function must be defined separately from the parsing logic. Also, for serialization an extractor function is required to remove what is to be serialized from the parse type.

data Person = Person String String

-- constructor function
person :: String -> String -> Person
person f l = Person f l

-- extractor functions
firstName :: Person -> String
firstName (Person x _) = x
lastName :: Person -> String
lastName (Person _ x) = x

personBiparser :: Biparser String Person
personBiparser = 
  person <$> -- sets constructor function
  many firstName alpha <*> -- Parsing: `many` passes the consumed alpha characters as the first argument of the `person` constructor function. Serializing: `many` uses the extractor function `firstName` to get the first name charcaters from a `Person` instance and passes each single caracter to the alpha serializer
  (spaces *> many lastName alpha) -- Parsing: `spaces` skips the spaces, `many` passes the consumed alpha characters as the second argument of the `person` constructor function. Serializing: `spaces` writes a " " space. `many lastName alpha` does what the above line does.

spaces :: Biparser a String
spaces = many (const [' ']) space

many :: (a -> [b]) -> Biparser b c -> Biparser a [c]
many extractor biparser = ...

Another, example from the linked codec package.

data RecordB = RecordB
  { recordBString :: String
  , recordBDouble :: Double
  } deriving (Eq, Ord, Show)

recordBObjCodec :: JSONCodec RecordB
recordBObjCodec = asObject "RecordB" $
  RecordB
    <$> recordBString =. field "string"
    <*> recordBDouble =. field "double"

Monadic Composition

Until this package, monadic composition did not have a valid implementation yet but did have an example implementation unparse-attoparsec, details below, that is outlined in Composing bidirectional programs monadically.f

int :: Biparser Int Int
int = do
  let printedInt n = show n ++ " "
  ds <- digits 'upon' printedInt
  return (read ds)

digits :: Biparser String String
digits = do
  d <- char 'upon' head
  if isDigit d then do
    igits <- digits 'upon' tail
    return (d : igits)
  else if d == ' ' then return " "
  else error "Expected digit or space"

Optics Composition

Optics composition can be used to create bidirectional programs by utilizing Iso (isomorphisms) and Prisms. These are ready to go and can be used to create composable bidirectional programs. Unfortunately at this point, optics do not support state very well so it's difficult to report errors with locations.

Other Existing Implementations

codec Simple bidirectional serialization

Codec makes it simple to write composable bidirectional serializers with a consistent interface.

Pros

Cons

Features we would like to also have

lens Lenses, Folds and Traversals

Pros

Cons

Features we would like to also have

boomerang Library for invertible parsing and printing

Specify a single unified grammar which can be used for parsing and pretty-printing

Pros

Cons

Features we would like to also have

Based On

Zwaluw

Combinators for bidirectional URL routing.

Pros

Cons

Features we would like to also have

sexp-grammar

Grammar combinator framework to build invertible parsing libraries for concrete syntax.

Contains invertible-syntax

Pros

Cons

Features we would like to also have

Based On

roundtrip Bidirectional (de-)serialization

Roundtrip allows the definition of bidirectional (de-)serialization specifications. The specification language is based on the ideas described in the paper Invertible Syntax Descriptions: Unifying Parsing and Pretty Printing by Tillmann Rendel and Klaus Ostermann, Haskell Symposium 2010. This package does not provide concrete instances of the specification classes, see the packages roundtrip-string and roundtrip-xml instead. The package contains slightly modified code from Tillmann Rendel's partial-isomorphisms and invertible-syntax packages (Copyright (c) 2010-11 University of Marburg).

Pros

Cons

Features we would like to also have

Based On

unparse-attoparsec

This library provides a wrapper attoparsec to build programs that can be interpreted both as parsers and as printers.

Pros

Cons

Features we would like to also have

Boomerang

archived. Boomerang is a programming language for writing lenses—well-behaved bidirectional transformations—that operate on ad-hoc, textual data formats.

Pros

Cons

Features we would like to also have

Invertible Syntax Descriptions: Unifying Parsing and Pretty Printing

Partial isomorphisms is the basis of this implementation data Iso a b = Iso (a -> Maybe b) (b -> Maybe a) and redefining <$>, <*>, <|> to suite partial Iso. The reader may be wondering why it is partial in both directions since it is strange that serialization could fail. This is due to the fact that <|> Alternative does not force the serialization of all possibilities at the type level so failure must be passed into runtime.

data Iso a b = Iso (a -> Maybe b) (b -> Maybe a)

class IsoFunctor f where
  (<$>) :: Iso a b -> f a -> f b

class ProductFunctor f where
  (<*>) :: f a -> f b -> f (a,b)

class Alternative f where -- lacks Applicative superclass
  (<|>) :: f a -> f a -> f a
  empty :: f a -- fails parsing and printing

I would have preferred that the definition of data Iso a b = Iso (a -> Maybe b) (b -> a) have been used instead to force successful serialization. This would however put more complexity in the types to enforce this. Also, due to using applicative the order of the parsing must coincide with the constructor arguments. With monadic parsers the parsed value can be assigned to a variable which disconnects the constructor argument order and the parser order. Type lists may be able to enforce: guaranteed serialization with Alternative, and unordered parsing.

Composing bidirectional programs monadically

Definitions

Other Sources