ollef / Earley

Parsing all context-free grammars using Earley's algorithm in Haskell.
BSD 3-Clause "New" or "Revised" License
363 stars 24 forks source link

expose internals #8

Closed sboosali closed 9 years ago

sboosali commented 9 years ago

a user might want to observe sharing themselves (e.g. manual annotations, http://hackage.haskell.org/package/data-reify, their own monad, etc.), and pass it down to the parser.

We can put it in a .Internal module.

ollef commented 9 years ago

Hello and welcome!

It sounds like that should be possible already using the Grammar interface (note that the Grammar module already exposes its internals if you import Text.Earley.Grammar). What is the problem with that approach?

Exposing the internals is sometimes a good idea, but note that the internals exposed in Parser are only used while parsing and are very implementation-specific, why I'd like to get some more information about your use-case before I merge your changes. :)

Thanks!

sboosali commented 9 years ago

My use case is to transform a free alternative into an Earley Prod. So I'm wrapping the types in this package. The sharing is observed by a function that caches another function by reference (via StableName, like https://hackage.haskell.org/package/vault or like http://packdeps.haskellers.com/reverse/data-reify ). See https://github.com/sboosali/commands-core/blob/observed-sharing/sources/Data/RefCache.hs#L117 if interested.

it would replace Earley's grammar, looking maybe like this (probably doesn't typecheck, and still learning about Rank2 types and stuff, but):

observeSharing :: (forall x. f x -> g x) -> Alter f a -> IO (Alter g a)

shareEarleyProduction :: Symbol a -> ST z (E.Prod z e t a)
shareEarleyProduction = \case
 NonTerminal rhs -> do
  let p = getEarleyProduction rhs
  c <- newSTRef =<< newSTRef mempty
  nr <- newSTRef Nothing
  return$ NonTerminal (Rule p nr c) (Pure id)
 Terminal t -> _

induceEarleyParser :: ListLike i t => (Alter Symbol a) -> i -> ST z (E.Result z e i a) 
induceEarleyParser ps xs = do
 qs <- observeSharing shareEarleyProduction ps
 s <- initialState (runAlter id qs)
 parse [s] [] [] (return ()) [] 0 xs

where Alter is the free alternative, and Symbol holds some grammatical primitives like terminals or wildcards, as well as a right-hand side that should be shared; observeSharing calls cacheByRef.

e.g. in the grammar:

root :: Alter Symbol Root
root = ReplaceWith <$> t"replace" <*> phrase <*> t"with" <*> phrase
   <|> Phrase      <$> phrase
   <|> ...

phrase = _

with this (non-monadic) flavor of observed sharing, each phrase (rather, the Earley Prod induced by it) shares the same STRefs, after calling induceEarleyParser on root.

I tried a simpler API with a MonadFix that observes sharing (like Earley's Grammar), but I couldn't convert it to an Earley Grammar.

I hope that made sense, I'm still figuring this all out.

p.s. here's some exposing-internals-on-general-principles propaganda https://www.reddit.com/r/haskell/comments/2uoton/edward_kmett_encapsulation_vs_code_reuse/ :-)

ollef commented 9 years ago

OK. I still think that should be possible just with Grammar but either way it won't hurt to expose the internals.

I'd be interested to see what you come up with later. :)

sboosali commented 9 years ago

thanks!

On Thursday, July 2, 2015, Olle Fredriksson notifications@github.com wrote:

Closed #8 https://github.com/ollef/Earley/issues/8.

— Reply to this email directly or view it on GitHub https://github.com/ollef/Earley/issues/8#event-346282064.

(this message was composed with dictation: charitably interpret typos)Sam Boosalis