mrkkrp / parser-combinators

Lightweight package providing commonly useful parser combinators

[Proposal] Add manyEndingWith #56

Open lsmor opened 1 year ago

lsmor commented 1 year ago

Hi! Would you consider adding manyEndingWith (name subject to change)? The code would look like this:

-- Adapted from manyTill_. Takes a parser p and a finalizer end, and returns the list of
-- all elements parsed with p plus (!) the element parsed with end.
manyEndingWith :: MonadPlus m => m a -> m a -> m [a]
manyEndingWith p end = go id
  where
    go f = do
      done <- optional end
      case done of
        Just done' -> return $ f [done']
        Nothing -> do
          x <- p
          go (f . (x :))

This is particularly useful when parsing eof. For example, in megaparsec this code hangs forever:

-- This hangs forever. But I don't know why.
my_parser = (True <$ symbol ";") <|> (True <$ symbol "|") <|> (False <$ eof)
parse (many my_parser) "" ";|"
> hangs forever...

Ideally the last example would return Right [True, True, False], but I don't think that is possible with the current combinators. With the new combinator, the example could be rewritten as:

my_parser = (True <$ symbol ";") <|> (True <$ symbol "|")
parse (manyEndingWith my_parser (False <$ eof)) "" ";|"
> Right [True, True, False]
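
For anyone who wants to try the proposed combinator without installing megaparsec, here is a minimal sketch that uses ReadP from base as a stand-in. The names step and demo are hypothetical, and char replaces megaparsec's symbol, so this only approximates the example above. Note that ReadP's <|> explores both branches rather than being left-biased; that makes no difference here because eof and step never succeed at the same position. As for the hang in the many version, the likely cause is that False <$ eof succeeds at end of input without consuming anything, so many can re-run it indefinitely.

```haskell
import Control.Applicative (optional, (<|>))
import Control.Monad (MonadPlus)
import Text.ParserCombinators.ReadP (ReadP, char, eof, readP_to_S)

-- Proposed combinator, copied from the snippet in this issue.
manyEndingWith :: MonadPlus m => m a -> m a -> m [a]
manyEndingWith p end = go id
  where
    -- f is a difference list, so each element is added in constant time.
    go f = do
      done <- optional end
      case done of
        Just done' -> return (f [done'])
        Nothing -> do
          x <- p
          go (f . (x :))

-- Hypothetical stand-in for my_parser, using ReadP's char instead of symbol.
step :: ReadP Bool
step = (True <$ char ';') <|> (True <$ char '|')

-- Run on a string, returning every (result, remaining input) pair.
demo :: String -> [([Bool], String)]
demo = readP_to_S (manyEndingWith step (False <$ eof))
```

In GHCi, demo ";|" evaluates to [([True,True,False],"")], and an input such as ";x" yields no parse at all.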

I know manyTill_ exists, but it returns m ([a], end), forcing you to append end to the end of the list (if a ~ end), which is inefficient for linked lists.
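
For comparison, the manyTill_ workaround described above might look like the sketch below. To keep it runnable with base only, manyTill_ is defined locally with the type quoted in this thread; it is assumed, not guaranteed, to behave like the library version. manyTillSnoc, step and demoSnoc are hypothetical names, and ReadP again stands in for megaparsec.

```haskell
import Control.Applicative (optional, (<|>))
import Control.Monad (MonadPlus)
import Text.ParserCombinators.ReadP (ReadP, char, eof, readP_to_S)

-- Local copy of manyTill_ (assumption: same behaviour as the library's):
-- collect results of p until end succeeds; return end's result separately.
manyTill_ :: MonadPlus m => m a -> m end -> m ([a], end)
manyTill_ p end = go id
  where
    go f = do
      done <- optional end
      case done of
        Just done' -> return (f [], done')
        Nothing -> do
          x <- p
          go (f . (x :))

-- Hypothetical wrapper: append the terminator's result to the list.
-- The (++) walks the whole list once more after parsing has finished.
manyTillSnoc :: MonadPlus m => m a -> m a -> m [a]
manyTillSnoc p end = (\(xs, e) -> xs ++ [e]) <$> manyTill_ p end

-- Hypothetical stand-in for my_parser.
step :: ReadP Bool
step = (True <$ char ';') <|> (True <$ char '|')

demoSnoc :: String -> [([Bool], String)]
demoSnoc = readP_to_S (manyTillSnoc step (False <$ eof))
```

The result is the same list as with manyEndingWith; the only difference is the single extra traversal performed by (++) at the end.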

lsmor commented 5 months ago

Hi there!!

Are you accepting PRs? From the conversation in #31, it seems that you prefer to keep the library small :).

mrkkrp commented 5 months ago

This case seems to be quite particular. I agree that adding an element to the end of a linked list is inefficient, but then there could be many ways to avoid building a list at all (e.g. by counting the number of symbols matched before eof instead of building a list), or to avoid eof being represented by an element in the list. All this seems to belong to the area of creativity that concerns the code of a particular parser, not a library of common combinators.
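
As an illustration of the counting alternative mentioned above, here is one possible sketch. countTill is a hypothetical helper, not part of parser-combinators, and ReadP from base stands in for megaparsec so that the example needs no extra packages; step and demoCount are likewise made-up names.

```haskell
import Control.Applicative ((<|>))
import Control.Monad (MonadPlus)
import Text.ParserCombinators.ReadP (ReadP, char, eof, readP_to_S)

-- Hypothetical helper: count how many times p matches before end succeeds,
-- without building an intermediate list.
countTill :: MonadPlus m => m a -> m end -> m Int
countTill p end = go 0
  where
    -- Try the terminator first; otherwise consume one p and bump the counter.
    go n = (n <$ end) <|> (p *> (go $! n + 1))

-- Hypothetical stand-in for my_parser; the result of each match is ignored.
step :: ReadP Char
step = char ';' <|> char '|'

demoCount :: String -> [(Int, String)]
demoCount = readP_to_S (countTill step eof)
```

Here demoCount ";|" gives [(2,"")], so eof never has to be represented as a list element.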