haskell / alex

A lexical analyser generator for Haskell
https://hackage.haskell.org/package/alex
BSD 3-Clause "New" or "Revised" License

Add functionality for using typeclasses in lexers #68

Closed emc2 closed 8 years ago

emc2 commented 9 years ago

This pull request adds functionality for generating lexers that are amenable to the use of type classes. There are two primary usage patterns for this: (1) type classes in the token types, and (2) type classes in monads.

This patch adds three new directives to the lexer file format: %token, %typeclass, and %action. %action is allowed only if no wrapper directive is given, and specifies the type of actions. %token is allowed only if a wrapper directive is given, and specifies the type of the tokens (since the wrapper type implies a particular type for actions). When either directive is given, type signatures are emitted in the generated code.

Additionally, the %typeclass directive is allowed if %action or %token is given, and specifies one or more type classes that will be present in the type signatures.
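As a rough illustration of how the directives described above might fit together, here is a minimal lexer specification sketch using the "basic" wrapper. The quoted-string form of the directive arguments, the module name Lexer, the Token type, and the scanInts helper are assumptions made for this example and are not taken from the patch itself.

```alex
{
-- Sketch only: the directive arguments are written as quoted strings,
-- which is an assumption about the syntax this patch introduces.
module Lexer where
}

%wrapper "basic"
%token "Token s"
%typeclass "Read s"

tokens :-

  $white+    ;
  [0-9]+     { literal }

{
-- Hypothetical token type, parameterised over the literal representation.
data Token s = Literal s

-- With the "basic" wrapper, an action receives the matched String.
literal :: Read s => String -> Token s
literal = Literal . read

-- The caller fixes s; alexScanTokens is expected to carry the Read s context.
scanInts :: String -> [Token Int]
scanInts = alexScanTokens
}
```

Here %token supplies the token type because a wrapper is present; a lexer with no wrapper would instead give the full action type with %action.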

This patch handles all documented wrappers, as well as lexers with no wrapper, including the bytestring variations. It makes a minor alteration to the alex_accept array, splitting the actions out into a separate array (this is necessary to avoid an ambiguous type variable that arises when both type classes and contexts are used).

With this patch, it is possible to generate a lexer with the following action:

    literal :: Read s => String -> Token s
    literal = Literal . read

With the current version of Alex, this would generate compile errors. It is also possible to do something like this (with a home-made monadic lexer):

    badChars :: MonadLexerErrors m => AlexPosn -> String -> m Token
    badChars pos chars = do
      reportError $ "invalid characters " ++ chars ++ " at " ++ show pos
      alexMonadScan

This use of monad type classes is particularly useful, as it allows the same lexer to work in a wide variety of contexts.
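To make the "variety of contexts" point concrete, here is a small self-contained sketch that runs one class-constrained action in two different monads. The definition of MonadLexerErrors is hypothetical (the PR names the class but does not show it), positions are plain (line, column) pairs rather than AlexPosn, and the action returns () instead of rescanning, so this only illustrates the pattern and is not the lexer from the patch.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import System.IO (hPutStrLn, stderr)

-- Hypothetical class: an error-reporting capability that any monad may provide.
class Monad m => MonadLexerErrors m where
  reportError :: String -> m ()

-- Context 1: print errors to stderr as they occur.
instance MonadLexerErrors IO where
  reportError = hPutStrLn stderr

-- Context 2: collect errors purely, using the writer-like monad on pairs.
instance MonadLexerErrors ((,) [String]) where
  reportError msg = ([msg], ())

-- Stand-in for a lexer action; it is written once against the class only.
badChars :: MonadLexerErrors m => (Int, Int) -> String -> m ()
badChars pos chars =
  reportError ("invalid characters " ++ chars ++ " at " ++ show pos)

-- Running the same action in the pure, collecting context.
collected :: ([String], ())
collected = badChars (1, 5) "@#" >> badChars (2, 1) "$"
```

Evaluating fst collected gives the two messages in order, while calling badChars in IO prints them instead; the action itself is unchanged in both cases.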

emc2 commented 9 years ago

This patch has been confirmed to work correctly with a complex lexer specification. The source can be found here: https://github.com/emc2/saltlang/blob/master/src/library/Language/Salt/Surface/Lexer.x

simonmar commented 9 years ago

This looks like a pretty epic patch, thanks for doing all the work. We also need updates to the docs for the new directives, could you do that please?

emc2 commented 9 years ago

Sure. I have no experience with docbook, though. Still, I'll give it a shot.

emc2 commented 9 years ago

Added a first draft of the documentation for the new features

emc2 commented 9 years ago

Rebased against new commits.

simonmar commented 9 years ago

One more thing - could you add tests for the new features please?

emc2 commented 9 years ago

Added tests.

emc2 commented 9 years ago

Are there any remaining issues with this PR?

emc2 commented 8 years ago

Rebased the changes onto the new commits.

simonmar commented 8 years ago

Thanks :)