ianh / owl

A parser generator for visibly pushdown languages.
MIT License

token ' ' can't be used as both a whitespace and a normal keyword #40

Closed: modulovalue closed this issue 1 year ago

modulovalue commented 1 year ago

Consider a grammar for smileys whose nose is an arbitrarily long run of spaces:

face = eyes nose mouth
eyes =
  ':' : normal
  ';' : winking
nose =
  ' '* : space
mouth =
  ')' : smile

The playground says:

error: token ' ' can't be used as both a whitespace and a normal keyword

    ' '* : space
    ~~~         

It's not quite clear to me why a nose can't be a space. Are space characters like ' ' used as whitespace by default? Is there a way to disable any default whitespace rules?

modulovalue commented 1 year ago

This is blocking #38

base = slsq | sldq | startmlsq | startmldq
slsq = "'"
sldq = '"'
startmlsq = "'''" (("\\"? " ")* "\\"? ("\n"|"\r"|"\r\n"))?
startmldq = '"""' (("\\"? " ")* "\\"? ("\n"|"\r"|"\r\n"))?

error: token ' ' can't be used as both a whitespace and a normal keyword

  startmlsq = "'''" (("\\"? " ")* "\\"? ("\n"|"\r"|"\r\n"))?
                            ~~~

error: token '
' can't be used as both a whitespace and a normal keyword

  startmlsq = "'''" (("\\"? "_")* "\\"? ("\n"|"\r"|"\r\n"))?
                                         ~~~~

ianh commented 1 year ago

You can configure whitespace handling using .whitespace (as described at https://github.com/ianh/owl/blob/master/doc/grammar-reference.md#configuring-whitespace). Owl isn’t really designed to work as a “scannerless parser”, though, so you may run into further difficulties with this approach. The intended way to add new token types is the .token feature, which requires writing some C code to recognize the token.
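
For illustration, here is a minimal sketch of how a `.whitespace` directive might be applied to the smiley grammar above. It assumes, based on the linked grammar reference, that `.whitespace` takes the list of strings to be skipped as whitespace. Keeping only `'\n'` in that list is a choice made for this example and does not come from the thread, and the grammar has not been checked in the playground.

```
face = eyes nose mouth
eyes =
  ':' : normal
  ';' : winking
nose =
  ' ' : space
mouth =
  ')' : smile

# Assumption: only newlines are skipped as whitespace here,
# which leaves ' ' free to be used as an ordinary keyword.
.whitespace '\n'
```

With a whitespace list that no longer contains `' '`, the conflict reported in the error message should not arise, because the space is then claimed only by the `nose` rule.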

modulovalue commented 1 year ago

Thank you for your help!

The following works as expected:

face = eyes nose mouth
eyes =
  ':' : normal
  ';' : winking
nose =
  ' ' : space
mouth =
  ')' : smile

.whitespace

(I apologize for not reading the manual!)