Closed JoAllg closed 2 years ago
@JoAllg:
however if I add another token rule
token Nat digit+ ;
above the "Int" rule, then the positive Number1
in my test is not recognized!
This is a general effect with lexical analysis: Number 1
matches both Int
and Nat
, so the ambiguity is resolved by priority. Since Nat
comes before Int
, it gets the priority for equally long matches. Number -5
only matches Int
, so it is still accepted.
If you want both Nat
and Int
, I recommend to define Int
as a category (via a grammar rule) and not as token
type, e.g.:
Pos. Int ::= Nat;
Neg. Int ::= "-" Nat;
PS: btw, why are
Integer
andDouble
defined as positive numbers? Seems strange to me or I do not know how you would use this definitions to work with negative numbers.
See above, the -
can be handled in the grammar.
It wasn't my design choice, but restricting to positive numbers seems more robust. Unary minus -x
can be awkward to have if there is also binary minus (E.g. 0 - x
). E.g., in GHC there is the somewhat weird asymmetry:
ghci> :t (- 1)
(- 1) :: Num a => a
ghci> :t (+ 1)
(+ 1) :: Num a => a -> a
Here, (+ 1)
is the function that adds 1
, but (- 1)
is the literal -1
.
thanks for the reply. I didn't realize that i itnroduced ambiguity in the definition.
So even if I remove the Nat
toke rule, the ambiguity will still exist with regards to the basic type Integer
, right?
Also thanks for the recomendation, I wanted to reduce the introduction of new tokens. But so be it :-)
So even if I remove the
Nat
toke rule, the ambiguity will still exist with regards to the basic typeInteger
, right?
Yes, but the Integer
definition will not be included in the generated lexer unless you mention Integer
in your grammar.
The advantage of built-in Integer
over self-defined Nat
is that in the syntax tree you get an actual number (Haskell type Integer
) where as for Nat
you would get a string that you have to convert into a number (newtype Nat = Nat String
).
consider the following defintion
test.cf
with a token rule named "Int":There is no problem with this definition, i.e
however if I add another token rule
token Nat digit+ ;
above the "Int" rule, then the positive Number1
in my test is not recognized! The negative number-5
is fine though.This does not seem to happen with other rules on top of the
Nat
rule, e.g.token Hex '0''x' ( digit | ["abcdefABCDEF"] )+ ;
PS: btw, why are Integer and Double defined as positive numbers? Seems strange to me or I do not know how you would use this definitions to work with negative numbers.