Open carpie opened 5 years ago
Thanks for this, and sorry for the late reply and review. It kind of makes sense: the rationale for only allowing certain characters is that tokens could then be used as Python-level identifiers, and to avoid possible collisions with short-form operators (~, +, |, etc.). In practice this is not a big requirement IMHO. In fact, in https://github.com/nexB/license-expression/ we accept any characters in tokens and have implemented a few custom tokenizers too.
When using UUIDs in tokens, the tokens are rejected because of the `-` character in them. I can subclass `BooleanAlgebra` and override `tokenize`, but it is a lot of duplication for allowing an additional character in the token. It would be nice if one could specify the allowable character set.
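To make the request concrete, here is a rough standalone sketch of a tokenizer whose allowed symbol characters are configurable. It does not use or reproduce the library's code: `make_tokenizer`, `extra_symbol_chars`, the token kind strings, and the default character set (word characters plus `.` and `:`) are all hypothetical names and assumptions chosen for illustration.

```python
import re

# Hypothetical token kinds for this sketch; these are not the library's constants.
OPERATORS = {"&": "AND", "|": "OR", "~": "NOT", "(": "LPAR", ")": "RPAR"}


def make_tokenizer(extra_symbol_chars=""):
    """Build a tokenize(expr) generator function.

    Symbols may contain word characters, '.' and ':' (an assumed default),
    plus every character listed in extra_symbol_chars.
    """
    # re.escape keeps characters such as '-' literal inside the character class.
    symbol_class = r"\w.:" + re.escape(extra_symbol_chars)
    token_re = re.compile(rf"(?P<op>[&|~()])|(?P<sym>[{symbol_class}]+)")

    def tokenize(expr):
        pos = 0
        while pos < len(expr):
            if expr[pos].isspace():
                pos += 1
                continue
            match = token_re.match(expr, pos)
            if match is None:
                raise ValueError(
                    f"Unexpected character {expr[pos]!r} at position {pos}"
                )
            if match.lastgroup == "op":
                yield OPERATORS[match.group()], match.group(), pos
            else:
                yield "SYMBOL", match.group(), pos
            pos = match.end()

    return tokenize


# Allowing '-' lets a UUID pass through as a single symbol.
tokenize_with_dash = make_tokenizer(extra_symbol_chars="-")
for token in tokenize_with_dash("123e4567-e89b-12d3-a456-426614174000 & other"):
    print(token)
```

With the default `make_tokenizer()`, the same expression raises `ValueError` at the first `-`, which is the behaviour described above. Something along these lines, exposed as an option, would avoid having to copy the whole `tokenize` method just to permit one more character.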