Allowing other characters in a token

bastikr / boolean.py

Implements boolean algebra in one module.

BSD 2-Clause "Simplified" License

Allowing other characters in a token #86

Open carpie opened 5 years ago

carpie commented 5 years ago

When using UUIDs as tokens, they are rejected because of the - character in them. I can subclass BooleanAlgebra and override tokenize, but that is a lot of duplication just to allow one additional character in a token. It would be nice if one could specify the allowable character set.
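
For illustration, here is a minimal sketch of what a configurable character set could look like. It is a standalone toy tokenizer, not boolean.py's actual tokenize method: the names make_tokenizer, extra_chars and OPERATORS, the token type strings, and the choice to let a symbol start with a digit (so that UUIDs fit) are all assumptions made for this example.

```python
import re

# Hypothetical operator table for this sketch only; it does not mirror
# boolean.py's internal token table.
OPERATORS = {
    "&": "AND",
    "|": "OR",
    "~": "NOT",
    "(": "LPAR",
    ")": "RPAR",
}


def make_tokenizer(extra_chars=""):
    """Return a tokenize(expr) function whose symbols may contain extra_chars."""
    # Refuse extra characters that would be ambiguous with an operator.
    clash = set(extra_chars) & set(OPERATORS)
    if clash:
        raise ValueError(f"extra characters clash with operators: {sorted(clash)}")

    # A symbol starts with a letter, digit or underscore (assumption of this
    # sketch), and may continue with '.', ':' or any caller-supplied character.
    symbol_re = re.compile(
        r"[A-Za-z0-9_][A-Za-z0-9_.:" + re.escape(extra_chars) + r"]*"
    )

    def tokenize(expr):
        """Return a list of (token_type, token_string, position) tuples."""
        tokens = []
        pos = 0
        while pos < len(expr):
            char = expr[pos]
            if char.isspace():
                pos += 1
            elif char in OPERATORS:
                tokens.append((OPERATORS[char], char, pos))
                pos += 1
            else:
                match = symbol_re.match(expr, pos)
                if match is None:
                    raise ValueError(
                        f"unexpected character {char!r} at position {pos}"
                    )
                tokens.append(("SYMBOL", match.group(), pos))
                pos = match.end()
        return tokens

    return tokenize
```

With this shape, make_tokenizer() rejects "123e4567-e89b & ~b" at the hyphen, while make_tokenizer("-") reads "123e4567-e89b" as a single symbol. The character set becomes a one-argument choice instead of a copied tokenize body.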

pombredanne commented 4 years ago

Thanks for this, and sorry for the late reply and review. It kinda makes sense... The rationale for only allowing certain characters is that tokens can then be used as Python-level identifiers, and to avoid possible collisions with the short-form operators (~+| ... etc). In practice this is not a big requirement IMHO. In fact, in https://github.com/nexB/license-expression/ we accept any characters in tokens and have implemented a few custom tokenizers too.
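
To make the two concerns concrete, here is a small sketch using only the Python standard library. The helper names and the SHORT_OPERATORS set are hypothetical, and the set only lists the three characters named above, not the library's full operator table.

```python
# Only the short-form operator characters named in the comment above;
# the real library may define more (assumption of this sketch).
SHORT_OPERATORS = set("~+|")


def usable_as_identifier(token):
    """True if the token could be used as a Python-level identifier."""
    return token.isidentifier()


def collides_with_operator(token):
    """True if the token contains a short-form operator character."""
    return any(char in SHORT_OPERATORS for char in token)


uuid_token = "123e4567-e89b-12d3-a456-426614174000"

# The hyphens (and the leading digit) rule the UUID out as an identifier,
# yet it contains no operator character, so it cannot be misread as one.
print(usable_as_identifier(uuid_token))
print(collides_with_operator(uuid_token))

# A token such as "a+b" is the ambiguous case the restriction guards against.
print(collides_with_operator("a+b"))
```

So a UUID fails the identifier rule but poses no ambiguity with these operators, which is consistent with treating the restriction as optional.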