Closed fredrik-bakke closed 1 year ago
While we're on the topic, you seem to have defined capital sigmas as special symbols as well. Can these not be part of a token? If so, I can't say I agree with that decision. For instance, we use them all the time in tokens in agda-unimath.
Sigma can be part of a token. Please refer to the LBNF file `Syntax.cf` in the rzk repository for the source of truth. Unfortunately, I can't always translate this properly into other syntax highlighting configs.
In particular, see this line:
It means that a token (for a variable) currently can contain anything except the following symbols:
`\`, `;`, `,`, `#`, `"`, `]`, `[`, `)`, `(`, `}`, `{`, `>`, `<`, `|`, space, `\t`, `\n`, `\r`
Additionally, the symbols `-`, `?`, `!`, and `.` are not allowed as the first symbol of a token.
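For illustration, the token rule described above can be sketched as a small Python check. This is only a sketch of the rule as stated in this comment, not the actual rzk lexer: the function name `is_valid_token` is hypothetical, and the whitespace entries of the list are assumed to be space, tab, newline and carriage return.

```python
# Characters that may not appear anywhere in a token, per the list above.
# Assumption: the whitespace entries are space, \t, \n and \r.
FORBIDDEN_ANYWHERE = set('\\;,#"][)(}{><| \t\n\r')

# Characters that, additionally, may not be the first symbol of a token.
FORBIDDEN_FIRST = set('-?!.')


def is_valid_token(s: str) -> bool:
    """Return True if `s` satisfies the token rule sketched above (hypothetical helper)."""
    if not s:
        return False  # an empty string is not a token
    if s[0] in FORBIDDEN_FIRST:
        return False
    return not any(ch in FORBIDDEN_ANYWHERE for ch in s)


if __name__ == "__main__":
    for candidate in ["Σ", "Σ-type", "x₁", "-x", "a,b"]:
        print(repr(candidate), is_valid_token(candidate))
```

Under this reading, names containing `Σ`, subscripts, or an interior `-` are accepted, while a leading `-` or an embedded `,` is rejected.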
Thanks for the explanation!
Although the subscript-one character `₁` is not highlighted as part of the preceding token, the typechecker considers it part of that token. This is because `\b` matches just before the subscript when it shouldn't. A couple of other characters with the same behavior are `-`, `+`, and `*`.
To fix this, I suggest writing custom patterns for matching word boundaries instead of using `\b`. As a start, here are the ones I use for Agda code:
`(?<=^|\\{\\!|\\{-\\#|[\\s.;{}()@\"])` for left boundaries
`(?=$|\\!\\}|[\\s.;{}()@\"])` for right boundaries
You probably don't want the hole or pragma parts, and maybe you don't want commas and some other characters in tokens either. Hence I suggest the following:
`(?<=^|[\\s.,;{}()<>@\"\\\\_|])` for left boundaries
`(?=$|[\\s.,;{}()<>@\"\\\\_|])` for right boundaries
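To see the difference in behavior, here is a rough Python sketch comparing `\b` with the suggested boundaries. Several things here are assumptions: the keyword `Unit` and the helper `keyword_pattern` are hypothetical, the patterns are un-escaped from their JSON form, and because Python's `re` only accepts fixed-width lookbehinds, the left boundary is restructured as `(?:^|(?<=...))`. A TextMate grammar would use the patterns as written above.

```python
import re

# Boundary character class from the suggestion above, un-escaped from JSON.
BOUNDARY_CHARS = r'[\s.,;{}()<>@"\\_|]'

# Python's `re` rejects variable-width lookbehinds, so `^` is moved outside
# the lookbehind; the two forms accept the same positions.
LEFT = rf'(?:^|(?<={BOUNDARY_CHARS}))'
RIGHT = rf'(?=$|{BOUNDARY_CHARS})'


def keyword_pattern(word: str, custom: bool = True) -> re.Pattern:
    """Build a pattern for `word` using custom boundaries or plain \\b."""
    body = re.escape(word)
    if custom:
        return re.compile(LEFT + body + RIGHT, re.MULTILINE)
    return re.compile(r'\b' + body + r'\b', re.MULTILINE)


if __name__ == "__main__":
    samples = ["Unit", "(Unit)", "Unit-elim", "is-Unit"]
    for text in samples:
        with_b = bool(keyword_pattern("Unit", custom=False).search(text))
        with_custom = bool(keyword_pattern("Unit", custom=True).search(text))
        print(f"{text!r}: \\b={with_b}, custom={with_custom}")
```

With `\b`, the hypothetical keyword is matched inside `Unit-elim` and `is-Unit`, since `-` is not a word character. With the custom boundaries, it is matched only when it stands alone or is delimited by one of the listed characters.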