rzk-lang / vscode-rzk

Visual Studio Code Extension(s) for Rzk proof assistant.
https://marketplace.visualstudio.com/items?itemName=NikolaiKudasovfizruk.rzk-1-experimental-highlighting
Other
8 stars 1 forks source link

Some characters are not highlighted as part of a token #10

Closed fredrik-bakke closed 1 year ago

fredrik-bakke commented 1 year ago

Although the subscript one character is not highlighted as part of the preceding token, the typechecker considers it part of it. This is because \b matches here when it shouldn't:

image

A couple of other characters with the same behavior are -+*.

To fix this, I suggest writing custom patterns for matching word boundaries instead of using \b. As a start, here are the ones I use for Agda code:

You probably don't want the hole or pragma parts, and also, maybe you don't want commas and some other characters in tokens. Hence I'll suggest the following:

  1. (?<=^|[\\s.,;{}()<>@\"\\\\_|]) for left boundaries
  2. (?=$|[\\s.,;{}()<>@\"\\\\_|]) for right boundaries
fredrik-bakke commented 1 year ago

While on topic, you seem to have defined capital sigmas as special symbols as well. Can these not be part of a token? If so I can't say I agree with that decision. For instance, we use them all the time in tokens in agda-unimath.

fizruk commented 1 year ago

Sigma can be a part of a token, please, refer to the LBNF file Syntax.cf in the rzk repository for the source of truth. Unfortunately, I can't always translate this properly into other syntax highlighting configs.

fizruk commented 1 year ago

In particular, see this line:

https://github.com/fizruk/rzk/blob/936c15a37f52d77fac436558b5fbc18e31ea4e40/rzk/src/Language/Rzk/Syntax.cf#L6

It means that a token (for a variable) currently can contain anything except the following symbols:

Additionally, symbols - ? ! . are not allowed to as the first symbol of a token.

fredrik-bakke commented 1 year ago

Thanks for the explanation!