BetterThanTomorrow / calva

Clojure & ClojureScript Interactive Programming for VS Code
https://marketplace.visualstudio.com/items?itemName=betterthantomorrow.calva
Other
1.6k stars 213 forks source link

Textmate grammar mishandles `|` pipe symbol in identifiers #1973

Open skylize opened 1 year ago

skylize commented 1 year ago

The grammar's handling of | in identifiers does not align with the Reader, nor with the stance held by the Clojure Team regarding its validity.

Officially | is not a valid character in identifiers, as it is reserved for future (and already partially planned) use by the language. I personally hold the position that Calva should treat it as an invalid character, and indicate such in the grammar. This would help to prevent accidental usage by Calva users, reducing friction to valuable future language upgrades.

A reasonable alternative position is that we should treat | just like the Reader does. But current behavior does not do that either:

; As expected, treated as symbol `foo`.
foo 

; Reader sees symbol `foo|bar`. 
; Calva sees symbol `foo`, unknown char, symbol `bar`
foo|bar 

; As expected, treated as keyword `:foo`.
:foo

; Reader sees keyword `:foo|bar`
; Calva sees unknown char `:`, symbol `foo`, unknown char `|`, symbol `bar`
:foo|bar
PEZ commented 1 year ago

It's a goal with the grammar and parsing in Calva that the Reader hold the truth. So this is a bug.

It only really affects the look of that symbol, and only if clojure-lsp is not serving. Nothing in Calva's behaviour is dependent on these tokens. For that we have other grammar. That grammar agrees with the Reader on how | should be treated.

skylize commented 1 year ago

and only if clojure-lsp is not serving

Not sure what you mean by this. Even with LSP running, a keyword gets recolored to look like a symbol.

PEZ commented 1 year ago

This is what I mean:

Inspecting the tokenization we can see it spelled out:

image
skylize commented 1 year ago

Hmmm. That is different from the behavior on my system. Clojure LSP serving does not fix the issue like your screenshot shows.

PEZ commented 1 year ago

You probably have semantic tokens disabled.

skylize commented 1 year ago

Verified what you described, that semantic tokens override the Textmate grammar with the correct Reader interpretation. My theme for some reason disables it on my behalf if editor.semanticHighlighting.enabled is set to configuredByTheme. So this issue is limited to the Textmate grammar (which also applies if LSP is not running).

That would seem to reduce the already fairly low priority. Except I also found the same issue with & which is considered legal by both the Reader and the Clojure Team. While obviously not the most pressing concern, it would definitely still be preferable if the Textmate grammar could properly deal with legal non-word-characters in identifiers.

:foo&bar  ; Textmate scopes color as 2 symbols and an unknown char
PEZ commented 1 year ago

We should fix both. It's probably pretty easy.