nextjournal / clojure-mode

Clojure/Script mode for CodeMirror 6
https://nextjournal.github.io/clojure-mode/
Eclipse Public License 2.0

Wrong regex for characters #22

Closed by MrEbbinghaus 2 years ago

MrEbbinghaus commented 2 years ago

The current regex for characters doesn't match all Clojure characters.

Character { "\\" (std.asciiLetter | std.digit | "@")+ }

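After the \, Clojure allows more than that: an octal escape (o followed by up to three octal digits), a Unicode escape (u followed by four hex digits), one of the named characters newline, space, tab, formfeed, backspace, or return, or any single non-whitespace character.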

Here you can find the code for the clojure.tools.reader implementation: https://github.com/clojure/tools.reader/blob/6bc1352113f7154b6e47b7941ab55f0c5e90517b/src/main/cljs/cljs/tools/reader.cljs#L140-L181

The (JavaScript) Regex for this is:

/\\(o[0-3]?[0-7]{1,2}|u[0-9a-fA-F]{4}|newline|space|tab|formfeed|backspace|return|\S)/
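
To make the difference concrete, here is a minimal sketch that runs a few tokens through both patterns. Two things in it are assumptions rather than part of the grammar or the reader: the regex above is anchored with ^ and $ so that a whole token has to match, and the current grammar rule is approximated as a plain JavaScript regex (CURRENT_RULE). The function names are hypothetical and only for illustration.

```javascript
// Anchored copy of the regex proposed above (anchors added for whole-token tests).
const PROPOSED =
  /^\\(o[0-3]?[0-7]{1,2}|u[0-9a-fA-F]{4}|newline|space|tab|formfeed|backspace|return|\S)$/;

// Assumption: a regex approximation of the current grammar rule
//   Character { "\\" (std.asciiLetter | std.digit | "@")+ }
const CURRENT_RULE = /^\\[A-Za-z0-9@]+$/;

// Hypothetical helpers, named here only for the comparison.
function matchesProposed(token) {
  return PROPOSED.test(token);
}

function matchesCurrentRule(token) {
  return CURRENT_RULE.test(token);
}

// Each sample is a full token, written with an escaped backslash.
const samples = ["\\a", "\\newline", "\\u03A9", "\\o377", "\\(", "\\;", "\\ab"];

for (const token of samples) {
  console.log(
    token.padEnd(10),
    "current:", matchesCurrentRule(token),
    "proposed:", matchesProposed(token)
  );
}
```

Under these assumptions, tokens such as \( and \; are rejected by the current rule but accepted by the proposed regex, while \ab is accepted by the current rule even though it is not one of the forms the proposed regex allows.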

I don't think this fixes https://github.com/nextjournal/clojure-mode/issues/9, and you should definitely look into this issue.

mk commented 2 years ago

Hey, thanks for looking into this!

I think we'd need to fix this in the lezer grammar around https://github.com/lezer-parser/clojure/blob/172cf311376271a95986978e7041cb7dbd3fdd57/src/clojure.grammar#L114.

Feel free to open a PR against that if you'd like to look into this. Otherwise we'll look into it, but I think it will take us a while. Thanks again.

MrEbbinghaus commented 2 years ago

What about https://github.com/nextjournal/clojure-mode/blob/master/src/nextjournal/clojure_mode/clojure.grammar ?

mk commented 2 years ago

That’s a copy of the lezer-clojure grammar, used only as a dev affordance. It would be best to fix it upstream, including tests, and then update it here in a second step.

MrEbbinghaus commented 2 years ago

@mk I opened a PR for this one: https://github.com/lezer-parser/clojure/pull/17

It also fixes https://github.com/nextjournal/clojure-mode/issues/9