lucaswerkmeister closed 10 years ago
The problem, I guess, is that both syntax highlighters, if I understand correctly, are based on regexes, and it is a well-known limitation of regexes that they can't match nested parens (or nested /* ... */s).
(This is one of the reasons I'm so hostile to regexes: they can't handle the most basic thing in parsing.)
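To make the limitation concrete, here is a minimal sketch in JavaScript. The pattern below is a hypothetical one written for this example (an assumption, not the regex that ceylon.js or pygments actually uses); it is the usual non-greedy block-comment pattern, and it ends the match at the first */ it finds, which is the end of the inner comment.

```javascript
// Hypothetical block-comment pattern, written for illustration only;
// it is NOT copied from ceylon.js or pygments.
const blockComment = /\/\*[\s\S]*?\*\//;

const source = "/* outer /* inner */ still outer */ code();";

// The non-greedy match stops at the first "*/", so the tail of the
// outer comment is left over and would be highlighted as code.
const matched = source.match(blockComment)[0];
const rest = source.slice(matched.length);

console.log(JSON.stringify(matched)); // "/* outer /* inner */"
console.log(JSON.stringify(rest));    // " still outer */ code();"
```

Making the pattern greedy does not help either: it would then run to the last */ in the input and swallow any real code between two separate comments.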
Actually, this might not be true for pygments, which is what GitHub uses. @chochos WDYT?
pygments uses regexes for comments: here (EDIT: I've filed a ticket for pygments here.)
However, I don't think that ceylon.js does: if you look at the line I linked (ceylon.js#191), you'll see what I think is a proper little parser that simply skips over everything that's not a / with a prior *; I think it should be possible to get it to keep track of the "comment level" (increase a counter on every ch == '*' && last == '/' and only set state.tokenize = jsTokenBase if counter is zero/one).
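The "comment level" idea could look roughly like the sketch below. This is a standalone illustration, not a patch for ceylon.js: the function name skipNestedComment and its return shape are hypothetical, and it scans a whole string, whereas a CodeMirror mode is fed the text one line at a time, so there the depth would presumably have to live in the mode's state object.

```javascript
// Standalone sketch of depth-counted comment skipping.
// skipNestedComment and its return shape are hypothetical names for this
// example; this is not the tokenizer from ceylon.js.
// Assumes text has "/*" at position start.
function skipNestedComment(text, start) {
  let depth = 0;
  let i = start;
  while (i < text.length) {
    if (text.startsWith("/*", i)) {
      depth++;            // one level deeper
      i += 2;             // consume both characters, so "/*/" is not also a close
    } else if (text.startsWith("*/", i)) {
      depth--;            // one level back out
      i += 2;
      if (depth === 0) {
        return { end: i, depth: 0 }; // the outermost comment is closed here
      }
    } else {
      i++;
    }
  }
  // Ran out of input while still inside a comment.
  return { end: text.length, depth: depth };
}

const sample = "/* a /* b */ c */ code";
const result = skipNestedComment(sample, 0);
console.log(JSON.stringify(sample.slice(result.end))); // " code"
```

Only when the depth is back at zero would the tokenizer return to normal code (the state.tokenize = jsTokenBase step mentioned above); an unterminated comment leaves a non-zero depth to carry over to the next line.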
Ah. I must admit I had never seen that before. I assumed it was using the same syntaxhighlighter we use on the rest of the website and in ceylondoc. Which raises the question of why don't we use that impl on the rest of the website?
CodeMirror has its own plug-in for syntax highlighting. I didn't even know there was already JS code for highlighting on the website; it might be possible to use that implementation, and the plugin would only need to be the interface between that impl and what CodeMirror passes/expects.
But it sounds like CodeMirror is actually doing a better job.
Well, no, because CodeMirror is the one used in try.ceylon-lang.org, which is the issue here...
pygments is written in Python. We submitted the Ceylon syntax highlighter, based on the Java/Groovy syntax highlighter (and partly on the Python highlighter as well, for the multiline strings). I wonder if the same happens for those languages:
/*
* /*
* * nested comment for JAVA
* * System.out.print("doubly commented");
* */
* System.out.print("commented");
*/
System.out.print("not commented");
/*
* /*
* * nested comment for GROOVY
* * print("doubly commented");
* */
* print("commented");
*/
print("not commented");
"I wonder if the same happens for those languages:"
Java doesn't support nested multi-line comments, so for them this behavior is totally fine. I don't know about Groovy. EDIT: If I'm reading their ANTLR grammar right, Groovy doesn't support nested multi-line comments either.
I just noticed that nested comments also confuse the auto-indenter.
Thanks!
Awesome! thanks for the info!
Minor visual issue: In try.ceylon-lang.org,
only prints "not commented", but the last three lines of code are shown in black instead of only the last line.
Error might be roughly line 191 in ceylon.js. (Apparently, GitHub also offers a "ceylon" syntax highlighting... with the same error. Is this also provided by this project?)