eclipse-archived / ceylon-web-ide-backend

The Ceylon Web IDE
http://try.ceylon-lang.org/
Apache License 2.0

Syntax highlighting doesn't recognize nested multi-line comments #56

Closed lucaswerkmeister closed 10 years ago

lucaswerkmeister commented 11 years ago

Minor visual issue: In try.ceylon-lang.org,

/*
 * /*
 *  *  nested comment
 *  *  print("doubly commented");
 *  */
 *  print("commented");
 */
print("not commented");

only prints "not commented", but the last three lines of code are shown in black instead of only the last line.

Error might be roughly line 191 in ceylon.js.

(Apparently, GitHub also offers a "ceylon" syntax highlighting... with the same error. Is this also provided by this project?)

gavinking commented 11 years ago

The problem, I guess, is that both syntax highlighters, if I understand correctly, are based on regexes, and it is a well-known limitation of regexes that they can't match nested parens (or nested /* ... */s).
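To make the limitation concrete, here is a small sketch in JavaScript. The pattern below is an assumption: a typical non-nesting block-comment regex, not the one actually used by either highlighter. It shows the match ending at the first `*/`, so everything after the inner comment is treated as code.

```javascript
// Hypothetical pattern for illustration: match from "/*" to the FIRST "*/".
const blockComment = /\/\*[\s\S]*?\*\//;

// The snippet from the issue report, line by line.
const nestedSource = [
  '/*',
  ' * /*',
  ' *  * print("doubly commented");',
  ' *  */',
  ' * print("commented");',
  ' */',
  'print("not commented");',
].join('\n');

// The regex stops at the inner "*/", so the outer comment is cut short.
const matched = nestedSource.match(blockComment)[0];
const leftover = nestedSource.slice(matched.length);

// leftover still contains the line that should have been inside the comment.
console.log(JSON.stringify(leftover));
```

The leftover text still contains `print("commented");`, which matches the symptom reported above: the last three lines are shown as code instead of only the last one.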

(This is one of the reasons I'm so hostile to regexes: they can't handle the most basic thing in parsing.)

Actually, this might not be true for pygments, which is what GitHub uses. @chochos WDYT?

lucaswerkmeister commented 11 years ago

pygments uses regexes for comments: here (EDIT: I've filed a ticket for pygments here.)

However, I don't think that ceylon.js does: If you look at the line I linked (ceylon.js#191), you'll see what I think is a proper little parser that simply skips over everything that's not a / with a prior *; I think it should be possible to get it to keep track of the "comment level" (increase a counter on every ch == '*' && last == '/' and only set state.tokenize = jsTokenBase if counter is zero/one).
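The "comment level" idea can be sketched as a standalone function. This is not the code from ceylon.js; `skipNestedComment` and its signature are hypothetical, and it works on a whole string rather than on a CodeMirror stream. In a line-by-line tokenizer the counter would presumably have to live in the mode's state so that it survives line boundaries.

```javascript
// Hypothetical helper: returns the index just past the end of the block
// comment that starts at `start`, honoring nesting. Returns -1 if the
// comment is never closed.
function skipNestedComment(text, start) {
  if (!text.startsWith('/*', start)) {
    throw new Error('expected "/*" at index ' + start);
  }
  let depth = 1;      // we are inside one comment already
  let i = start + 2;  // skip the opening "/*"
  while (i < text.length) {
    if (text.startsWith('/*', i)) {
      depth++;        // a nested comment opens
      i += 2;
    } else if (text.startsWith('*/', i)) {
      depth--;        // one comment level closes
      i += 2;
      if (depth === 0) return i; // outermost comment is finished
    } else {
      i++;
    }
  }
  return -1;
}

const issueSnippet = [
  '/*',
  ' * /*',
  ' *  * print("doubly commented");',
  ' *  */',
  ' * print("commented");',
  ' */',
  'print("not commented");',
].join('\n');

const commentEnd = skipNestedComment(issueSnippet, 0);
console.log(JSON.stringify(issueSnippet.slice(commentEnd)));
```

Consuming both characters of each delimiter at once avoids counting a sequence like `/*/` as both an opener and a closer. With the counter in place, only the final `print("not commented");` line remains outside the comment.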

gavinking commented 11 years ago

Ah. I must admit I had never seen that before. I assumed it was using the same syntaxhighlighter we use on the rest of the website and in ceylondoc. Which raises the question of why we don't use that impl on the rest of the website.

chochos commented 11 years ago

CodeMirror has its own plug-in for syntax highlighting. I didn't even know there was already JS code for highlighting in the website. It might be possible to use that implementation, and the plug-in would only need to be the interface between that impl and what CodeMirror passes/expects.

gavinking commented 11 years ago

But it sounds like CodeMirror is actually doing a better job.

chochos commented 11 years ago

Well, no, because CodeMirror is the one used in try.ceylon-lang.org, which is where this issue occurs...

chochos commented 11 years ago

pygments is written in Python. We submitted the Ceylon syntax highlighter, based on the Java/Groovy syntax highlighter (and partly on the Python highlighter as well, for the multi-line strings). I wonder if the same happens for those languages:

/*
 * /*
 *  *  nested comment for JAVA
 *  *  System.out.print("doubly commented");
 *  */
 *  System.out.print("commented");
 */
System.out.print("not commented");
/*
 * /*
 *  *  nested comment for GROOVY
 *  *  print("doubly commented");
 *  */
 *  print("commented");
 */
print("not commented");

lucaswerkmeister commented 11 years ago

> I wonder if the same happens for those languages:

Java doesn't support nested multi-line comments, so for them this behavior is totally fine. I don't know about Groovy. EDIT: If I'm reading their ANTLR grammar right, Groovy doesn't support nested multi-line comments either.

lucaswerkmeister commented 10 years ago

I just noticed that nested comments also confuse the auto-indenter.

lucaswerkmeister commented 10 years ago

Thanks!

lucaswerkmeister commented 10 years ago

Just in case anyone is watching this issue because they’re interested in all Ceylon syntax highlighters, the pygments devs just fixed this as well: issue, commit.

chochos commented 10 years ago

Awesome! Thanks for the info!