highlightjs / highlight.js

JavaScript syntax highlighter with language auto-detection and zero dependencies.
https://highlightjs.org/
BSD 3-Clause "New" or "Revised" License

Line-breaks after JS, TS, and CSS comments are being stripped #3951

Closed mlaposta closed 8 months ago

mlaposta commented 9 months ago

Describe the issue/behaviour that seems buggy

Highlighting of comments in JavaScript, TypeScript, and CSS blocks doesn't work correctly: line breaks after the commented line seem to get removed.

More specifically, line breaks after semicolons and after opening or closing brackets appear to be respected; all other line breaks are removed.

Example output:

[Image: Screenshot 2023-12-28 at 09.24.36]

Sample Code or Instructions to Reproduce

// Test comment on line with no code
const test = 'test';
const another_test = "this line has no ending semi-colon"

if (test) {              // Test comment on line with code
  console.log('test')
}

Expected behaviour

Line breaks after comments, and after code that ends without a semicolon or bracket, should be respected.

Additional context

I haven't tested other languages, so I'm not sure which others have the same issue. I saw there used to be a "useBR": true option for the configure method, but that doesn't seem to exist anymore.

I tried to check if there were any other config options I might be missing, but didn't find any. Am I missing something?

I'm calling highlight.js via markdown-it, as such:

const md = markdownIt({
  html: true,
  xhtmlOut: true,
  linkify: false,
  typographer: true,
  breaks: true,
  highlight: function (str, lang) {
    if (lang && hljs.getLanguage(lang)) {
      try {
        return (
          '<pre class="hljs"><code>' +
          hljs.highlight(str, { language: lang }).value +
          '</code></pre>'
        );
      } catch (__) {}
    }

    return (
      '<pre class="hljs"><code>' + md.utils.escapeHtml(str) + '</code></pre>'
    );
  },
});

If I remove hljs from the equation, the comments are treated fine (without any highlighting of course). So the issue definitely appears to be somewhere in the hljs code.

joshgoebel commented 9 months ago

line-breaks after the commented line seem to get removed.

Please specify exactly what you mean by "line breaks". It sounds like you are referring to <br> tags inside your textual content?

mlaposta commented 9 months ago

No html tags used - I posted an example in my original post under the Sample Code or Instructions to Reproduce section.

By line break I mean literally having a new line.

Here's another example, without any formatting in case it makes it clearer:


// this is a comment

let someVar = "this is on another line";


If I put the above in a code block with the "js" or "ts" identifier and have it parsed by hljs, the comment and the statement end up on the same line instead of on 3 lines as shown above, just like in the "Example output:" I supplied in the original post.

Hope that helps.

joshgoebel commented 9 months ago

If I remove hljs from the equation, the comments are treated fine

Please remove markdown-it from the equation... a jsfiddle that reproduces this with ONLY a raw code block (so we can see the actual content) and hljs would be helpful. You can use this template: https://jsfiddle.net/ajoshguy/cjhvre2k/26/

We need to be shown the bug is in our core library.


That said, you could debug your own code further. In your example you'd probably want to drop in some debugging console.logs to find out what str is being passed to our library itself (possibly after markdown-it has mucked with it) and what hljs.highlight(str, { language: lang }).value is returning before any other processing happens...

I think if you did that you'd see clear "broken" line encodings... perhaps markdown is converting all the \n to <br>... which isn't going to work for us... Our library only works with \n style newlines...
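
One way to do the check suggested above is sketched here. It is only an illustration: describeLineBreaks is a hypothetical helper, not part of highlight.js or markdown-it, and it simply counts the kinds of line breaks in whatever string it is given.

```javascript
// Hypothetical debugging helper (not part of highlight.js or markdown-it).
// Reports how many "\n", "\r" and <br> line breaks a string contains.
function describeLineBreaks(label, text) {
  const count = (pattern) => (text.match(pattern) || []).length;
  const report = {
    label,
    lf: count(/\n/g),          // "\n" newlines, the only kind hljs works with
    cr: count(/\r/g),          // carriage returns, e.g. from "\r\n" input
    br: count(/<br\s*\/?>/gi), // <br>, <br/> or <br /> tags
  };
  console.log(JSON.stringify(report));
  return report;
}

// Intended use inside the markdown-it `highlight` callback from the
// original post (left as comments because it needs hljs to be loaded):
//   describeLineBreaks('input', str);
//   describeLineBreaks('output', hljs.highlight(str, { language: lang }).value);

// Standalone check with the sample from this thread.
describeLineBreaks('sample', '// this is a comment\n\nlet someVar = "x";\n');
```

If the "input" report already shows <br> tags or no "\n" at all, the text was altered before it reached hljs. If both reports show the same number of "\n", any lost line breaks must come from a later step.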

mlaposta commented 9 months ago

Thanks for your diligence, Josh. You were correct that the issue was not related to hljs.

I had to do some deep tracing, but I finally pinpointed where the issue was coming from: it was due to the react-html-parser lib I was using for some needed HTML element swapping.

I initially made the false deduction that hljs was to blame due to the line breaks remaining intact after removing hljs, but it's definitely the react-html-parser lib that was doing it.

Thanks again - you can close this one off!
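
For anyone hitting something similar, the tracing described above can be sketched roughly as follows. It assumes each processing step can be wrapped as a string-to-string function. The helper names and the two stand-in stages are hypothetical and for illustration only; they do not reproduce the real behaviour of hljs or react-html-parser.

```javascript
// Count "\n" characters in a string.
const countNewlines = (text) => text.split('\n').length - 1;

// Run `input` through each [name, transform] stage in order and return the
// name of the first stage whose output has fewer newlines than its input.
// Returns null if no stage drops a newline.
function findStageDroppingNewlines(stages, input) {
  let current = input;
  for (const [name, transform] of stages) {
    const next = transform(current);
    if (countNewlines(next) < countNewlines(current)) return name;
    current = next;
  }
  return null;
}

// Stand-in stages (assumptions): in a real project these would wrap the
// actual highlighting call and the actual HTML post-processing step.
const stages = [
  ['highlight', (s) => `<span>${s}</span>`],
  ['postProcess', (s) => s.replace(/\n/g, '')],
];

console.log(findStageDroppingNewlines(stages, '// comment\nlet x = 1;\n'));
// prints "postProcess"
```

Checking every stage, rather than removing one library and comparing the final output, avoids the false deduction described above: a stage can look guilty only because a later stage reacts to its output.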