highlightjs / highlight.js

JavaScript syntax highlighter with language auto-detection and zero dependencies.
https://highlightjs.org/
BSD 3-Clause "New" or "Revised" License

Uppercase alias ? (can't set "C#" alias for class="language-C#") #3967

Closed softlion closed 4 months ago

softlion commented 5 months ago

I have this HTML, imported into my page from GitHub:

<code class="language-C#">
 ...
</code>

and this local HTML:
...
    <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/highlight.min.js"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/languages/csharp.min.js"></script>
    <script>
    hljs.registerAliases('C#', { languageName: 'csharp' });
    hljs.highlightAll();
    </script>

It looks like the C# alias can't be used in uppercase. The source code of highlightAll shows that it makes all aliases lowercase before adding them to the alias collection.

As a result, the C# code is not detected. In the rendered HTML I see the hljs class and the hljs-c class.

Is there a workaround?

joshgoebel commented 5 months ago

A few notes:


But this is trivial to implement outside the library... if you want to do this, just write your own wrapper in place of highlightAll and use data attributes instead of class names:

<code data-language="C#">
 ...
</code>

Of course you'll need your own mapping table/function to convert C# into csharp... so your function loops over the elements, converts your "pretty names" back to actual aliases, then calls highlightElement on each one... done.
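A minimal sketch of that wrapper might look like the following. The names `PRETTY_NAMES`, `resolveLanguage` and `highlightByDataLanguage` are hypothetical and the mapping table is only an illustration; the only highlight.js calls used are `getLanguage` and `highlightElement`. Adding a `language-…` class before highlighting is one assumed way to tell highlight.js which grammar to use.

```javascript
// Hypothetical mapping from "pretty" names to highlight.js language names.
// Extend it with whatever names appear in your own markup.
const PRETTY_NAMES = new Map([
  ["c#", "csharp"],
  ["c++", "cpp"],
]);

// Convert a data-language value into a name highlight.js should know.
// Values missing from the table are passed through in lowercase.
function resolveLanguage(pretty) {
  const key = String(pretty).trim().toLowerCase();
  return PRETTY_NAMES.get(key) ?? key;
}

// Highlight every <code data-language="..."> element under `root`.
// `hljs` is passed in so the sketch does not rely on a global.
function highlightByDataLanguage(root, hljs) {
  for (const el of root.querySelectorAll("code[data-language]")) {
    const name = resolveLanguage(el.dataset.language);
    if (!hljs.getLanguage(name)) continue; // language not loaded: leave as-is
    el.classList.add(`language-${name}`); // a class name highlight.js can detect
    hljs.highlightElement(el);
  }
}
```

In the page from the original report, this would be called as `highlightByDataLanguage(document, hljs)` in place of `hljs.highlightAll()`.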

Flayms commented 5 months ago

I just ran into the same (or a very similar) problem. It doesn't have much to do with uppercase, though. Even

<code class="language-c#">
 ...
</code>

gets converted into

<code class="language-c# hljs language-c" data-highlighted="yes">
...
</code>

Registering neither 'c#' nor 'language-c#' as an alias solves the problem.

I looked through the code a bit, and 'language-* takes precedence over non-prefixed class names'. The regex designed for that:

languageDetectRe: /\blang(?:uage)?-([\w-]+)\b/i

uses \b, which matches at the word boundary just before the # of c#, so it detects language-c instead. This could be resolved by changing the regex into something like

/(?<=^|\s)lang(?:uage)?-([\w#-]+)(?=\s|$)/i

which would account for the # and would also delimit the class name by whitespace (or the start/end of the string) instead of by \b.
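The difference between the two patterns can be checked in isolation. The snippet below is only a sketch: `detect` is a hypothetical helper, and the two regexes are copied from this comment rather than taken from the library source.

```javascript
// Pattern quoted above as highlight.js's languageDetectRe.
const currentRe = /\blang(?:uage)?-([\w-]+)\b/i;
// Pattern proposed above (a suggestion, not part of highlight.js).
const proposedRe = /(?<=^|\s)lang(?:uage)?-([\w#-]+)(?=\s|$)/i;

// Return the captured language name, or null when the pattern does not match.
function detect(re, className) {
  const match = re.exec(className);
  return match ? match[1] : null;
}

const className = "language-c#";
console.log(detect(currentRe, className));  // prints: c
console.log(detect(proposedRe, className)); // prints: c#
```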

joshgoebel commented 5 months ago

Again, language-c# (upper or lowercase) is not a valid CSS class name, and hence is deliberately not supported.