highlightjs / highlight.js

JavaScript syntax highlighter with language auto-detection and zero dependencies.
https://highlightjs.org/
BSD 3-Clause "New" or "Revised" License

(ini) Unquoted string value with embedded period incorrectly highlighted #4038

Open ghannington-Rocket opened 2 months ago

ghannington-Rocket commented 2 months ago

In an ini code block, unquoted string values with embedded periods are incorrectly highlighted.

I'm explicitly selecting the ini language in highlight.js, not relying on auto-detection.

Example code:

UNQUOTED_STRING_WITH_EMBEDDED_PERIOD=TLSv1.2

Screenshot from the highlight.js demo (sorry, not a fiddle):

[Screenshot: highlight.js demo rendering of the example above]

Expected behavior

For comparison, here's a screenshot from the prism.js "test drive" using ini highlighting, with "correct" (expected, desired) highlighting:

[Screenshot: prism.js "test drive" rendering of the same example]

(In case you're wondering, "Why don't you use prism.js, then?" The docs framework I'm using currently uses highlight.js. I'm not in control of that docs framework. I like highlight.js just fine. Thank you!)

Analysis of highlight.js-generated HTML

Given the string value TLSv1.2, I can see that highlight.js groups the leading characters TLSv1. with the preceding equals sign (=), and identifies the trailing 2 as a number:

<code>
  <span class="hljs-attr">UNQUOTED_STRING_WITH_EMBEDDED_PERIOD</span>
  =TLSv1.
  <span class="hljs-number">2</span>
</code>
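One plausible explanation for this split, offered as a sketch rather than a confirmed diagnosis: if the ini grammar's number rule requires digits to start at a word boundary (as highlight.js's generic number pattern appears to), then the `1` in `TLSv1` cannot start a number because it directly follows a letter, while the `2` after the period can. The pattern and the helper name `findNumberTokens` below are assumptions for illustration, not code taken from the ini language module.

```javascript
// Assumption: the number rule behaves like "digits starting at a word
// boundary, with an optional fractional part". This is a stand-in pattern,
// not a copy of the ini grammar.
const NUMBER_LIKE = /\b\d+(\.\d+)?/g;

// Hypothetical helper: list every number-like match and where it starts.
function findNumberTokens(text) {
  return [...text.matchAll(NUMBER_LIKE)].map((m) => ({
    text: m[0],
    index: m.index,
  }));
}

// "TLSv1.2": no boundary between "v" and "1", but there is one between
// "." and "2", so only the trailing "2" matches.
const unquoted = findNumberTokens("TLSv1.2");

// "ABC123": no boundary between "C" and "1", so nothing matches.
const plain = findNumberTokens("ABC123");

console.log(unquoted);
console.log(plain);
```

Under that assumption, only the trailing `2` in `TLSv1.2` is picked out as a number, and a value with no period, such as `ABC123`, produces no number token at all.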

ini, not TOML

When I write "incorrectly highlighted", I acknowledge that I'm referring to the context of a specific ini dialect.

I understand that:

Still, I thought it was worth asking: is it possible to "fix" this "issue" - tweak the highlight.js ini language module - for this particular case, without breaking, or terribly overcomplicating, the "primary" TOML highlighting use case?
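To make the desired behavior concrete, here is a minimal sketch of the tokenization being asked for: everything after the equals sign is treated as one value, whether or not it is quoted. The function `classifyIniLine`, the `INI_LINE` pattern, and the `kind` labels are hypothetical and are not part of highlight.js; a real fix would have to be expressed in the ini language module's own grammar and must not disturb how TOML numbers, booleans, and dates are highlighted.

```javascript
// Hypothetical helper, not part of highlight.js: split one "key=value" line
// into key, separator, and value, keeping the whole value as a single token.
const INI_LINE = /^\s*([^=\s][^=]*?)\s*(=)\s*(.*?)\s*$/;

function classifyIniLine(line) {
  const m = INI_LINE.exec(line);
  if (m === null) return null; // not a key=value line (comment, section, blank)
  const [, key, sep, value] = m;
  // A value wrapped in matching single or double quotes counts as quoted.
  const quoted = /^(".*"|'.*')$/.test(value);
  return { key, sep, value, kind: quoted ? "quoted-string" : "unquoted-string" };
}

const unquotedLine = classifyIniLine("UNQUOTED_STRING_WITH_EMBEDDED_PERIOD=TLSv1.2");
const quotedLine = classifyIniLine('QUOTED_STRING_WITH_EMBEDDED_PERIOD="TLSv1.2"');

console.log(unquotedLine);
console.log(quotedLine);
```

In both cases the value comes back whole (`TLSv1.2` or `"TLSv1.2"`) and separate from the `=`, which is the distinction the expected highlighting relies on.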

ghannington-Rocket commented 2 months ago

I acknowledge that, when you enclose the string value in quotes, as required by TOML:

QUOTED_STRING_WITH_EMBEDDED_PERIOD="TLSv1.2"

then highlight.js works perfectly (demo screenshot):

[Screenshot: highlight.js demo rendering of the quoted example]

The corresponding highlight.js-generated HTML correctly distinguishes the entire string value from the preceding equals sign:

<code>
  <span class="hljs-attr">QUOTED_STRING_WITH_EMBEDDED_PERIOD</span>
  =
  <span class="hljs-string">"TLSv1.2"</span>
</code>

I suspect that the developer of the "TOML, also INI" language module might "invite" 🙂 me to develop a custom language module for "my" ini dialect.

My problem: I suspect that it might be a challenge convincing the owners of the docs framework that I'm using to integrate any such "home-grown" language module. I think they'd be far more amenable to simply upgrading to a more recent version of the bundled ini language module.

ghannington-Rocket commented 2 months ago

I should also acknowledge, before anyone else points it out, that this "issue" isn't, or isn't only, about embedded periods in unquoted string values; it's about unquoted string values, er, period. 🙂

It's just that an embedded period in an unquoted string value draws attention to (highlights! 🙂) the fact that highlight.js has issues distinguishing any unquoted string value from the preceding equals sign.

For example:

UNQUOTED_STRING=ABC123
QUOTED_STRING="ABC123"

Demo rendering:

[Screenshot: highlight.js demo rendering of the two lines above]

In this case, the entire unquoted string value is not distinguished from the preceding equals sign.

That's still an issue, but the effect is, perhaps arguably, less bad than the "mixed highlighting" of an unquoted string value that contains an embedded period.