jgm / skylighting

A Haskell syntax highlighting library with tokenizers derived from KDE syntax highlighting descriptions

A python lone string is highlighted as a comment #179

Closed has2k1 closed 7 months ago

has2k1 commented 7 months ago

Explain the problem. When a lone Python string is highlighted, it is tagged as a comment, i.e. class="co".

For comparison, the same string is also valid in R and Ruby, and for those languages the highlight class is correct, i.e. class="st".

To reproduce, run Pandoc with syntax highlighting enabled on this input:

`"string"`{.py}

`"string"`{.r}

`"string"`{.rb}

Output

<p><code
class="sourceCode python"><span class="co">&quot;string&quot;</span></code></p>
<p><code
class="sourceCode r"><span class="st">&quot;string&quot;</span></code></p>
<p><code
class="sourceCode ruby"><span class="st">&quot;string&quot;</span></code></p>

Pandoc version? 3.1.9

jgm commented 7 months ago

I'm seeing the same behavior in the Kate editor. (Of course, in x = "hello" it's marked as a string.)

So, this may be intentional; at any rate, our skylighting library (jgm/skylighting) is correctly interpreting the KDE syntax definition.

If you like, you can investigate the python.xml syntax definition and/or submit a bug report to KDE's syntax highlighting framework.

https://github.com/KDE/syntax-highlighting/tree/master/data/syntax

jgm commented 7 months ago

I've transferred this to skylighting, which is the library that does the highlighting. But I think it should be closed, as the issue is upstream.

has2k1 commented 7 months ago

While it differs from the convention, this is deliberate.

A lone string is taken to be the docstring of an otherwise empty module, and Kate intentionally marks docstrings as comments, the same as # comments. This can be seen in the sample test file.
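The docstring reading matches how Python itself treats such a string. The sketch below uses the standard library's `ast` module to show this; `module_docstring` is a hypothetical helper name for illustration, and it demonstrates Python's own behaviour, not how Kate or skylighting apply their rules.

```python
import ast


def module_docstring(source: str):
    """Return the docstring Python assigns to a module with this source, or None."""
    return ast.get_docstring(ast.parse(source))


# A lone string is the first statement of the module, so it is the docstring.
print(module_docstring('"string"'))     # string

# A string on the right-hand side of an assignment is not a docstring.
print(module_docstring('x = "hello"'))  # None
```

This lines up with the two cases above: the lone string is tagged class="co", while the string in x = "hello" is marked as a string.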

jgm commented 7 months ago

Ah yes, of course, I should have thought of that.