jgm / skylighting

A Haskell syntax highlighting library with tokenizers derived from KDE syntax highlighting descriptions
189 stars 61 forks source link

Opening double quotes in sqlpostgresql code block are highlighted as comment #95

Closed bewuethr closed 4 years ago

bewuethr commented 4 years ago

Pandoc version: 2.9.2.1

To reproduce

Render a code block with syntax highlighting language sqlpostgresql with a double quoted string.

pandoc <<'EOF'
```sqlpostgresql
select 'foo' "bar";
```
EOF

Expected behaviour

Double quoted string is assigned class ot ("other"), as it does for sql:

<span class="ot">&quot;bar&quot;</span>

which renders like

sql double quotes

with default styling, or class st ("string"), as for sqlmysql:

<span class="st">&quot;bar&quot;</span>

which renders like

sqlmysql double quotes

Observed behaviour

The opening double quote is assigned class co ("comment"), the string and the closing double quote ot ("other"):

<span class="co">&quot;</span><span class="ot">bar&quot;</span>

which renders like

sqlpostgresql double quotes

with default settings, or more obviously incorrect, as

sqlpostgresql double quotes, my styling

with a styling where italics visibly change double quotes.

jgm commented 4 years ago

Moving to skylighting. Note that sql-postgresql.xml contains

        <DetectChar attribute="Comment" context="Identifier" char="&quot;"/>

Is this correct? If not, then this should be fixed upstream (in KDE syntax definitions).

bewuethr commented 4 years ago

The upstream syntax file is identical to what Skylighting uses, i.e., there nothing more recent in there than here, an indeed, also when using Kate, the highlighting is wrong:

image

The offending line

       <DetectChar attribute="Comment" context="Identifier" char="&quot;"/>

has been like this ever since it was added to the syntax highlighting repo in 2016 in this commit from a not further specified "upstream"; I'll see where and how I can report this to the KDE project then.

(I'm just assuming it's the offending line, but it matches what I observe.)

jgm commented 4 years ago

You could try writing to the author of the syntax definition - email is listed at the beginning of the xml file.

bewuethr commented 4 years ago

I'll try! The syntax file is ancient, it was initially added in 2002 to, back then, Kate (see history), and the line we think might be responsible for the incorrect quoting was added in 2003, so it's been like this for a long time.

Edit: actually, it looks like in this diff, the line

<DetectChar attribute="9" context="5" char="&quot;"/>

was incorrectly translated to

<DetectChar attribute="Comment" context="Identifier" char="&quot;"/>

The other lines with attribute="Comment" used to be attribute="8" everywhere else. I believe the number is an index into the <itemDatas> arrays, so attribute="9" should have become attribute="Identifier" instead of attribute="Comment". And context="5" has become context="Identifier" after all.

So, I'm guessing off-by-one error!

jgm commented 4 years ago

Makes sense. If you want to submit a PR here we can update it locally; but it would also be good to get it changed in KDE, since we periodically refresh from upstream and it's a pain if we get out of sync.

For an example of the kind of thing that would be needed, see skylighting-core/xml/bash.xml.patch. IF you just add a patch like this, produced by diff against the changes you would make to the xml file, it will be applied when e build.

bewuethr commented 4 years ago

Submitted as a KDE bug. I'll submit an upstream patch and a PR for Skylighting later this week.

bewuethr commented 4 years ago

KDE differential submitted.