vim-pandoc / vim-pandoc-syntax

pandoc markdown syntax, to be installed alongside vim-pandoc
MIT License

Code highlighting fails in combination with reference link #299

Open SuperFluffy opened 4 years ago

SuperFluffy commented 4 years ago

The following example fails to highlight correctly. It seems somehow related to having verbatim text within a code block:

This is a link text: [`some link text`]. This is a another inline code block: `fn f(argument) -> output`.
Pandoc highlights all of it.

[`some link text`]: www.website.com

This is the same link text: [`some other text`]. This is a another inline code block, but with a linebreak:
 `fn f(argument) -> output`. Pandoc highlights it correctly.

[`some other text`]: www.othersite.com

Vim renders it using the latest vim-pandoc-syntax like in the image below. As you can see, the first paragraph is entirely highlighted, starting from the link. If I introduce a linebreak before the next code block, it renders correctly.

It can also be fixed by inserting the URL inline, i.e. changing [`some link text`] to [`some link text`](www.website.com). So the problem also seems to be related to reference links.

[Screenshot: 2019-10-29-112732_2560x1440_scrot]

SuperFluffy commented 4 years ago

In general, there seems to be an issue with code highlighting of inline code blocks within links. For example, type the following into a new vim buffer (with :set filetype=pandoc) to see a difference:

<!-- `World` will not be highlighted -->
Hello [`World`]

<!-- `World` will be highlighted just by placing a punctuation mark after the link -->
Hello [`World`].

Note how GitHub's parser (in the example right above) seems to do the right thing.

joelostblom commented 4 years ago

I encountered this today. In addition to punctuation marks, a space followed by a left parenthesis, a left bracket, or a backslash will properly highlight the reference link. See screenshot below.

[Screenshot]

This is on nvim 0.4.3, with vim-pandoc and vim-pandoc-syntax updated today, and with a clean ~/.config/nvim/init.vim containing only:

call plug#begin()
Plug 'vim-pandoc/vim-pandoc'
Plug 'vim-pandoc/vim-pandoc-syntax'
call plug#end()

sadid commented 4 years ago

I think this also affects footnotes. Here is an example:

Here is a [^Broken] footnote due to space after and here is [^another ft].

[^Broken]: The syntax doesn't work
[^another ft]: This works good bcz of "."

And the result: [Screenshot: 2020-03-01-221224_936x132_scrot]

bpj commented 4 years ago

> In general, there seems to be an issue with code highlighting of inline code blocks within links.

Yes there is. IMHO the best solution is probably not to highlight links containing code. If links are implemented as a match rather than as a region it might be possible to add the backtick in a negative char class or negative lookahead. Definitely better than what we have now!
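
A rough sketch of the negative character class idea, using matchstr() as a stand-in for a syntax match. The pattern label_pat is hypothetical and not taken from vim-pandoc-syntax; it ignores escaped brackets, images and footnotes, so it only shows the principle of refusing labels that contain a backtick:

```vim
" Hypothetical pattern, not taken from vim-pandoc-syntax: a bracketed label
" whose text contains neither a backtick nor a closing bracket.
let label_pat = '\[[^`\]]\+\]'

" A plain label is matched.
echo matchstr('Hello [World]', label_pat)
" A label containing inline code is not matched; matchstr() returns ''.
echo matchstr('Hello [`World`]', label_pat)
```

With a pattern of this shape, a link whose text is inline code would simply not be treated as a label, leaving the code highlighting to handle it.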

tnyeanderson commented 1 year ago

I ran into this issue and started looking into it. It looks like the skip matcher is attempting to handle examples 524 and 536 of the CommonMark spec, which demonstrate that inline code should take precedence over link definitions. Unfortunately, it does not match properly because it does not account for code that is completely contained within a link text. Here is the breakdown of the skip matcher:

" THIS MEANS SKIP EITHER:
"   1) Text that contains ]] but don't capture the second ]
"   2) Text contained within backticks that includes an unescaped ]
" skip:  /\(\\\@<!\]\]\@=\|`.*\\\@<!].*`\)/
"         ||||||||||||||||||||||||||||||^^---end capture group
"         |||||||||||||||||||||||||||||^---literal backtick
"         |||||||||||||||||||||||||||^^---wildcard
"         ||||||||||||||||||||^^^^^^^---match unescaped ]
"         ||||||||||||||||||^^---wildcard
"         |||||||||||||||||^---literal backtick
"         |||||||||||||||^^---OR
"         ||||||||||^^^^^---match literal ] but don't include it in the capture
"         ||^^^^^^^^---match unescaped ]
"         ^^---begin capture group

The relevant portion of the regex:

`.*\\\@<!].*`
||||||||||||^---literal backtick
||||||||||^^---wildcard
|||^^^^^^^---match unescaped ]
|^^---wildcard
^---literal backtick

Here is the line where it is used:

syn region pandocReferenceLabel matchgroup=pandocOperator start=/!\{,1}\\\@<!\^\@<!\[/ skip=/\(\\\@<!\]\]\@=\|`.*\\\@<!].*`\)/ end=/\\\@<!\]/ keepend display
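
One way to see the over-match outside the syntax engine is to run the second skip alternative through matchstr() on the failing line from the first comment. This is only a sketch: matchstr() is not how the region is evaluated, and the variable names text and code_skip are made up for illustration.

```vim
" The failing line from the first comment, as a single-quoted Vim string.
let text = 'This is a link text: [`some link text`]. This is a another inline code block: `fn f(argument) -> output`.'
" Second alternative of the skip pattern, copied from the region definition.
let code_skip = '`.*\\\@<!].*`'
echo matchstr(text, code_skip)
" Echoes: `some link text`]. This is a another inline code block: `fn f(argument) -> output`
```

The greedy wildcards pair the first backtick of the link text with the last backtick of the second code span, so the closing ] of the link falls inside the skipped text. That would be consistent with the label running past its closing bracket, and with the linebreak workaround, since . does not match across lines.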

I've tried some quick experiments (non-greedy wildcards and adding a negative lookbehind for a second backtick), but these didn't work on my first attempt and I had to stop shaving yaks and continue with what I was actually working on. Will try to get a PR for a fix in soon.

notnotrandom commented 1 year ago

Although it is not standards compliant, a workaround that fixed this issue for me in practice is to delete the skip= part of the pandocReferenceLabel line, i.e., to use that line as follows:

syn region pandocReferenceLabel matchgroup=pandocOperator start=/!\{,1}\\\@<!\^\@<!\[/ end=/\\\@<!\]/ keepend display
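
If editing the plugin source is undesirable, the same line could probably be applied from a user override file instead. This is an untested sketch: the file location is the usual after-directory convention, and it assumes pandocReferenceLabel consists of just this one region, since :syn clear removes every item in the group. Redefining the region last may also change its priority relative to other syntax items.

```vim
" Hypothetical file: ~/.config/nvim/after/syntax/pandoc.vim
" (~/.vim/after/syntax/pandoc.vim for Vim).
" Assumption: pandocReferenceLabel has only this single region definition.
" Drop the plugin's definition, then redefine it without the skip= part.
syn clear pandocReferenceLabel
syn region pandocReferenceLabel matchgroup=pandocOperator start=/!\{,1}\\\@<!\^\@<!\[/ end=/\\\@<!\]/ keepend display
```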

Hope that might be of help, while we wait for a proper fix...