orbitalquark / scintillua

Scintillua enables Scintilla lexers to be written in Lua, particularly using LPeg. It can also be used as a standalone Lua library for syntax highlighting support.
https://orbitalquark.github.io/scintillua
MIT License

Markdown lexer: two blockcode errors #93

Open Disonantemus opened 1 year ago

Disonantemus commented 1 year ago

Fenced code blocks with tildes (CommonMark spec) are not supported

I fixed this (I'm not a programmer and I don't know how to open a PR) by adding this line after the local code_block = line:

local code_block_tilde = lexer.range(lexer.starts_line('~~~'), '\n~~~' * hspace^0 * ('\n' + P(-1)))

Then I changed this line to include the tilde block type:

lex:add_rule('block_code', token('code', code_line + code_block + code_block_tilde + code_inline))
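
The code_block_tilde pattern above can be tried outside the lexer. The following is a minimal sketch, assuming plain LPeg is installed. The local range function is a hypothetical stand-in that approximates lexer.range() (opening delimiter, anything up to the closing delimiter, then an optional closing delimiter), and lexer.starts_line() is left out because the match is anchored at the start of the input.

```lua
local lpeg = require('lpeg')
local P, S = lpeg.P, lpeg.S

local hspace = S(' \t')

-- Closing fence: newline, three tildes, optional trailing blanks,
-- then either a newline or the end of input.
local close = '\n~~~' * hspace^0 * ('\n' + P(-1))

-- Hypothetical stand-in for lexer.range() (an assumption, not the real function):
-- opener, any characters that do not start the closer, then an optional closer.
local function range(s, e)
  s, e = P(s), P(e)
  return s * (P(1) - e)^0 * e^-1
end

local code_block_tilde = range('~~~', close)

-- A terminated block: the match stops right after the closing fence line.
local text = '~~~\nlocal x = 1\n~~~\nafter'
local stop = lpeg.match(code_block_tilde, text)
local block = text:sub(1, stop - 1)

-- An unterminated block: the optional closer lets the match run to the end.
local open_text = '~~~\nno closing fence'
local open_stop = lpeg.match(code_block_tilde, open_text)

print(block)
print(open_stop == #open_text + 1)
```

In this sketch the first match covers the opening fence, the code line, and the closing fence, while the text after the fence is left for other rules.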

Indented (non-fenced) code block error (edge case with list markers inside)

Markdown example (indent with a tab or 4 spaces to get an indented code block):

    This
    * Whole
    + Paragraph
    - Should be
    a Blockquote

But only the first and last lines are highlighted as a code block; I think the other lines are recognized as list items.

I've tried to fix this, but it's tied to the list rules, and fixing one breaks the other.


Tested expected behavior with GitHub and:

orbitalquark commented 1 year ago

Thanks for the report. I'll look into this when I have some time.

orbitalquark commented 1 year ago

I've implemented ~~~ code blocks here: https://github.com/orbitalquark/scintillua/commit/fc89283092d6f42f16674b6e165d8c5104acb662

As for the second issue you pointed out, there are two problems:

  1. The issue you pointed out, with only the first and last lines being recognized as code blocks.
  2. If you remove the "This" line, the last line should be recognized as a continuation of the list item on the previous line. Instead, it's recognized as a code block.

I spent some time investigating and trying things out, but I cannot figure out how to solve this. Markdown is very weird. It's possible another lexer rewrite will be needed to correct this, but I honestly don't know how I'd approach that. I'm afraid I'll have to leave this unresolved for now :(

A workaround for your case is to wrap it all in a fenced code block (backticks or tildes). I don't have a workaround for the second case I brought up.

rgieseke commented 1 year ago

FWIW, there is a Lua Markdown parser implementation by John MacFarlane of Pandoc/CommonMark. Maybe it could be adapted as a Textadept lexer.

https://github.com/jgm/lunamark/blob/master/lunamark/reader/markdown.lua