tree-sitter-grammars / tree-sitter-markdown

Markdown grammar for tree-sitter
MIT License

Code block not recognized as one if there is trailing space on the closing fence #135

Open kirawi opened 4 months ago

kirawi commented 4 months ago

Describe the bug

Ref https://github.com/helix-editor/helix/issues/9678

Code example

'''c
''' 
int i = 0;

Replace the single quotes with backticks, and int i = 0; will be highlighted as C even though it follows the closing fence.

Expected behavior

According to https://github.github.com/gfm/#fenced-code-blocks, trailing whitespace is ignored on the closing code fence.

Actual behavior

Closing fence is not recognized.

clason commented 4 months ago

For the record, this parser implements the CommonMark spec, with only some GFM extensions (that are optional but enabled by default). I would prefer to stay strict on this (and on similar "softenings" of restrictions that make parsing easier).

clason commented 4 months ago

But CommonMark also specifies that whitespace after a closing fence is to be ignored, so this should be fixed (here). PR welcome!

savetheclocktower commented 3 months ago

I can confirm that this also happens when the closing fence is followed immediately by EOF. That's an easier scenario to catch than trailing whitespace, so I might open a PR for that case specifically.

pokey commented 3 weeks ago

> I can confirm that this also happens when the closing fence is followed immediately by EOF. That's an easier scenario to catch than trailing whitespace, so I might open a PR for that case specifically.

If that's easy, a fix there would be awesome, because I think that's probably a more common scenario than trailing whitespace, though obviously both would be nice to have.

pokey commented 3 weeks ago

I would think both should be a fairly easy fix, no? I might be missing something, but it looks like we could just check for end of file or space in addition to \r and \n in https://github.com/tree-sitter-grammars/tree-sitter-markdown/blob/7fe453beacecf02c86f7736439f238f5bb8b5c9b/tree-sitter-markdown/src/scanner.c#L418
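To make the suggested check concrete, here is a minimal standalone sketch in C. It is not the code from scanner.c: the function name is_closing_fence and its parameters are hypothetical, and the NUL terminator stands in for end of file. It assumes the CommonMark rule that a closing fence may be indented by up to three spaces, must be at least as long as the opening fence, and may be followed only by spaces or tabs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not taken from scanner.c.
 * Returns true when `line` is a closing fence for a code block that was
 * opened with `open_len` copies of `fence_char` (a backtick or a tilde).
 * `line` holds a single line; it may end in "\n", "\r\n", or directly at
 * the NUL terminator, which models EOF in this sketch. */
static bool is_closing_fence(const char *line, char fence_char, size_t open_len) {
    size_t i = 0;

    /* Up to three spaces of indentation may precede the fence. */
    while (i < 3 && line[i] == ' ') {
        i++;
    }

    /* Count the run of fence characters; it must be at least as long
     * as the opening fence. */
    size_t run = 0;
    while (line[i] == fence_char) {
        i++;
        run++;
    }
    if (run < open_len) {
        return false;
    }

    /* Skip trailing spaces and tabs, the case reported in this issue. */
    while (line[i] == ' ' || line[i] == '\t') {
        i++;
    }

    /* Accept a line ending or end of input; anything else (such as an
     * info string) means this line is not a closing fence. */
    return line[i] == '\0' || line[i] == '\n' || line[i] == '\r';
}
```

With this sketch, a three-character fence followed by a space, by a tab, or by nothing at all is accepted, while a fence followed by any other text is rejected. The real scanner reads from a lexer one character at a time rather than from a string, so the equivalent change there would presumably use the lexer's own lookahead and end-of-input checks.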