tree-sitter-grammars / tree-sitter-markdown

Markdown grammar for tree-sitter
MIT License

Shortcut Link and Code Span Edge Case #28

Closed by mtoohey31 1 year ago

mtoohey31 commented 2 years ago

Hello! I ran into the following edge case with a combination of shortcut_link and code_span:

- `x[0]` is equivalent to `*x`

...is parsed as:

  list_item [5, 0] - [7, 0]
    list_marker_minus [5, 0] - [5, 2]
    paragraph [5, 2] - [6, 0]
      shortcut_link [5, 4] - [5, 7]
        link_text [5, 5] - [5, 6]
      code_span [5, 7] - [5, 27]
        code_span_delimiter [5, 7] - [5, 8]
        code_span_delimiter [5, 26] - [5, 27]

...when it should be:

  list_item [5, 0] - [7, 0]
    list_marker_minus [5, 0] - [5, 2]
    paragraph [5, 2] - [6, 0]
      code_span [5, 2] - [5, 8]
        code_span_delimiter [5, 2] - [5, 3]
        code_span_delimiter [5, 7] - [5, 8]

The smallest example I could come up with is:

`[a]`b`*c`

...which is parsed as:

paragraph [7, 0] - [8, 0]
  shortcut_link [7, 1] - [7, 4]
    link_text [7, 2] - [7, 3]
  code_span [7, 4] - [7, 7]
    code_span_delimiter [7, 4] - [7, 5]
    code_span_delimiter [7, 6] - [7, 7]

...instead of:

paragraph [7, 0] - [8, 0]
  code_span [7, 0] - [7, 5]
    code_span_delimiter [7, 0] - [7, 1]
    code_span_delimiter [7, 4] - [7, 5]
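
The expected trees follow from the CommonMark rule that code spans bind more tightly than link brackets: a code span opens at a run of backticks and closes at the next run of the same length, whatever `[` or `]` characters sit in between. Below is a minimal Python sketch of that matching rule, for illustration only. `find_code_spans` is a hypothetical helper, not part of tree-sitter-markdown, and it assumes a single line of input with no backslash-escaped backticks.

```python
import re


def find_code_spans(line: str) -> list[tuple[int, int]]:
    """Return (start, end) columns, end exclusive, of code spans in one line.

    Hypothetical helper: ignores backslash escapes and multi-line spans.
    """
    # Maximal runs of backticks; the greedy `+` makes each match a full run.
    runs = [(m.start(), m.end()) for m in re.finditer(r"`+", line)]
    spans = []
    i = 0
    while i < len(runs):
        start, open_end = runs[i]
        width = open_end - start
        # The closer is the nearest later run with the same number of backticks.
        closer = next(
            (j for j in range(i + 1, len(runs))
             if runs[j][1] - runs[j][0] == width),
            None,
        )
        if closer is None:
            i += 1  # unmatched opener: these backticks stay literal text
        else:
            spans.append((start, runs[closer][1]))
            i = closer + 1  # resume scanning after the closing run
    return spans


print(find_code_spans("`[a]`b`*c`"))                      # [(0, 5), (6, 10)]
print(find_code_spans("- `x[0]` is equivalent to `*x`"))  # [(2, 8), (26, 30)]
```

The first span in each result matches the columns of the expected `code_span` nodes above. The second span, covering the trailing `*c` or `*x`, is what the same rule implies for the rest of the line; it is not listed in the trees in this report.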

This is a very "edge case" kind of scenario, but I figured I should mention it.

PS: Thanks for your work on this. I was waiting for a tree-sitter Markdown grammar, and I tried to write one myself a few months ago but gave up because I couldn't figure it out 🤣, so I appreciate the effort this must have taken.

MDeiml commented 2 years ago

Thanks for reporting this. There are a lot of problems with nested constructs of this kind: code spans, emphasis, links, ...

I don't yet know how to solve this, since all of it relies on dynamic precedence, which is hard to get right, but I'll see what I can come up with.

MDeiml commented 1 year ago

This is fixed by #90.