tree-sitter-grammars / tree-sitter-markdown

Markdown grammar for tree-sitter
MIT License

Detect empty pipe_table_cell #112

Open mpasa opened 11 months ago

mpasa commented 11 months ago

I don't think it's part of the spec in any way for EXTENSION_PIPE_TABLE, at least as shown here, but I usually find myself using markdown tables with empty cells. I use treesitter to navigate them (jumping to the next pipe_table_cell), but these empty cells are not detected in any way. For example:

| a | b |
| - | - |
| 1 |   |
| 3 | 4 |

The tree for this table is:

  pipe_table [0, 0] - [4, 0]
    pipe_table_header [0, 0] - [0, 9]
      pipe_table_cell [0, 2] - [0, 4]
      pipe_table_cell [0, 6] - [0, 8]
    pipe_table_delimiter_row [1, 0] - [1, 9]
      pipe_table_delimiter_cell [1, 2] - [1, 3]
      pipe_table_delimiter_cell [1, 6] - [1, 7]
    pipe_table_row [2, 0] - [2, 9]
      pipe_table_cell [2, 2] - [2, 4]
    pipe_table_row [3, 0] - [3, 9]
      pipe_table_cell [3, 2] - [3, 4]
      pipe_table_cell [3, 6] - [3, 8]

As you can see, when jumping between pipe_table_cell nodes, we would jump directly from cell 1 to cell 3, skipping the empty cell in between. Is it possible to detect empty cells without breaking any of the standards?
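To make the navigation problem concrete, here is a minimal sketch in Python. It does not use the tree-sitter API; it only takes the start positions of the pipe_table_cell nodes from the tree above and looks up the next one after the cursor. The names `CELL_STARTS` and `next_cell` are made up for this illustration.

```python
from bisect import bisect_right

# (row, column) start positions of every pipe_table_cell node
# in the tree printed above, in document order.
CELL_STARTS = [(0, 2), (0, 6), (2, 2), (3, 2), (3, 6)]


def next_cell(cell_starts, cursor):
    """Return the start of the first cell strictly after `cursor`, or None."""
    index = bisect_right(cell_starts, cursor)
    return cell_starts[index] if index < len(cell_starts) else None


# With the cursor on the cell containing "1", the next jump lands on row 3.
print(next_cell(CELL_STARTS, (2, 2)))  # (3, 2)
```

Because the empty cell in row 2 has no node, there is no (2, 6) entry in the list, so the jump goes straight to the next row.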

MDeiml commented 10 months ago

I agree, the output should contain a node even for empty cells. The problem is that nodes in treesitter cannot be empty (unless they're parsed by the "external scanner"). Cell nodes at the moment do not contain the | delimiter, which I think makes sense, so they are indeed empty sometimes (or only contain whitespace).

I think this is quite easily fixable by moving the cell node to the external scanner. I'll try to do that (but don't expect a solution too soon, I'm a bit busy at the moment).

mpasa commented 10 months ago

Much appreciated! Thanks for your work

MDeiml commented 7 months ago

As mentioned in the PR, there is still an issue with table cells that are completely empty, but cells that contain whitespace now get marked.
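The difference between the two cases can be sketched in plain Python. This is only an illustration and not how the grammar or its external scanner works: the function name `classify_cells` is hypothetical, it reports the raw span between two pipes (the grammar's own node ranges differ), and it assumes a single table row with leading and trailing pipes, treating a pipe preceded by a backslash as escaped.

```python
def classify_cells(row):
    """Return (start, end, kind) for each span between consecutive pipes.

    kind is "text", "whitespace" (only spaces or tabs between the pipes),
    or "empty" (zero-width, the two pipes are adjacent).
    """
    # Columns of unescaped pipe delimiters (simplified escape handling).
    pipes = [
        i for i, ch in enumerate(row)
        if ch == "|" and (i == 0 or row[i - 1] != "\\")
    ]
    cells = []
    for left, right in zip(pipes, pipes[1:]):
        start, end = left + 1, right
        if start == end:
            kind = "empty"
        elif row[start:end].strip() == "":
            kind = "whitespace"
        else:
            kind = "text"
        cells.append((start, end, kind))
    return cells


print(classify_cells("| 1 |   |"))  # [(1, 4, 'text'), (5, 8, 'whitespace')]
print(classify_cells("| 1 ||"))     # [(1, 4, 'text'), (5, 5, 'empty')]
```

The second row shows the remaining case: the span (5, 5) has zero width, so there is no character a node could cover.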