stsewd / tree-sitter-rst

reStructuredText grammar for tree-sitter
https://stsewd.dev/tree-sitter-rst/
MIT License

Incorrect highlighting of docstrings #22

Closed by 9seconds 2 years ago

9seconds commented 2 years ago

Hi!

As far as I understand (https://github.com/tree-sitter/tree-sitter-python/issues/137), Python docstrings are currently highlighted as RST. With the latest parser installed, I get this result:

def foo():
    """This is highlighted correctly"""

def bar():
    """Highlight here is
       broken
"""

def baz():
    """
    And here I have only quotes highlighted
    """
[Screenshot: CleanShot 2021-12-10 at 22 52 51@2x]

If I remove the compiled $HOME/.local/share/nvim/site/parser/rst.so, the syntax is highlighted correctly. If I reinstall it, the problem appears again.

stsewd commented 2 years ago

Hi. From the grammar's side, this is being parsed correctly. The extracted text takes the form:

Highlight here is
       broken

which is a definition list in rst. Since rst is sensitive to indentation, what we need to do (on the nvim-treesitter side) is narrow the range of the injection to the common indentation. Even with that, your text would still be parsed as a definition list. You can check this with dedent:

>>> import textwrap
>>> s = """Highlight here is
...        broken
... """
>>> textwrap.dedent(s)
'Highlight here is\n       broken\n'
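For comparison, here is a small sketch using the standard library's `inspect.cleandoc`, which strips the first line separately and then removes the common indentation from the remaining lines. It is only an illustration of why the first line matters here, not a description of what nvim-treesitter or the grammar does; the variable names are made up for the example.

```python
import inspect
import textwrap

# The docstring body from the report: the first line starts right after
# the opening quotes, the second line is indented.
doc = """Highlight here is
       broken
"""

# textwrap.dedent only removes indentation common to ALL non-blank lines.
# The first line has none, so nothing changes and the second line stays
# indented (which rst reads as a definition list).
dedented = textwrap.dedent(doc)

# inspect.cleandoc treats the first line on its own and dedents the rest.
cleaned = inspect.cleandoc(doc)

print(repr(dedented))
print(repr(cleaned))
```

With the first line handled separately, both lines end up at the same indentation level, so the text reads as a plain paragraph.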

Python recommends putting the first line below """ when you have a multiline docstring (and separating the first line from the rest with a blank line):

def bar():
    """
    Highlight here is
    broken
    """

# or

def bar():
    """
    Highlight here is broken

    More text here
    """

We could also check how Sphinx handles that. If this style is common in your codebase, you could write two injections: one that matches only the first line, and another that matches the remaining lines of the docstring.
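To make the two-injection idea concrete, here is a rough Python sketch of the split such a pair of injections would amount to: the first line on its own, and the remaining lines with their common indentation removed. `split_docstring` is a hypothetical helper written for this illustration. It is not part of nvim-treesitter, tree-sitter-rst, or Sphinx, and the real fix would be written as injection queries rather than Python.

```python
import textwrap


def split_docstring(text):
    """Hypothetical helper: split docstring text into (first_line, rest).

    The first line is stripped on its own; the remaining lines are
    dedented together, mirroring the "two injections" idea.
    """
    first, _, rest = text.partition("\n")
    # Dedent the remaining lines as a unit, then drop the blank lines
    # left at either end.
    rest = textwrap.dedent(rest).strip("\n")
    return first.strip(), rest


first, rest = split_docstring("Highlight here is\n       broken\n")
print(repr(first))
print(repr(rest))
```

Each part could then be handed to the rst parser separately, so the indentation of the second line is no longer measured against the first.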