stsewd / tree-sitter-rst

reStructuredText grammar for tree-sitter
https://stsewd.dev/tree-sitter-rst/
MIT License

detecting missing blank lines before section headers #47

Open keewis opened 10 months ago

keewis commented 10 months ago

At the moment, this:

paragraph

Section
-------
section content

is parsed to (document (paragraph) (section (title)) (paragraph)).

However, mistakenly omitting the blank line:

paragraph
Section
-------
section content

is parsed to (document (paragraph)), where the paragraph contains a node for each word and every single adornment character.

If we were to change the first line to a definition list:

term : classifier
    definition
Section
-------
section content

the parsed result would contain an error node:

(document (ERROR (list_item (term) (classifier) (definition (paragraph)))))

For comparison, with the blank line, the document parses as

(document (definition_list (list_item (term) (classifier) (definition (paragraph)))) (section (title)) (paragraph))

Would it be possible to always create an error node if there is a blank line missing before a section? If not, what kind of query would I need to use to detect something like this?

stsewd commented 10 months ago

> Would it be possible to always create an error node if there is a blank line missing before a section?

I may look into what docutils does for cases like this.

> If not, what kind of query would I need to use to detect something like this?

Since the text is parsed as a text paragraph, you'll need to manually check the text.
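
A minimal sketch of what such a manual check could look like in plain Python, working on the raw source text rather than on the parse tree. The helper names `is_adornment` and `find_missing_blank_lines` are hypothetical, and the heuristic is an assumption: it treats any run of one repeated punctuation character as an adornment and does not handle overlined titles.

```python
import string


def is_adornment(line):
    """True if the line is a run of one repeated punctuation character."""
    s = line.rstrip()
    return len(s) >= 2 and s[0] in string.punctuation and s == s[0] * len(s)


def find_missing_blank_lines(text):
    """Return 1-based line numbers of titles directly preceded by text."""
    lines = text.splitlines()
    hits = []
    for i in range(2, len(lines)):
        before, title, underline = lines[i - 2], lines[i - 1], lines[i]
        if not is_adornment(underline):
            continue
        # The title must be unindented, non-blank text.
        if not title.strip() or title[0].isspace() or is_adornment(title):
            continue
        # The underline must be at least as long as the title.
        if len(title.rstrip()) > len(underline.rstrip()):
            continue
        # A blank line or an overline before the title is accepted as-is.
        if not before.strip() or is_adornment(before):
            continue
        hits.append(i)  # i is the 1-based line number of the title
    return hits
```

Run on the examples from this issue, the function reports the title line in both the paragraph case and the definition list case, and reports nothing when the blank line is present. It may flag text that only looks like a title, so its results are candidates to review rather than definite errors.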

keewis commented 10 months ago

> I may look into what docutils does for cases like this.

docutils appears to do something similar: it emits a warning or error (I'm not sure how to interpret the output) if there's no blank line between a definition list and a section, but if a paragraph precedes the section, it concatenates the section lines onto that paragraph.

I guess the question now is: is there ever a case where you'd want to have something within a paragraph that resembles a section? If not, I'd argue that it is better to deviate from what docutils is doing and raise an error.