Option to preserve comments in the AST

pawamoy commented 7 months ago

Hello, thanks for this wonderful piece of software :slightly_smiling_face:

I'm maintaining mkdocstrings, which uses Jinja templates to render API documentation for different languages. Recently I had this idea of doing the same thing (collecting and rendering API docs) for... Jinja templates themselves :stuck_out_tongue: It's just an idea for now, but I'll be tracking my progress here: https://github.com/mkdocstrings/mkdocstrings/issues/661.

I see that Jinja provides the Environment.parse method, which builds an AST of the Jinja source: this is fantastic. I'll be able to visit such trees to build useful information and store it in structures that make it easy to render them in other Jinja templates.
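As a rough illustration of that kind of tree visiting, the sketch below parses a small, made-up template and collects macro signatures and block names with `Node.find_all`. The template source and the variable names are assumptions for the example only.

```python
from jinja2 import Environment, nodes

env = Environment()

# Hypothetical template, used only for illustration.
source = (
    '{% macro admonition(kind, title) %}'
    '<div class="{{ kind }}">{{ title }}</div>'
    '{% endmacro %}\n'
    '{% block content %}{% endblock %}\n'
)

# Environment.parse returns a nodes.Template, the root of the AST.
tree = env.parse(source)

# Macro nodes expose their name and their parameters (as Name nodes).
macros = {
    macro.name: [arg.name for arg in macro.args]
    for macro in tree.find_all(nodes.Macro)
}

# Block nodes expose their name.
blocks = [block.name for block in tree.find_all(nodes.Block)]

print(macros)
print(blocks)
```

Structures like `macros` and `blocks` could then be passed to another template for rendering.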

However... Jinja comments are not preserved :cry: This comment (https://github.com/pallets/jinja/issues/719#issuecomment-303933811) says that they're probably discarded by the lexer.
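A minimal check of this behaviour, assuming a default `Environment`: the raw token stream from `Environment.lex` still contains the comment text, while the tree returned by `Environment.parse` does not. The sample template is made up.

```python
from jinja2 import Environment, nodes

env = Environment()

# Hypothetical template with a leading comment.
source = "{# Admonitions template. #}Hello {{ name }}!"

# Environment.lex yields (lineno, token_type, value) tuples for every raw token,
# including the comment tokens.
comment_texts = [
    value
    for _lineno, token_type, value in env.lex(source)
    if token_type == "comment"
]

# Environment.parse builds the AST; only the template data and the
# expression survive, the comment is gone.
tree = env.parse(source)
data_chunks = [node.data for node in tree.find_all(nodes.TemplateData)]

print(comment_texts)
print(data_chunks)
```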

I can already extract lots of useful information from templates, but I had in mind to allow template writers to use Jinja comments to document variables, filters, blocks, macros, whatever, directly in the template, just like we document attributes, functions, classes, etc., directly within Python code.
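Until comments are available in the AST, one possible stopgap (an assumption of this sketch, not something taken from the linked issue) is to read comments from `Environment.lex` and match them to nodes by line number. The sketch below attaches a comment to a macro only when the comment ends on the line directly above the macro tag. The template and the pairing rule are hypothetical.

```python
from jinja2 import Environment, nodes

env = Environment()

# Hypothetical template: a comment on the line directly above a macro.
source = (
    "{# Render a greeting for `name`. #}\n"
    "{% macro greet(name) %}Hello {{ name }}{% endmacro %}\n"
)

# Map the line on which each comment ends to its stripped text.
comment_by_end_line = {}
pending = None
for lineno, token_type, value in env.lex(source):
    if token_type == "comment":
        pending = value.strip()
    elif token_type == "comment_end":
        # Empty comments produce no "comment" token, hence the fallback.
        comment_by_end_line[lineno] = pending or ""
        pending = None

# Pair each macro with the comment that ends on the previous line, if any.
docs = {
    macro.name: comment_by_end_line.get(macro.lineno - 1)
    for macro in env.parse(source).find_all(nodes.Macro)
}

print(docs)
```

This scans the source twice and relies on a layout convention, which is part of why having comments directly in the AST would be cleaner.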

For this I would need an option to tell Environment.parse to preserve the comments in the generated AST. Do you think that is feasible? I can definitely free up some of my time to work on this.

In any case, what do you think of this idea of auto-documentation for Jinja templates? I think that would be a wonderful tool in the Jinja ecosystem :smile: I personally have lots of Jinja templates that I would document this way :slightly_smiling_face:

pawamoy commented 7 months ago

I'm happy to maintain a patched lexer/parser in an external package if you don't feel comfortable implementing this option in core. I'd greatly appreciate some guidance on how to change the lexer/parser to preserve comments :slightly_smiling_face:

I suppose I must somehow stop ignoring comments in the lexer, and also create a new Comment node that could be added to the AST.

pawamoy commented 7 months ago

OK I managed to get something working :slightly_smiling_face:

Basically:

- remove the comment tokens from the lexer's ignored set
- add a Comment node
- handle comment tokens in Parser.subparse

# in jinja2/lexer.py: stop dropping block comment tokens
ignored_tokens = frozenset(
    [
        # TOKEN_COMMENT_BEGIN,
        # TOKEN_COMMENT,
        # TOKEN_COMMENT_END,
        TOKEN_WHITESPACE,
        TOKEN_LINECOMMENT_BEGIN,
        TOKEN_LINECOMMENT_END,
        TOKEN_LINECOMMENT,
    ]
)

# in jinja2/nodes.py: new node holding the raw comment text
class Comment(Stmt):
    """A template comment."""

    fields = ("data",)
    data: str

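# in Parser.subparse (jinja2/parser.py)
elif token.type == "comment_begin":
    flush_data()
    next(self.stream)  # skip the comment_begin token
    # The lexer emits no "comment" token for an empty comment such as {##},
    # so fall back to an empty string in that case.
    data = ""
    if self.stream.current.type == "comment":
        data = next(self.stream).value
    body.append(nodes.Comment(data))
    self.stream.expect("comment_end")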

Printing the AST gives something like:

Template(body=[Comment(data=' Admonitions template. '), ...])

pawamoy commented 1 month ago

Bump :slightly_smiling_face: Would maintainers be interested in such functionality? I think it would allow nice developments in the Jinja ecosystem :slightly_smiling_face:

davidism commented 1 month ago

Yeah, I'm open to reviewing a PR for this, but I have very limited time to focus on new features for Jinja right now.

pawamoy commented 1 month ago

Fantastic :relaxed: No problem, I'll do my best to send the most reviewable PR and most easily maintained code :yum:

pallets / jinja

Option to preserve comments in the AST #1967