pallets / jinja

A very fast and expressive template engine.
https://jinja.palletsprojects.com
BSD 3-Clause "New" or "Revised" License

Option to preserve comments in the AST #1967

Open pawamoy opened 3 months ago

pawamoy commented 3 months ago

Hello, thanks for this wonderful piece of software :slightly_smiling_face:

I'm maintaining mkdocstrings, which uses Jinja templates to render API documentation for different languages. Recently I had this idea of doing the same thing (collecting and rendering API docs) for... Jinja templates themselves :stuck_out_tongue: It's just an idea for now, but I'll be tracking my progress here: https://github.com/mkdocstrings/mkdocstrings/issues/661.

I see that Jinja provides the Environment.parse method, which builds an AST of the Jinja source: this is fantastic. I'll be able to visit such trees to build useful information and store it in structures that make it easy to render them in other Jinja templates.
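For instance, a minimal sketch of the kind of visiting I have in mind, assuming only the public `Environment.parse` and `Node.find_all` APIs. The template source and the `macros`/`blocks` names are made up for illustration:

```python
from jinja2 import Environment, nodes

# Hypothetical template source, for illustration only.
source = """\
{% macro badge(label, color="blue") %}<span class="{{ color }}">{{ label }}</span>{% endmacro %}
{% block content %}{{ title }}{% endblock %}
"""

env = Environment()
ast = env.parse(source)  # returns a nodes.Template

# Collect macro names with their parameter names, and block names.
macros = [(m.name, [arg.name for arg in m.args]) for m in ast.find_all(nodes.Macro)]
blocks = [b.name for b in ast.find_all(nodes.Block)]

print(macros)
print(blocks)
```

Structures like these could then be passed to other templates for rendering.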

However... Jinja comments are not preserved :cry: This comment (https://github.com/pallets/jinja/issues/719#issuecomment-303933811) says that they're probably discarded by the lexer.
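A quick way to see this, as far as I can tell: `Environment.lex` yields the raw `(lineno, token_type, value)` tuples before any filtering, and comment tokens are still there, while the tree returned by `Environment.parse` has no trace of them. A small sketch (the one-line template is just an example):

```python
from jinja2 import Environment

# Example template: one block comment followed by one expression.
source = "{# Admonitions template. #}{{ title }}"
env = Environment()

# Raw token stream: comment tokens are present at this stage.
raw_comments = [value for _, kind, value in env.lex(source) if kind == "comment"]

# Parsed tree: only the Output node for {{ title }} remains.
top_level = [type(node).__name__ for node in env.parse(source).body]

print(raw_comments)
print(top_level)
```

So the comment text is available at the lexing stage and only gets dropped before the parser sees the tokens.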

I can already extract lots of useful information from templates, but I had in mind to allow template writers to use Jinja comments to document variables, filters, blocks, macros, whatever, directly in the template, just like we document attributes, functions, classes, etc., directly within Python code.

For this I would need an option to tell Environment.parse to preserve the comments in the generated AST. Do you think that is feasible? I can definitely free up some of my time to work on this.

In any case, what do you think of this idea of auto-documentation for Jinja templates? I think that would be a wonderful tool in the Jinja ecosystem :smile: I personally have lots of Jinja templates that I would document this way :slightly_smiling_face:

pawamoy commented 3 months ago

I'm happy to maintain a patched lexer/parser in an external package if you don't feel comfortable implementing this option in core. I'd greatly appreciate some guidance on how to change the lexer/parser to preserve comments :slightly_smiling_face:

I suppose I must somehow stop ignoring comments in the lexer, and also create a new Comment node that could be added to the AST.

pawamoy commented 3 months ago

OK I managed to get something working :slightly_smiling_face:

Basically:

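# in lexer.py: stop ignoring block comment tokens
ignored_tokens = frozenset(
    [
        # TOKEN_COMMENT_BEGIN,
        # TOKEN_COMMENT,
        # TOKEN_COMMENT_END,
        TOKEN_WHITESPACE,
        TOKEN_LINECOMMENT_BEGIN,
        TOKEN_LINECOMMENT_END,
        TOKEN_LINECOMMENT,
    ]
)

# in nodes.py: new node type holding the comment text
class Comment(Stmt):
    """A template comment."""

    fields = ("data",)
    data: str

# in Parser.subparse: new branch next to the existing token type checks
elif token.type == "comment_begin":
    flush_data()
    next(self.stream)  # skip the comment_begin token
    # the next token is the comment text itself
    body.append(nodes.Comment(next(self.stream).value))
    self.stream.expect("comment_end")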

Printing the AST gives something like:

Template(body=[Comment(data=' Admonitions template. '), ...])