mkdocstrings / griffe

Signatures for entire Python programs. Extract the structure, the frame, the skeleton of your project, to generate API documentation or find breaking changes in your API.
https://mkdocstrings.github.io/griffe
ISC License

feature: Format/highlight expressions? #254

Open pawamoy opened 2 months ago

pawamoy commented 2 months ago

Is your feature request related to a problem? Please describe.

In mkdocstrings, to format code, highlight it, and cross-ref names, we have to dance a bit:

It feels super convoluted (back and forth between Python code and Jinja templating), and inefficient (Griffe parses the code with ast, then Black parses it again, then Pygments lexes it again...).

Furthermore, Pygments is "only" a lexer, which means it cannot distinguish between the name of a parameter and a name used as a value, and potentially other similar cases. As a result, we don't get CSS classes distinct enough to let users style their code with enough flexibility, based on semantics rather than tokenization.
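To illustrate the limitation, here is a minimal sketch that uses the standard library's `tokenize` module as a stand-in for a lexer (it is not Pygments, and Pygments has its own token types). The names `func`, `param` and `value` are made up for the example. At the token level, the keyword name and the value name come out with the same token type:

```python
import io
import tokenize

# Hypothetical call: `param` is a keyword name, `value` is a name used as a value.
source = "func(param=value)"

# Collect (token type, text) pairs for every NAME token in the source.
names = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.NAME
]

# All three identifiers share the same token type: a lexer alone cannot tell
# a parameter name from any other name.
print(names)
```

A tool working on a syntax tree does not have this problem, since the tree records which role each name plays.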

Describe the solution you'd like

I wonder if we couldn't pack all this (formatting + highlighting) directly in our expressions.

The highlighting part is easy: Griffe has already parsed the code as an expression, so we have all the semantics associated with each part of the expression. For example, in a function call, we know that the ExprName instances used for the names of keyword parameters are parameter names, and not just names, so we could easily wrap them in <span class="pn">...</span> or whatever.
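As a rough sketch of the idea, the snippet below does this with the standard library's `ast` module rather than Griffe's own expression classes, so it only approximates what an implementation in Griffe could look like. The helper name `highlight_call` is hypothetical, the `pn` class is the one suggested above, and `ast.unparse` requires Python 3.9 or later.

```python
import ast
from html import escape


def highlight_call(source: str) -> str:
    """Render a call expression as HTML, wrapping keyword names in a span."""
    call = ast.parse(source, mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError("expected a call expression")

    # Positional arguments are emitted as plain (escaped) text.
    parts = [escape(ast.unparse(arg)) for arg in call.args]

    for keyword in call.keywords:
        value = escape(ast.unparse(keyword.value))
        if keyword.arg is None:
            # `**mapping` unpacking carries no parameter name.
            parts.append(f"**{value}")
        else:
            # The tree tells us this identifier is a parameter name.
            parts.append(f'<span class="pn">{escape(keyword.arg)}</span>={value}')

    func = escape(ast.unparse(call.func))
    return f"{func}({', '.join(parts)})"


html = highlight_call("func(1, param=value)")
print(html)  # func(1, <span class="pn">param</span>=value)
```

Only the keyword name is wrapped here; a real highlighter would presumably assign classes to every node kind, which is the easy part since the semantics are already known.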

The formatting part, however, is probably much more complex. I don't pretend to be capable of writing something as good and efficient as Black or Ruff. But maybe there's a way? For now, Griffe expressions only handle single statements (type annotations, signatures, assignments). What happens when we start supporting arbitrary code in expressions?

Describe alternatives you've considered

We could consider never formatting code ourselves, and instead use a CST (concrete syntax tree) rather than an AST, to preserve the existing formatting. Users would format the code themselves (there are plenty of tools to do that), and we would just keep the same formatting. However, with Griffe's ability to transform expressions (for example expression modernization and Annotated unwrapping), we will have to update the spacing anyway.

Maybe a combination of both would be enough? If we preserve the original formatting, the formatting complexity is drastically reduced, so we could naively format only the transformed parts ourselves.
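One way to picture this combination, as a sketch only: keep the original source text, use the node offsets from the standard library's `ast` module to locate the part being transformed, and splice in a naively formatted replacement. This is not how Griffe works today; the helper name `modernize_optional` is hypothetical, and the sketch assumes a single-line ASCII source with no nested `Optional[...]`.

```python
import ast


def modernize_optional(line: str) -> str:
    """Rewrite `Optional[X]` as `X | None`, leaving the rest of the line untouched."""
    tree = ast.parse(line)

    # Collect (start, end, replacement) spans for each `Optional[...]` node.
    edits = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Subscript)
            and isinstance(node.value, ast.Name)
            and node.value.id == "Optional"
        ):
            # Naive formatting applies to the transformed part only.
            inner = ast.unparse(node.slice)
            edits.append((node.col_offset, node.end_col_offset, f"{inner} | None"))

    # Apply the edits from right to left so earlier offsets stay valid.
    for start, end, text in sorted(edits, reverse=True):
        line = line[:start] + text + line[end:]
    return line


original = "x:  Optional[ int ]   =  None   # keep me"
result = modernize_optional(original)
print(result)  # x:  int | None   =  None   # keep me
```

The unusual spacing and the trailing comment survive because they are never re-generated; only the span covered by the transformed node is rewritten.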

Benefit of retaining the original formatting: we respect the user's choices (comments enabling/disabling formatting, for example).

Additional context