Instagram / LibCST

A concrete syntax tree parser and serializer library for Python that preserves many aspects of Python's abstract syntax tree
https://libcst.readthedocs.io/
Other
1.56k stars 192 forks source link

Dump CST to .dot (graphviz) files #1147

Closed zaicruvoir1rominet closed 6 months ago

zaicruvoir1rominet commented 6 months ago

Summary

Currently, the way to "show" a LibCST's CST is to print it out, using libcst.tool.dump.
This is great when working with LibCST on a day to day basis.

However, there are currently no ways to "show" a CST to people not involved with LibCST.
For instance, in LibCST conferences, or when working on educational material, or when pitching LibCST, there is nothing to "display".

This PR adds a way to dump a CST in a .dot file (or stdout), which can be used in graphviz to "show" the CST.

Examples

When dumping the CST from fn(1, 2) - just like LibCST doc, with dump_graphviz: Image1

When dumping fn(1, 2) - without any filters: Image2

Note that this just produces the .dot files, Graphviz still needs to be called after to produce the visual graphs. I did not want to add a direct dependency between LibCST and Graphviz.

Changes

This PR adds the libcst.display module, with:

libcst.tool.dump has been updated to import libcst.display.text.dump (legacy reasons), and can also call "print-graphviz" if used from command line.

Still TODO

This is still a prototype, and I am not sure if you would consider it to be part LibCST, or a side library.
If this PR is OK to be part of LibCST, I will go on with code cleanup, tests and documentation.

codecov[bot] commented 6 months ago

Codecov Report

Attention: Patch coverage is 94.89051% with 7 lines in your changes are missing coverage. Please review.

Project coverage is 91.16%. Comparing base (efc53af) to head (b2293d1).

:exclamation: Current head b2293d1 differs from pull request most recent head b879a61. Consider uploading reports for the commit b879a61 to get more accurate results

Files Patch % Lines
libcst/tool.py 50.00% 4 Missing :warning:
libcst/display/graphviz.py 97.61% 1 Missing :warning:
libcst/display/tests/test_dump_graphviz.py 97.67% 1 Missing :warning:
libcst/display/text.py 97.56% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1147 +/- ## ========================================== + Coverage 91.15% 91.16% +0.01% ========================================== Files 258 262 +4 Lines 26689 26784 +95 ========================================== + Hits 24328 24418 +90 - Misses 2361 2366 +5 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

zsol commented 6 months ago

I like the idea of visualizing the CST, I'd definitely like that to be part of LibCST itself. I think static representations (like a text dump or even an image produced by graphviz) have the same general limitation: they quickly get too large on any real world Python code. Interactive visualizations (like https://libcst.me that @aleivag and I hacked together) have a way of cutting away unnecessary complexity, and are a much better way of getting started for people unfamiliar with syntax trees IMO. I'd be open to adding something like that to LibCST as an installable extra to keep dependencies minimal.

As for this specific PR, I'm happy to merge it if you spend the energy cleaning it up! I think it'd make sense for the dot output to be a flag to the existing dump tool instead of a new subcommand.

zaicruvoir1rominet commented 6 months ago

About the PR

I think it'd make sense for the dot output to be a flag to the existing dump tool instead of a new subcommand.

Done

Also, did I break CI Coverage in a way ?

About the rest

I think static representations (like a text dump or even an image produced by graphviz) have the same general limitation

Entirely true.
The main goal of this visual output is to make people realize they are working on a graph for the first couple of small examples ;
before switching to the text only repr.

Interactive visualizations (like https://libcst.me/ that @aleivag and I hacked together)

This is awesome !
It would be great to include that somewhere in the documentation, even if it's experimental/not-finished/hacked, it's already great.

If I may add a little something, it would be great if nodes could be collapsed, to be able to focus on essential parts (though I don't know how hard it is to implement)

zsol commented 6 months ago

PR looks great, thank you!

did I break CI Coverage in a way ?

No, the codecov integration in this repo is barely limping along. I haven't had time to investigate yet

This is awesome !

Thanks <3

It would be great to include that somewhere in the documentation, even if it's experimental/not-finished/hacked, it's already great.

It's using a very old version of libcst (0.3.x) and I don't really want to mislead people. I've built libcst binary wheels for pyodide now, so when pyscript upgrades its pyodide version we'll be able to use a newer version of libcst, and then we can include it in the docs. Likely a couple of months out.

If I may add a little something, it would be great if nodes could be collapsed, to be able to focus on essential parts (though I don't know how hard it is to implement)

Agreed. File an issue at https://github.com/aleivag/libcst-sandbox ?