tophat / codewatch

[deprecated] Monitor and manage deeply customizable metrics about your python code using ASTs
https://codewatch.io
Apache License 2.0

Keeping an `ast.NodeVisitor`-compatible API may have performance limitations #14

Open cabiad opened 5 years ago

cabiad commented 5 years ago

See https://github.com/tophat/codewatch/pull/1#discussion_r233602516

Another problem with the NodeVisitor API is that it couples registering a visitor subclass (a batch of related node visitors) to walking (and visiting) the AST.

It's not clear to me whether re-walking each AST for each subclass would scale reasonably if we had dozens or hundreds of registered subclasses. In other words, would total execution time for a given set of ASTs grow linearly with the number of subclasses, or would some level of caching dominate, making the subsequent AST walks negligible?

If we see performance problems, a clear optimization path to consider is to walk each AST once and, at every node, call each visitor method registered for that node type.
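
A minimal sketch of that walk-once idea, using the standard library `ast` module rather than codewatch's actual internals. `SingleWalkDispatcher`, `FunctionCounter`, and `ImportCollector` are hypothetical names made up for illustration; only the `visit_<NodeType>` naming convention is borrowed from `ast.NodeVisitor`.

```python
import ast
from collections import defaultdict


class SingleWalkDispatcher:
    """Walk an AST once and fan each node out to every registered handler."""

    def __init__(self):
        # Maps a node class name (e.g. "FunctionDef") to a list of bound methods.
        self._handlers = defaultdict(list)

    def register(self, visitor):
        # Collect every visit_<NodeType> method found on the visitor object.
        for name in dir(visitor):
            if name.startswith("visit_"):
                node_type = name[len("visit_"):]
                self._handlers[node_type].append(getattr(visitor, name))

    def walk(self, tree):
        # ast.walk yields each node in the tree exactly once.
        for node in ast.walk(tree):
            for handler in self._handlers.get(type(node).__name__, ()):
                handler(node)


# Hypothetical visitors: plain classes, not ast.NodeVisitor subclasses.
class FunctionCounter:
    def __init__(self):
        self.count = 0

    def visit_FunctionDef(self, node):
        self.count += 1


class ImportCollector:
    def __init__(self):
        self.modules = []

    def visit_Import(self, node):
        self.modules.extend(alias.name for alias in node.names)


source = "import os\nimport sys\n\ndef f():\n    pass\n\ndef g():\n    pass\n"
functions, imports = FunctionCounter(), ImportCollector()

dispatcher = SingleWalkDispatcher()
dispatcher.register(functions)
dispatcher.register(imports)
dispatcher.walk(ast.parse(source))  # one traversal serves both visitors
```

One trade-off of this shape: visitors no longer control descent the way `generic_visit` allows in `ast.NodeVisitor`, so a visitor that needs to skip a subtree would need some other mechanism.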

cabiad commented 5 years ago

A flame graph suggests that `astroid.parse()` is the primary slowdown. Walking the tree multiple times is not the slowest part of the process.
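
For anyone who wants to sanity-check the parse-versus-walk split without a flame graph, here is a rough timing sketch. It assumes the standard library `ast.parse` and `ast.walk` as stand-ins, so absolute numbers will differ from `astroid`; `time_parse_and_walks` and the sample source are hypothetical.

```python
import ast
import time


def time_parse_and_walks(source, n_walks):
    """Return (parse_seconds, walk_seconds) for one parse and n_walks traversals."""
    start = time.perf_counter()
    tree = ast.parse(source)  # stand-in for astroid.parse()
    parse_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(n_walks):
        # Visit every node without doing any work, to isolate traversal cost.
        for _node in ast.walk(tree):
            pass
    walk_seconds = time.perf_counter() - start

    return parse_seconds, walk_seconds


# Synthetic input: 200 small function definitions.
sample = "def f(x):\n    return x + 1\n" * 200
parse_s, walk_s = time_parse_and_walks(sample, n_walks=10)
print(f"parse: {parse_s:.4f}s, 10 walks: {walk_s:.4f}s")
```

This only measures bare traversal; real visitors do work per node, so a profiler run against the actual codebase remains the better guide.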