tophat / codewatch

[deprecated] Monitor and manage deeply customizable metrics about your python code using ASTs
https://codewatch.io
Apache License 2.0
38 stars 3 forks source link

Support explicit ordering of visitor functions rather than purely relying on declaration order #84

Open noahnu opened 5 years ago

noahnu commented 5 years ago

Is your feature request related to a problem? Please describe.

Ensuring a strict ordering on the execution of "visitors" is necessary if we want to guarantee deterministic assertions. If visitor_a increments counter_a for some node of type X and visitor_b increments counter_b for node of type X only if counter_a is greater than 0, then changing the ordering that visitor_a and visitor_b are called will change the final value of the computed statistics, meaning assertions may differ from the run that used different ordering.

While typing out this issue I was trying to come up with a good use case for having one visitor function depend on another. I could not. Perhaps someone else can think of an example?

Even if we were to consider this an anti-pattern, we should attempt to minimize possibility of error / flakiness.

Changing to a reducer-style approach for visitors may encourage inter-visitor dependencies as per https://github.com/tophat/codewatch/issues/10.

Currently order of execution of visitors is defined by declaration order, purely b/c we use a central registry with a global array that is appended to whenever the @visit decoration is executed. The use of a global registry seems like an anti-pattern, however in order to remove this registry, we'd need some way to guarantee visitor order (dir(module) sorts alphabetically, while module.__dict__.keys() is only guaranteed to be ordered in py3.6+).

Describe the solution you'd like

A pattern similar to https://pytest-dependency.readthedocs.io/en/latest/about.html#what-is-the-purpose, where you essentially mark the names of tests that must be run first. We can thus use a topological sort of the dependencies.

e.g.

@visit(nodes.FunctionDef)
def some_func(node, .., ..):
  pass

@visit(nodes.FunctionDef, predicate=None, depends_on=['some_func', 'some_other_func'])
def count_funcs(node, .., ..):
  pass

We'd examine "depends_on" to generate a graph and then topological sort. We could sort alphabetically for visitors with no dependencies (or that are tied in sorting order).

Describe alternatives you've considered

Additional context

https://tophat-opensource.slack.com/archives/CE14KJGET/p1543705988030800

francoiscampbell commented 5 years ago

If visitor_a increments counter_a for some node of type X and visitor_b increments counter_b for node of type X only if counter_a is greater than 0, then changing the ordering that visitor_a and visitor_b are called will change the final value of the computed statistics

I can see this being very brittle. A change in visitor_a's behaviour or its removal entirely can completely break visitor_b

noahnu commented 5 years ago

@francoiscampbell yeah. It feels like an anti-pattern. Maybe @lime-green can provide an example of why we want ordering?

lime-green commented 5 years ago

Ordering by declaration order seems the most natural to me, which is what we have right now. I don't think we're looking for inter-dependency between visitors just yet, there hasn't really been a need for it from my usage of the library at least.

IMO should just keep the logic as-is until it's obvious that there's a problem

francoiscampbell commented 5 years ago

Are we talking about the order of all visitors or per-node? Cause I was assuming that the order of all visitors would be the AST depth-first traversal order.

For a single node type, im not opposed to running its visitors in declaration order, although it has the potential to prevent any future changes or optimizations, since the visitor order effectively becomes part of the public API

lime-green commented 5 years ago

@francoiscampbell per-node; it's declaration order right now

francoiscampbell commented 5 years ago

Yeah, that's cool. I'd say let's not make that a guarantee, just an implementation detail for now