Scope graph testing

Scope graphs hold a treasure trove of information about the programs we analyze. For name resolution goals like jump to definition or find all references, scope graphs give us a convenient and powerful way to answer questions such as "what is this name's value?", "where does this name occur?" and "how is this name in scope?".

However, in addition to making this data available through services like aleph and alephd via semantic and soon semantic-core, we also want to ensure the scope graphs we produce are as correct as possible. One of the best ways to do that is to test and analyze them.

Static testing

For static testing of scope graphs, I propose setting up one or more test harnesses that check the correctness and structure of scope graphs at different levels of granularity:

Specific syntax per language - this can help us test special edge cases with targeted, isolated examples
Blob level (i.e. file level) - we may be able to reuse existing fixtures for these.
Project level (i.e. repo) - service level (based on the data provided in a request to alephd, for example)
Project level via commit SHA
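
As a rough illustration of what such a harness could look like, here is a minimal sketch of fixture-driven static tests tagged by granularity. Everything in it is an assumption made for illustration: `ScopeGraph`, `buildGraph`, `Fixture`, and the toy `def`/`use` syntax are hypothetical stand-ins, not the types or functions in semantic.

```haskell
import Data.List (sort)

-- Hypothetical, simplified stand-ins; not the types used by semantic.
type Name = String

data ScopeGraph = ScopeGraph
  { declarations :: [Name]
  , references   :: [Name]
  } deriving (Eq, Show)

-- The level of granularity a fixture exercises.
data Granularity = Syntax | Blob | Project
  deriving (Eq, Show)

data Fixture = Fixture
  { level    :: Granularity
  , label    :: String
  , source   :: String
  , expected :: ScopeGraph
  }

-- Toy analysis (an assumption): a line `def x` declares x,
-- a line `use x` references x, and any other line is ignored.
buildGraph :: String -> ScopeGraph
buildGraph src = ScopeGraph (sort decls) (sort refs)
  where
    ws    = map words (lines src)
    decls = [n | ["def", n] <- ws]
    refs  = [n | ["use", n] <- ws]

fixtures :: [Fixture]
fixtures =
  [ Fixture Syntax "single declaration" "def x" (ScopeGraph ["x"] [])
  , Fixture Syntax "declaration and reference" "def x\nuse x"
      (ScopeGraph ["x"] ["x"])
  , Fixture Blob "file with two declarations" "def f\ndef g\nuse f\nuse f"
      (ScopeGraph ["f", "g"] ["f", "f"])
  ]

-- Labels of the fixtures whose produced graph differs from the expected one.
failures :: [Fixture] -> [String]
failures fs = [label f | f <- fs, buildGraph (source f) /= expected f]
```

Loading this in GHCi and evaluating `failures fixtures` lists the labels of any failing fixtures. Tagging each fixture with its granularity would let a harness run or report the levels separately.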

Property testing

Now that we have generators for scope graphs, we can verify certain properties of scope graphs in relation to storage:

Inserting named declarations
Inserting named references

Property testing could also be useful for functions operating over scope graphs (e.g. declaration lookup using a path predicate function).
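
To make the idea concrete, here is a minimal sketch of two insertion properties and a predicate-based lookup. It is written against a hypothetical, simplified `ScopeGraph`, not the one in semantic, and it checks the properties exhaustively over a small hand-enumerated sample so that it needs only `base`. A real test would presumably draw inputs from the existing scope graph generators via a property-testing library instead.

```haskell
import Data.List (find, nub, subsequences)

-- Hypothetical, simplified scope graph; not the type used by semantic.
type Name = String

data ScopeGraph = ScopeGraph
  { decls :: [Name]
  , refs  :: [Name]
  } deriving (Eq, Show)

insertDeclaration :: Name -> ScopeGraph -> ScopeGraph
insertDeclaration n g = g { decls = nub (n : decls g) }

insertReference :: Name -> ScopeGraph -> ScopeGraph
insertReference n g = g { refs = nub (n : refs g) }

-- Declaration lookup driven by a predicate, a toy stand-in for
-- lookup using a path predicate function.
lookupDeclaration :: (Name -> Bool) -> ScopeGraph -> Maybe Name
lookupDeclaration p = find p . decls

-- Property: a declaration can be looked up after it is inserted.
propInsertedDeclarationIsFound :: Name -> ScopeGraph -> Bool
propInsertedDeclarationIsFound n g =
  lookupDeclaration (== n) (insertDeclaration n g) == Just n

-- Property: inserting a reference leaves the declarations untouched.
propReferenceLeavesDeclarations :: Name -> ScopeGraph -> Bool
propReferenceLeavesDeclarations n g =
  decls (insertReference n g) == decls g

-- A small, exhaustively enumerated input space (an assumption that
-- replaces random generation in this sketch).
sampleNames :: [Name]
sampleNames = ["a", "b", "c"]

sampleGraphs :: [ScopeGraph]
sampleGraphs =
  [ ScopeGraph ds rs
  | ds <- subsequences sampleNames
  , rs <- subsequences sampleNames
  ]

-- Check a property against every sampled name and graph.
checkAll :: (Name -> ScopeGraph -> Bool) -> Bool
checkAll prop = and [prop n g | n <- sampleNames, g <- sampleGraphs]
```

In GHCi, `checkAll propInsertedDeclarationIsFound` and `checkAll propReferenceLeavesDeclarations` evaluate each property over the whole sample.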

Ideally, any test harness could be migrated to semantic-core with minimal effort. For static tests whose source is surface-level programming language syntax, the scope graph produced by semantic-core might be shaped slightly differently from the one produced by semantic, but the test harness approach should still be viable.

Following up after a conversation with @robrix, the goals for testing scope graphs in relation to semantic-core can be roughly thought of in the following segments:

Possibly testing against surface language -> translation.
Testing against translation -> core.
Testing against core -> scope graph.

Focusing tests on the intermediate stages should give us more confidence in the correctness of each stage of the surface language -> scope graph pipeline, and lets us test code in isolation, where it is easier to reason about. It also reduces duplication of fixtures. In contrast, directly testing against surface language -> scope graph is problematic for a few reasons:

The testing surface is too broad. If any of the intermediate stages change (e.g. translation DSL, or core) the resulting changes to the scope graph structure (if any) can be difficult to reason about. This makes a test failure a poor signal that something is wrong, and can instead create unnecessary CI build failures.
Fixtures at the surface language level are already difficult to control. How do we know that we've exhausted all possible permutations of Python import statements in relation to scope graph generation? Reasoning about changes to the scope graph from a surface language level is hard because of the intermediate stages of translation and core.
Using surface language level tests to catch quirks or corner cases of translation or core requires deep knowledge of the entire stack just to verify some aspect of scope graph generation. That knowledge isn't communicated well by surface language level fixtures (worse, it is completely implicit and easy for others to miss).
The fixtures for each surface language we support become duplicative very quickly if we only want to test the structure of scope graphs.
This duplication is a real maintenance burden and adds friction.
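
The staged alternative can be sketched as one focused check per stage, each with its own small fixture. The sketch below is only an illustration under loud assumptions: `Translated`, `Core`, `ScopeGraph`, and the three stage functions are toy stand-ins invented here, and do not reflect the actual translation DSL, core language, or scope graph types.

```haskell
-- All types and functions here are hypothetical toy stand-ins.
type Name = String

-- Toy output of the surface language -> translation stage.
data Translated = TDef Name | TUse Name
  deriving (Eq, Show)

-- Toy output of the translation -> core stage.
data Core = Let Name | Var Name
  deriving (Eq, Show)

-- Toy output of the core -> scope graph stage.
data ScopeGraph = ScopeGraph
  { decls :: [Name]
  , refs  :: [Name]
  } deriving (Eq, Show)

-- Surface syntax (an assumption): `def x` and `use x`, one per line.
translate :: String -> [Translated]
translate src = concatMap stmt (map words (lines src))
  where
    stmt ["def", n] = [TDef n]
    stmt ["use", n] = [TUse n]
    stmt _          = []

toCore :: [Translated] -> [Core]
toCore = map go
  where
    go (TDef n) = Let n
    go (TUse n) = Var n

toScopeGraph :: [Core] -> ScopeGraph
toScopeGraph cs = ScopeGraph [n | Let n <- cs] [n | Var n <- cs]

-- One check per stage; each fixture is written in that stage's own input,
-- so a change in an earlier stage cannot break a later stage's test.
stageChecks :: [(String, Bool)]
stageChecks =
  [ ( "surface language -> translation"
    , translate "def x\nuse x" == [TDef "x", TUse "x"] )
  , ( "translation -> core"
    , toCore [TDef "x", TUse "x"] == [Let "x", Var "x"] )
  , ( "core -> scope graph"
    , toScopeGraph [Let "x", Var "x"] == ScopeGraph ["x"] ["x"] )
  ]

-- Names of the stages whose check fails.
failingStages :: [String]
failingStages = [name | (name, ok) <- stageChecks, not ok]
```

Because each check names its stage, a failure points at the stage that changed rather than at the pipeline as a whole.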

For property testing, we will likely have more success targeting operations on scope graphs. The operations in Data.Abstract.ScopeGraph are prime targets for property tests.

🎩 to @robrix for helping think through this as we move towards a semantic-core future, and what it means for generating and testing scope graphs 🙇

github / semantic

Scope graph testing #81
