github / semantic

Parsing, analyzing, and comparing source code across many languages

Scope graph testing #81

Open rewinfrey opened 5 years ago

rewinfrey commented 5 years ago

Scope graphs hold a treasure trove of information about the programs we analyze. For name resolution goals like jump to definition or find all references, scope graphs give us a convenient and powerful way of answering questions such as "what is this name's value?", "where does this name occur?", and "how is this name in scope?".

Beyond making this data available through services like aleph and alephd via semantic (and soon semantic-core), we also want the scope graphs we produce to be as correct as possible. One of the best ways to ensure that is to test and analyze them.

Static testing

For static testing of scope graphs, I propose setting up one or more test harnesses that check the correctness and structure of scope graphs at different levels of granularity:

Property testing

Now that we have generators for scope graphs, we can verify certain properties of scope graphs in relation to storage:

Property testing could also be useful for functions operating over scope graphs (e.g. declaration lookup using a path predicate function).
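As a rough illustration of what a property over declaration lookup could look like, here is a minimal sketch. Everything in it is hypothetical: `Scope`, `EdgeLabel`, `lookupDeclaration`, and the properties are a simplified stand-in model, not the types or functions in semantic. To stay dependency-free it checks the properties exhaustively over all small two-scope graphs instead of using generators; a real harness would presumably plug the same properties into the existing scope graph generators.

```haskell
module Main (main) where

import Data.List (subsequences)
import qualified Data.Map.Strict as Map
import Data.Maybe (isJust)

-- Hypothetical, simplified model; NOT the types used in semantic.
data EdgeLabel = Lexical | Import
  deriving (Eq, Show)

type ScopeId = Int

data Scope = Scope
  { declarations :: [String]
  , edges        :: [(EdgeLabel, ScopeId)]
  }

type ScopeGraph = Map.Map ScopeId Scope

-- | Depth-first search for a declaration of a name, accepting only paths
-- whose edge labels (in traversal order) satisfy the path predicate.
-- Returns the declaring scope and the path that reached it.
lookupDeclaration
  :: ([EdgeLabel] -> Bool) -> ScopeGraph -> ScopeId -> String
  -> Maybe (ScopeId, [EdgeLabel])
lookupDeclaration allowed graph start name = go [] [] start
  where
    go visited path scopeId
      | scopeId `elem` visited = Nothing -- do not loop on cyclic graphs
      | otherwise = do
          scope <- Map.lookup scopeId graph
          if name `elem` declarations scope && allowed path
            then Just (scopeId, path)
            else firstJust
              [ go (scopeId : visited) (path ++ [label]) next
              | (label, next) <- edges scope
              ]
    firstJust results = case [r | Just r <- results] of
      (r : _) -> Just r
      []      -> Nothing

-- | Every graph with scopes 0 and 1, an optional declaration of "x" in
-- each scope, and any subset of the possible labelled edges.
smallGraphs :: [ScopeGraph]
smallGraphs =
  [ Map.fromList [(0, s0), (1, s1)] | s0 <- scopes, s1 <- scopes ]
  where
    scopes   = [ Scope ds es | ds <- [[], ["x"]], es <- subsequences allEdges ]
    allEdges = [ (l, t) | l <- [Lexical, Import], t <- [0, 1] ]

predicates :: [[EdgeLabel] -> Bool]
predicates = [const True, all (== Lexical), null]

-- | A successful lookup must (1) return a path the predicate accepts,
-- (2) land in a scope that declares the name, and (3) also succeed
-- under the most permissive predicate.
prop :: ([EdgeLabel] -> Bool) -> ScopeGraph -> ScopeId -> Bool
prop allowed graph start =
  case lookupDeclaration allowed graph start "x" of
    Nothing -> True
    Just (found, path) ->
      allowed path
        && maybe False (elem "x" . declarations) (Map.lookup found graph)
        && isJust (lookupDeclaration (const True) graph start "x")

-- | Scope 0 declares nothing and has a lexical edge to scope 1, which declares "x".
exampleGraph :: ScopeGraph
exampleGraph = Map.fromList
  [ (0, Scope [] [(Lexical, 1)])
  , (1, Scope ["x"] [])
  ]

main :: IO ()
main = do
  print (lookupDeclaration (all (== Lexical)) exampleGraph 0 "x")
  let ok = and [ prop p g s | p <- predicates, g <- smallGraphs, s <- [0, 1] ]
  putStrLn (if ok then "all properties hold" else "property violated")
```

The third property is the interesting one for path predicates: a stricter predicate should never find a declaration that the unrestricted lookup misses.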


Ideally, any test harness could be migrated to semantic-core with minimal effort. For static tests whose source is surface-level programming language syntax, the scope graph produced by semantic-core might be shaped slightly differently from the one produced by semantic, but the test harness approach should still be viable.

rewinfrey commented 5 years ago

Following up after a conversation with @robrix, the goals for testing scope graphs in relation to semantic-core can be roughly thought of in the following segments:

Splitting the testing focus across the intermediate stages should give us more confidence in the correctness of each stage of the surface language -> scope graph pipeline, and lets us test code in isolation, where it is easier to reason about. It also reduces duplication of fixtures. In contrast, testing surface language -> scope graph directly is problematic for a few reasons:

For property testing, we will likely have more success property testing operations on scope graphs. The operations in Data.Abstract.ScopeGraph are prime targets for property tests.
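To make "property testing operations" a bit more concrete, here is a small sketch of the kind of algebraic laws one might state for an insertion-style operation. The `declare` and `declaredIn` functions and the `Map`-of-`Set` representation are assumptions made up for this example; they are not the API of Data.Abstract.ScopeGraph. The laws are checked over a handful of fixed inputs only to keep the sketch dependency-free.

```haskell
module Main (main) where

import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical model: each scope maps to the set of names declared in it.
type ScopeId = Int
type ScopeGraph = Map.Map ScopeId (Set.Set String)

-- | Hypothetical insertion operation: record a declaration in a scope.
declare :: ScopeId -> String -> ScopeGraph -> ScopeGraph
declare scope name = Map.insertWith Set.union scope (Set.singleton name)

-- | Is the name declared directly in the given scope?
declaredIn :: ScopeId -> String -> ScopeGraph -> Bool
declaredIn scope name graph =
  maybe False (Set.member name) (Map.lookup scope graph)

-- A few fixed inputs standing in for generated ones.
graphs :: [ScopeGraph]
graphs =
  [ Map.empty
  , Map.fromList [(0, Set.fromList ["x"])]
  , Map.fromList [(0, Set.fromList ["y"]), (1, Set.fromList ["x"])]
  ]

scopeIds :: [ScopeId]
scopeIds = [0, 1, 2]

names :: [String]
names = ["x", "y"]

-- | Declaring a name makes it visible in that scope.
declareThenLookup :: Bool
declareThenLookup =
  and [ declaredIn s n (declare s n g) | g <- graphs, s <- scopeIds, n <- names ]

-- | Declaring the same name twice is the same as declaring it once.
declareIdempotent :: Bool
declareIdempotent =
  and [ declare s n (declare s n g) == declare s n g
      | g <- graphs, s <- scopeIds, n <- names ]

-- | The order of two declarations does not matter.
declareCommutes :: Bool
declareCommutes =
  and [ declare s1 n1 (declare s2 n2 g) == declare s2 n2 (declare s1 n1 g)
      | g <- graphs, s1 <- scopeIds, n1 <- names, s2 <- scopeIds, n2 <- names ]

main :: IO ()
main = print (declareThenLookup && declareIdempotent && declareCommutes)
```

Laws like these do not depend on any surface language, so they would not need per-language fixtures.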

🎩 to @robrix for helping think through this as we move towards a semantic-core future, and what it means for generating and testing scope graphs 🙇