scalacenter / tasty-query

Apache License 2.0

TASTy Query

TASTy Query is a compiler-independent library to semantically analyze TASTy - an intermediate representation of Scala 3 code.

It scans the classpath to build a map of all definitions in a project, whether they are defined in Java, Scala 2 or Scala 3. Its API allows users to query semantic information about the project.

In addition, for Scala 3 code, it provides access to the full Trees, allowing for deeper inspection of the code.

Usage

Add the following dependency to your build:

libraryDependencies += "ch.epfl.scala" %% "tasty-query" % "<latest-version>"
// or %%% from Scala.js

You can find the latest release in the Releases list on GitHub.

Head over to the latest API docs to see what's available.

To get started, create a Classpath using ClasspathLoaders.read. Note that TASTy Query requires every class, including those of the JRE, to be explicitly added to the classpath. On JDK versions >= 9, the JRE classpath can be obtained on the JVM platform with FileSystems.getFileSystem(java.net.URI.create("jrt:/")).getPath("modules", "java.base"). On the JavaScript platform, the contents of this path need to be saved as a JAR file in the real file system.

Then, create a Context object using Context.initialize(classpath). From there, follow the available methods to access Symbols, Types and Trees.
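As a minimal sketch of the classpath step on the JVM, the snippet below only assembles the list of paths, using nothing but the JDK. The object name `ClasspathEntries` and its methods are hypothetical helpers introduced for illustration; they are not part of tasty-query.

```scala
import java.net.URI
import java.nio.file.{FileSystems, Files, Path, Paths}

// Hypothetical helper, not part of tasty-query: it only collects paths.
object ClasspathEntries {

  /** The `java.base` module of the running JDK (requires JDK >= 9). */
  def javaBase(): Path =
    FileSystems.getFileSystem(URI.create("jrt:/")).getPath("modules", "java.base")

  /** Splits a classpath string on the platform separator and keeps
    * only the entries that exist on disk.
    */
  def parseClasspath(cp: String): List[Path] =
    cp.split(java.io.File.pathSeparator).toList
      .filter(_.nonEmpty)
      .map(Paths.get(_))
      .filter(Files.exists(_))

  /** JRE classes first, followed by the project's own entries. */
  def allEntries(projectClasspath: String): List[Path] =
    javaBase() :: parseClasspath(projectClasspath)
}
```

The resulting list is what would be passed to ClasspathLoaders.read, and the classpath it returns to Context.initialize(classpath). The exact import paths and signatures of those two calls are not shown here and should be checked against the API docs.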

Motivation

Preamble: TASTy and separate compilation

When compiling a codebase, a Scala compiler combines the source files of a project with the "binaries" of its dependencies. Since the project source files can reference symbols (all kinds of definitions in a Scala program, like vals and classes) from the dependencies, the compiler builds a representation of the dependencies.

In Scala 2, this representation is limited to public definitions and their API-level types (also protected and package-private, but for the intent of this explanation, we refer to all of that as "public"). This information is stored in Scala 2 class files as "pickles". The bodies of methods are excluded from this treatment.

In Scala 3, this representation is stored in so-called TASTy files. They contain the complete AST of the program, with all their public definitions and their bodies. Or to be precise, enough information is stored to be able to reconstruct everything (in particular, the types of subexpressions in method bodies).

This representation is used to type-check (and further "elaborate") the project source files. In Scala 3, it is also used to process inline defs, which is why the full trees for method bodies are required.

Definitions

Reading TASTy files

TASTy files are complex beasts. They represent everything there is to know about the semantics of a Scala 3 program, as well as some non-semantic information (like positions). Whereas Java class files can be read in isolation, and processed to some extent without knowing the classpath context, it is virtually impossible to make any sense of TASTy files without a full classpath. There are several reasons for that.

Reading TASTy files is therefore complex, and requires a full classpath to make sense of it. The product is a multi-dimensional graph involving symbols, trees and types.

tasty-query's job is to read TASTy files for you, and present the information it contains (explicitly or implicitly) in as convenient a way as possible.

Use cases

TASTy-MiMa

In order to perform its job, TASTy-MiMa needs to load TASTy files and extract semantic information out of them. This is precisely what tasty-query is built for.

Like MiMa, TASTy-MiMa faces a particular challenge that the compiler itself is not built to handle: the ability to load symbols and types from different classpaths and somehow relate them. This is the most critical reason why we cannot use the compiler's own ability to read TASTy files to implement TASTy-MiMa.

TASTy-MiMa needs the following "features" from tasty-query:

Other use cases

Here are a few other use cases for tasty-query:

The very existence of tasty-query also keeps the compiler honest about what TASTy really is, how it is defined, and what it means. We can build a "TASTy verifier" that only performs type checking, and verify that the compiler's output abides by the rules.

Contributing

Thank you for wanting to contribute! Please read our contributing guide.