pxp-lang / pxp

A suite of high-performance tools for PHP developers – includes a code formatter, static analyser, language server and superset language.
https://pxplang.org

Introduce a "Type Mapper" component #27

Closed ryangjchandler closed 9 months ago

ryangjchandler commented 11 months ago

In order to perform static analysis of types, provide autocompletion suggestions in the language server, and even transpile some more complex features in the superset language, we need to be able to gather type information for a given program or file.

Now that #25 has been completed, we can start to implement a "type mapper" component. The goal of this component is to traverse an AST and build up a map of type information.

As an example, take the following code.

use App\Models\User;

function user(): ?User
{
    return Auth::user();
}

$user = user();

The type mapper would do the following things for this code:

  1. Walk the AST, going through all of the statements and expressions.
  2. When it gets to an "interesting" node, it will try to evaluate the types found inside of that node.
  3. The user() function here has a declared return type, but that is already stored in the Index, so the type mapper doesn't need to do anything with it. What the type mapper does care about is the type of the values the function actually returns.
  4. Since AST traversal is top-down, we can say that return statements are "interesting". When we reach one inside of the user() function, we can do some type inference / deduction to figure out what type of value that is returning.
  5. That deduction result can be stored in some sort of map, perhaps types: HashMap<NodeId, Type>, or a more specific return_types: HashMap<NodeId, (Span, Type)>.
  6. When the type checker later needs to check that function, it should look in the return_types map, get all of the return types recorded for the NodeId of the function itself, then check that each one is compatible with the declared return type.
  7. When it gets to the $user variable assignment, we could store the return type of user() in the type map too, as well as the actual type of $user. PHPStan does a similar thing for type caching, but instead of using IDs it serialises the Expr into a string and uses that as the array key. That way, for multiple checks on user(), it only has to do the work once (assuming the Expr is actually doing the same thing).
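
To make steps 4 to 6 a bit more concrete, here is a rough sketch of how the return-type part could look. Everything in it (Node, Expr, Type, Span, TypeMapper, infer, is_compatible, mismatched_returns) is a hypothetical stand-in rather than PXP's real AST or API. It also assumes return types are grouped per function in a Vec, and it only infers `null` and `new Foo` expressions, since resolving something like Auth::user() would need a lookup in the Index.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for illustration only, not PXP's actual types.
type NodeId = u32;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

#[derive(Debug, Clone, PartialEq)]
enum Type {
    Null,
    Named(String),
    Nullable(Box<Type>),
}

// A drastically simplified AST: functions containing return statements.
enum Node {
    Function { id: NodeId, body: Vec<Node> },
    Return { span: Span, value: Expr },
}

enum Expr {
    Null,
    New(String), // e.g. `new User`
}

#[derive(Default)]
struct TypeMapper {
    // Inferred return types, grouped by the enclosing function's NodeId.
    return_types: HashMap<NodeId, Vec<(Span, Type)>>,
}

impl TypeMapper {
    // Top-down walk; `enclosing` tracks which function we are inside.
    fn walk(&mut self, node: &Node, enclosing: Option<NodeId>) {
        match node {
            Node::Function { id, body } => {
                for child in body {
                    self.walk(child, Some(*id));
                }
            }
            // Return statements are the "interesting" nodes here.
            Node::Return { span, value } => {
                if let Some(function) = enclosing {
                    let entry = self.return_types.entry(function).or_default();
                    entry.push((*span, infer(value)));
                }
            }
        }
    }
}

// Placeholder inference; a real one would consult the Index for calls.
fn infer(expr: &Expr) -> Type {
    match expr {
        Expr::Null => Type::Null,
        Expr::New(class) => Type::Named(class.clone()),
    }
}

fn is_compatible(declared: &Type, actual: &Type) -> bool {
    match (declared, actual) {
        (Type::Nullable(d), Type::Nullable(a)) => is_compatible(d, a),
        (Type::Nullable(_), Type::Null) => true,
        (Type::Nullable(inner), other) => is_compatible(inner, other),
        (d, a) => d == a,
    }
}

// What a type checker might do: collect spans of incompatible returns.
fn mismatched_returns(mapper: &TypeMapper, function: NodeId, declared: &Type) -> Vec<Span> {
    mapper
        .return_types
        .get(&function)
        .into_iter()
        .flatten()
        .filter(|(_, ty)| !is_compatible(declared, ty))
        .map(|(span, _)| *span)
        .collect()
}

// Models a function declared as `?User` that returns null, a User and a Post.
fn example_program() -> Node {
    Node::Function {
        id: 1,
        body: vec![
            Node::Return { span: Span { start: 40, end: 52 }, value: Expr::Null },
            Node::Return { span: Span { start: 60, end: 76 }, value: Expr::New("User".to_string()) },
            Node::Return { span: Span { start: 84, end: 100 }, value: Expr::New("Post".to_string()) },
        ],
    }
}

fn main() {
    let mut mapper = TypeMapper::default();
    mapper.walk(&example_program(), None);

    let declared = Type::Nullable(Box::new(Type::Named("User".to_string())));
    for span in mismatched_returns(&mapper, 1, &declared) {
        println!("incompatible return at {}..{}", span.start, span.end);
    }
}
```

Keeping the Span next to each Type means the checker can point at the exact return statement that is wrong without going back to the AST.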

Yes, this does involve traversing the AST more than once just to analyse the file. But the benefit here is that the type deduction logic can be re-used in the static analyser and language server using a simple NodeId as the identifier.
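
As a sketch of that re-use, the map could be built once and then handed to both consumers, each of which only needs a NodeId to ask its question. The names here (TypeMap, hover, may_be_null, example_map) and the node IDs 10 and 11 are made up for illustration and are not an existing PXP API.

```rust
use std::collections::HashMap;

// Hypothetical names throughout; PXP's real types may look different.
type NodeId = u32;

#[derive(Debug, Clone, PartialEq)]
enum Type {
    Named(String),
    Nullable(Box<Type>),
}

impl Type {
    // Render the type the way it would be written in PHP source.
    fn display(&self) -> String {
        match self {
            Type::Named(name) => name.clone(),
            Type::Nullable(inner) => format!("?{}", inner.display()),
        }
    }
}

// Produced once by the type mapper pass, then shared read-only.
struct TypeMap {
    types: HashMap<NodeId, Type>,
}

// Language server consumer: hover text for the node under the cursor.
fn hover(map: &TypeMap, node: NodeId) -> Option<String> {
    map.types.get(&node).map(Type::display)
}

// Static analyser consumer: is the value at this node possibly null?
fn may_be_null(map: &TypeMap, node: NodeId) -> bool {
    matches!(map.types.get(&node), Some(Type::Nullable(_)))
}

// Pretend result of mapping `$user = user();` from the example above.
fn example_map() -> TypeMap {
    let user = Type::Nullable(Box::new(Type::Named("App\\Models\\User".to_string())));
    let mut types = HashMap::new();
    types.insert(10, user.clone()); // the `user()` call expression
    types.insert(11, user); // the `$user` variable
    TypeMap { types }
}

fn main() {
    let map = example_map();
    println!("{:?}", hover(&map, 11));
    println!("{}", may_be_null(&map, 11));
}
```

Neither consumer has to know how the type was deduced, which is the main payoff for the extra traversal.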