github / stack-graphs

Rust implementation of stack graphs
https://docs.rs/stack-graphs/*/stack_graphs/
Apache License 2.0
712 stars 122 forks source link

Modeling nested scoping #276

Open hendrikvanantwerpen opened 1 year ago

hendrikvanantwerpen commented 1 year ago

Problem

How to model nested scoping in stack graphs? The problematic pattern is illustrated by the following lambda term:

(\x\y.x) t t'

The question is what the stack graph model is that ensures the result of this whole term is t.

This problem occurs in dynamic languages with nested functions where we model data flow (e.g., JavaScript & Python), as well as in static languages with nested type abstractions (i.e., TypeScript & Java).

MWE

I created a minimal(-ish) working example using JavaScript that illustrates this problem, which can be found in tree-sitter-code-nav/tree/mwe-nested-scoping. The JavaScript rules are stripped down to set that implements a lambda calculus. The examples directory contains several examples of different cases. The file id.js defines an id function, which works as expected. The file const.js defines a cnst function with a nested function which exhibits the problem. For comparison, there is also nested-calls.js, which does a regular function call from a function body.

The JavaScript program exhibiting the problem:

let o1 = {a:{}};
let o2 = {b:{}};

let cnst = function (x) {
    return function (y) {
        return x;
    }
}

cnst(o1)(o2).a;

The ccorresponding stack graph: image

Discussion

The core of the problem is that this kind of scoping results in two consecutive scoped pop nodes. The second pop replaces the scope stack coming from the first pop, which is lost. However, because the inner function can see the enclosing function, a reference the parameter of the outer function (y in our example) steps out of its own scope into the surrounding scope. In this surrounding scope, the the scope stack from the first push should be used for any jumps, instead of the scope stack resulting from the pop of the inner function.

In the MWE, the point where we step out of the inner scope is marked by a DROP-ONE node (here in the rules). However, this does not have the intended behavior. It pops the top of the scope stack, but it should instead restore the outer scope stack somehow.

Possible Solution

I have sketched my idea of using a stack of scope stacks, instead of a single stack. The top of that scope stack stack behaves as the current scope stack, but the remainder of the scope stack stack is captured on push and restored on jump, so that we always restore the complete context as it was at the push. A drop simply drops the top, going to the scope context as it was before the last push (which corresponds to stepping out of the scope of the nested function into the enclosing function).

Edit: Instead of scope stack stack, I am now calling it a scope context, which contains scope stacks.

Stack graph rules: stack-graphs.pdf

Rules applied to the example to illustrate the effect during resolution: paths-for-example.pdf

hendrikvanantwerpen commented 1 year ago

Original comment from @jorendorff:

In one style of simple lambda calculus interpreter, at every moment, there is an environment, or scope chain, a linked list of names and values. This list reflects the lexical nesting of scopes (lambda forms) in the language.

Also, there is a call stack that reflects the run-time nesting of function calls in the language. Each stack frame has its own scope chain.

Every environment can contain many values, and the values in turn can contain environments (because lambdas capture the current environment). In general, the program state is at least a binary tree. If the stack of stacks here is meant to represent program state, I doubt it's up to implementing Scheme.