DataDog / datadog-static-analyzer

Datadog Static Analyzer
https://docs.datadoghq.com/static_analysis/
Apache License 2.0
100 stars 12 forks source link

[STAL-2591] Add ability to look up a node's parent from JavaScript #471

Closed jasonforal closed 2 months ago

jasonforal commented 2 months ago

What problem are you trying to solve?

The JavaScript runtime needs to be able to look up the parent of a CST node.

What is your solution?

Per-tree (parent <> child) relationship caching

Tree-sitter nodes do not contain a link to their parent, but they do contain a link to their children (via the subtree pointer). Thus, to get the parent of an arbitrary Node A, we must traverse down the tree from the root until we reach a node that has Node A as a child.

An expected access pattern is for the JavaScript runtime to request a traversal up the tree (i.e. request a node's parent, and then its parent's parent, and so on). For example:

// Find the function containing this `varDeclNode`
let node = varDeclNode;
while (node.type !== "function_declaration") {
    node = ddsa.getParent(node);
}

Needing to handle this access pattern performantly necessitates a cache. If we naively traverse from the root down to the parent each time, we end up with O(D^2) time complexity (where D is the depth of the original node within the tree). Trees can get quite deep, and so this can have a noticeable impact on performance. By trading off space (40 bytes per child -> parent relationship), we bring that time complexity down to O(D).

The cache logic is extensively tested.

OpSafeRawTSNode: Safety

We need another way of creating a RawTSNode so that it can be accessed in tree_sitter::Node form via a deno op. We previously could only do this by reading a RawTSNode from the TsNodeBridge (which is guaranteed safe). However, that isn't applicable here because we don't (and don't want to) preemptively serialize the nodes to v8 (i.e. they are not already contained in the bridge). We address this by implementing the cache on ddsa_lib::RootContext, and then making a small addition to OpSafeRawTSNode to uphold the safety guarantee. Added documentation on safety is here and here.

ddsa JavaScript API

The last thing this PR implements is the actual deno op and JS bindings. A rule can now call:

const parent = ddsa.getParent(node);

This functionality will be used in future additions to the JavaScript runtime.

Alternatives considered

What the reviewer should know