DataDog / datadog-static-analyzer

Datadog Static Analyzer
https://docs.datadoghq.com/static_analysis/
Apache License 2.0
[STAL-1960] Introduce bridge design; implement ddsa context #384

Closed jasonforal closed 4 months ago

jasonforal commented 4 months ago

What problem are you trying to solve?

We want to provide access to the tree-sitter tree (among other relevant metadata) to JavaScript rule execution.

What is your solution?

This PR does two main things:

1. Introduce code structure and patterns for Rust <> v8 interop

The code structure has three components:

  1. struct js::{ArbitraryName} - A thin layer over a JavaScript object (already exists).
    • Serializes/deserializes the JavaScript object it represents (which lives in the adjacent file {ArbitraryName}.js).
    • Exposes APIs to manipulate instances of the JavaScript object so that the JavaScript end is opaque to the caller (an example caller is a Bridge (#3)).
    • Tests any relevant {ArbitraryName}.js behavior that isn't straightforward (for example, use of deno ops)
  2. struct ddsa_lib::{ArbitraryName} - The Rust representation of an "ArbitraryName" and its data (new via this PR).
    • Doesn't know/care that it has a complementary v8 representation.
    • Provides an API to manipulate Rust data contained by the struct.
    • Tests any relevant logic about updating Rust data.
  3. struct {ArbitraryNameBridge} - The orchestrator of Rust <> v8 interop (new via this PR).
    • Exposes the public API that the upcoming ddsa::JsRuntime will consume.
    • Under the hood, a bridge is responsible for keeping #1 and #2 in sync, as indicated by the "Linked" abstraction.

It will only be possible to mutate data via a bridge API. With this architecture, #3 is the brains behind the Rust <> v8 interop, and #1 and #2 are ignorant of how they are being used.
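
To make the three roles concrete, here is a minimal sketch of the pattern. It is not the code in this PR: every name in it (JsFileContext, FileContext, FileContextBridge, set_filename) is hypothetical, and a HashMap stands in for the v8 object so that the example compiles without a v8 dependency.

```rust
use std::collections::HashMap;

/// #1 (stand-in): a thin layer over a "JavaScript object".
/// ASSUMPTION: a HashMap plays the role of the v8 object here.
#[derive(Default)]
pub struct JsFileContext {
    fields: HashMap<String, String>,
}

impl JsFileContext {
    /// Writes the `filename` property on the (simulated) JS object.
    pub fn set_filename(&mut self, name: &str) {
        self.fields.insert("filename".to_string(), name.to_string());
    }

    /// Reads the `filename` property back, if it has been set.
    pub fn filename(&self) -> Option<&str> {
        self.fields.get("filename").map(String::as_str)
    }
}

/// #2: plain Rust data. It has no idea a JS counterpart exists.
#[derive(Default)]
pub struct FileContext {
    filename: String,
}

impl FileContext {
    pub fn set_filename(&mut self, name: &str) {
        self.filename = name.to_string();
    }

    pub fn filename(&self) -> &str {
        &self.filename
    }
}

/// #3: the bridge owns both sides and exposes the only mutation path,
/// so the two representations cannot drift apart.
#[derive(Default)]
pub struct FileContextBridge {
    rust: FileContext,
    js: JsFileContext,
}

impl FileContextBridge {
    pub fn new() -> Self {
        Self::default()
    }

    /// Updates the Rust data and the JS object together.
    pub fn set_filename(&mut self, name: &str) {
        self.rust.set_filename(name);
        self.js.set_filename(name);
    }

    pub fn rust_filename(&self) -> &str {
        self.rust.filename()
    }

    pub fn js_filename(&self) -> Option<&str> {
        self.js.filename()
    }
}
```

Because both inner structs are private fields of the bridge, a caller holding a FileContextBridge has no way to update one side without the other.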

2. Implement "Context"

Resulting file organization

├── ddsa_lib
│   ├── bridge
│   │   └── context.rs      <- #3
│   ├── bridge.rs
│   ├── context
│   │   ├── file.rs         <- #2
│   │   ├── root.rs         <- #2
│   │   └── rule.rs         <- #2
│   ├── context.rs          <- #2
│   ├── extension.rs
│   ├── js
│   │   ├── __bootstrap.js
│   │   ├── context_file.js
│   │   ├── context_file.rs <- #1
│   │   ├── context_root.js
│   │   ├── context_root.rs <- #1
│   │   ├── context_rule.js
│   │   ├── context_rule.rs <- #1

Design constraints

The bridge design was chosen because we don't allocate new "Context" v8 objects for every execution (as a more straightforward but less performant solution would). Instead, we reuse a v8 global object (i.e. one that has been leaked and isn't tracked by the v8 garbage collector), so we need to keep state in sync and ensure there are no side effects from one execution to the next. A nice design for this is to have #1 and #2 be "dumb" structs that are operated in conjunction by a bridge and hidden behind its public API.
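
The "no side effects from one execution to the other" constraint can be illustrated with a rough sketch like the one below. It is an assumption-laden illustration, not this PR's implementation: SharedContext, ContextBridge, and with_execution are hypothetical names, and a plain Rust struct stands in for the long-lived v8 global object.

```rust
/// Stand-in for the long-lived object that is allocated once and reused.
/// ASSUMPTION: in the real design this would be a v8 global object.
#[derive(Default)]
pub struct SharedContext {
    filename: Option<String>,
    rule_args: Vec<(String, String)>,
}

impl SharedContext {
    pub fn filename(&self) -> Option<&str> {
        self.filename.as_deref()
    }

    pub fn arg_count(&self) -> usize {
        self.rule_args.len()
    }
}

/// Hypothetical bridge that owns the reused object.
#[derive(Default)]
pub struct ContextBridge {
    shared: SharedContext,
}

impl ContextBridge {
    pub fn new() -> Self {
        Self::default()
    }

    /// Populates the shared object, runs one execution against it, then
    /// clears it so the next execution starts from a clean state.
    /// Note: this sketch does not handle a panic inside `run`.
    pub fn with_execution<R>(
        &mut self,
        filename: &str,
        args: &[(&str, &str)],
        run: impl FnOnce(&SharedContext) -> R,
    ) -> R {
        self.shared.filename = Some(filename.to_string());
        self.shared.rule_args = args
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        let result = run(&self.shared);
        self.clear();
        result
    }

    /// Resets every field in place; the object itself is never reallocated.
    fn clear(&mut self) {
        self.shared.filename = None;
        self.shared.rule_args.clear();
    }

    /// True when no state from a previous execution remains.
    pub fn is_clean(&self) -> bool {
        self.shared.filename.is_none() && self.shared.rule_args.is_empty()
    }
}
```

The caller only ever sees a shared reference to the context for the duration of one execution, which is what keeps the reset logic in a single place.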

Technical notes

Alternatives considered

What the reviewer should know