What problem are you trying to solve?
To implement tight interop between Rust and JavaScript, both languages need to have the same model of the data. This PR addresses three problems that this requirement introduces:
1. Ensuring that, even across future refactors, the Rust implementation and JavaScript implementation are always in sync (i.e. that they serialize/deserialize to the same v8 object shape).
2. Ensuring high performance serialization/deserialization to/from v8.
3. Doing all of this without requiring contributors to be experts in v8's lower level abstractions.
What is your solution?
1. Unit test canary:
Ensuring parity without codegen is tricky, especially because JavaScript isn't statically typed. For example, imagine that, as part of a large refactor, we add numerous properties to Fix, one of which is a "category" string field. If we implemented this in Rust but forgot to add "category" on the JavaScript side, v8 would happily return a v8::Value (undefined) in response to a lookup of the category property. Depending on how the developer writes their unit tests (and the nature of the business logic), there might be some subtle, uncaught edge cases.
The unit test uses the ddsa_lib Deno extension as a single source of truth.
Note: this isn't 100% robust (e.g. a refactor in Rust could introduce a logic "bug" independent of the JavaScript object's shape), but it serves as a canary that makes a developer explicitly acknowledge the JavaScript drift.
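To make the failure mode above concrete, here is a minimal sketch in plain JavaScript. It is not the unit test from this PR: the Fix fields, the RUST_FIELDS list, and the findDrift helper are all hypothetical names chosen for illustration. It shows that a lookup of a forgotten property silently yields undefined, and how comparing property names in both directions turns that silent drift into an explicit failure.

```javascript
// Hypothetical JavaScript class. Assume "category" was added on the Rust
// side during a refactor but forgotten here.
class Fix {
  constructor(message, edits) {
    this.message = message;
    this.edits = edits;
  }
}

// Hypothetical list of the fields the Rust struct reads and writes.
const RUST_FIELDS = ["message", "edits", "category"];

// Compare the instance's own property names against the expected list,
// reporting drift in both directions.
function findDrift(instance, expectedFields) {
  const actual = Object.keys(instance);
  const actualSet = new Set(actual);
  const expectedSet = new Set(expectedFields);
  return {
    missingInJs: expectedFields.filter((name) => !actualSet.has(name)),
    missingInRust: actual.filter((name) => !expectedSet.has(name)),
  };
}

const fix = new Fix("use const instead of var", []);

// No exception is thrown for the forgotten property: the lookup just
// evaluates to undefined.
const silentlyUndefined = fix.category;

// The canary surfaces the mismatch explicitly.
const drift = findDrift(fix, RUST_FIELDS);
console.log(silentlyUndefined, drift);
```

A check like this only compares names, so it shares the limitation noted above: it cannot catch a logic bug that leaves the object shape unchanged.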
2. Manually-specified serialization/deserialization:
serde_v8 is slow and does not provide APIs to amortize allocations (which we can and should do, because we know the shapes of our objects ahead of time). Furthermore, the Deno team plans to eventually deprecate this crate.
This solution requires that a struct be serialized/deserialized by specifying individual fields. I added some ergonomic functions and a V8Converter trait to simplify this, but the "real" solution is a procedural macro to make this declarative (out of this PR's scope).
3. File organization:
This is still not "solved" in this PR, but I am trying to address it by creating a consistent, logical structure for the file organization. Conceptually, what this PR proposes is:
edit.js - The JavaScript class
edit.rs - Only the Rust code necessary to serialize/deserialize. These files and structs are not used for any business logic (e.g. we convert from EditInstance to the struct internally used for business logic).
This should make the API boundaries clearer (and the git history more readable).
Alternatives considered
Using a generalized deserializer like we currently use via serde_v8 (reason: we don't need to design for handling data of an unknown type).
Having Rust be the single-source-of-truth. Instead of defining classes (e.g. class Fix { ... }) in a JavaScript file, we would define all v8 objects directly via the v8 API. We would then need codegen to produce JavaScript type definitions. (reason: enormous complexity).
Writing a procedural macro to generate high performance serialization/deserialization in a declarative way (reason: out of scope).
What the reviewer should know
This is one of a chain of PRs fully implementing the ddsa_lib.
Edit, Fix, and Violation didn't "need" to be refactored, as their core JavaScript representation is basically the same. However, to keep the library unified, they were re-implemented as JavaScript modules instead of reusing the StellaEdit, StellaFix, and StellaError functions.
These structs were chosen to keep this PR small, because they are implemented almost identically to their stella.js counterparts. This is in contrast to other structs like TreeSitterNode and Capture, which need to be high performance, and so require a lot of under-the-hood refactoring on the Rust side of things.