What problem are you trying to solve?
Rules in the stella runtime can access the field name of a child node. This is currently unimplemented in ddsa.
What is your solution?
Primer on field names
A tree-sitter node that is a child can be referred to by a field name. However, it's important to clarify that this is not "the node's field name": it is "a field name linked to a node (only because it's the child of another)".
Not every child has a field name. As an example, consider the tree for the following JavaScript:
function echo(a, b, c) {
    // Implementation here
}
In this example, the (function_declaration) node has three children, and they all have field names associated with them:
(identifier), with field name "name"
(formal_parameters), with field name "parameters"
(statement_block), with field name "body".
The (formal_parameters) node likewise has three children, but none of them have field names:
(identifier), with no field name
(identifier), with no field name
(identifier), with no field name
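To make the distinction concrete, here is a minimal sketch that models the example tree as plain objects. It is not the tree-sitter API: the object shape and the `describeChildren` helper are hypothetical and exist only to show that the field name lives on the parent-to-child link, not on the child node itself.

```javascript
// Hypothetical in-memory model of the example tree (not the tree-sitter API).
// Each child entry pairs a node with the field name its PARENT assigns to it.
const identifier = () => ({ type: "identifier", children: [] });

const functionDeclaration = {
  type: "function_declaration",
  children: [
    { fieldName: "name", node: identifier() },
    {
      fieldName: "parameters",
      node: {
        type: "formal_parameters",
        // Same node type as the "name" child above, but no field name here.
        children: [
          { fieldName: null, node: identifier() },
          { fieldName: null, node: identifier() },
          { fieldName: null, node: identifier() },
        ],
      },
    },
    { fieldName: "body", node: { type: "statement_block", children: [] } },
  ],
};

// Describe every child of `parent` in the same wording as the lists above.
function describeChildren(parent) {
  return parent.children.map(({ fieldName, node }) =>
    fieldName === null
      ? `(${node.type}), with no field name`
      : `(${node.type}), with field name "${fieldName}"`
  );
}
```

Note that an (identifier) node carries the field name "name" under (function_declaration) but none under (formal_parameters), which is why the field name cannot be a property of the node type.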
Serialization
ddsa is designed around passing smis (small integers) to v8, and this use case is no different.
When we fetch the children of a node, we now need to pass an additional id for each child so that a rule can look up the field's string name. This is done by passing a tuple for each child: (NodeId, FieldId) instead of just NodeId. tree-sitter itself uses one-based field ids, so we can conveniently use 0 to indicate the absence of a field name.
(The contents of the Uint32Array that is sent to v8)
|          Node A           |          Node B           |
|  NodeId A   |  FieldId A  |  NodeId B   |  FieldId B  |
|_____________|_____________|_____________|_____________|
       0             1             2             3
    32 bits       32 bits       32 bits       32 bits
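The layout above can be sketched as a pair of helpers. This is an illustration of the buffer format only: in ddsa the packing happens before the array reaches v8, and the names `packChildren`, `unpackChildren`, and `NO_FIELD` are hypothetical.

```javascript
// tree-sitter field ids are one-based, so 0 is free to mean "no field name".
const NO_FIELD = 0;

// Pack [{ nodeId, fieldId }] into a flat Uint32Array laid out as
// [nodeId0, fieldId0, nodeId1, fieldId1, ...]. A null/undefined fieldId becomes 0.
function packChildren(children) {
  const buffer = new Uint32Array(children.length * 2);
  children.forEach(({ nodeId, fieldId }, i) => {
    buffer[2 * i] = nodeId;
    buffer[2 * i + 1] = fieldId ?? NO_FIELD;
  });
  return buffer;
}

// Inverse of packChildren: read the buffer two elements at a time.
function unpackChildren(buffer) {
  if (buffer.length % 2 !== 0) {
    throw new RangeError("expected an even number of elements");
  }
  const children = [];
  for (let i = 0; i < buffer.length; i += 2) {
    const fieldId = buffer[i + 1];
    children.push({
      nodeId: buffer[i],
      fieldId: fieldId === NO_FIELD ? null : fieldId,
    });
  }
  return children;
}
```

Because absence is encoded in-band as 0, every child occupies exactly two 32-bit slots and no separate presence flag is needed.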
Deserialization
When a child has a field name, we effectively "wrap" the TreeSitterNode with another object (an instance of the newly introduced TreeSitterFieldChildNode class) that provides the field name. To callers, this has exactly the same interface as a TreeSitterNode, but it additionally exposes fieldName. For a visual representation, when logged with console.log:
Getting field ids
To access the field ids efficiently, we need to perform child lookup manually using the cursor API instead of the provided iterators. What we end up doing here is copying the binding's iterator implementation and adding a call to cursor.field_id() where required.
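The traversal shape can be sketched as follows. The real implementation lives in the binding and uses the actual cursor API; this JavaScript version uses a hypothetical in-memory `MockCursor` (its method names are illustrative, not a real API) purely to show where the field id is read during the walk.

```javascript
// Hypothetical stand-in for a tree cursor positioned on a parent node.
// `children` is [{ nodeId, fieldId }], with fieldId null when there is no field name.
class MockCursor {
  constructor(children) {
    this.children = children;
    this.index = -1;
  }
  // Move to the first child; returns false when there are no children.
  gotoFirstChild() {
    if (this.children.length === 0) return false;
    this.index = 0;
    return true;
  }
  // Move to the next sibling; returns false when already on the last child.
  gotoNextSibling() {
    if (this.index + 1 >= this.children.length) return false;
    this.index += 1;
    return true;
  }
  nodeId() {
    return this.children[this.index].nodeId;
  }
  fieldId() {
    return this.children[this.index].fieldId;
  }
}

// Walk the children manually so that the field id can be read at each step.
// Returns [nodeId, fieldId] pairs, using 0 for "no field name".
function collectChildren(cursor) {
  const pairs = [];
  if (!cursor.gotoFirstChild()) return pairs;
  do {
    pairs.push([cursor.nodeId(), cursor.fieldId() ?? 0]);
  } while (cursor.gotoNextSibling());
  return pairs;
}
```

An iterator that yields only nodes has already moved past the information we need, which is why the walk is done by hand: the field id is a property of the cursor's current position, not of the node it returns.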
Alternatives considered
Instead of wrapping the TreeSitterNode with TreeSitterFieldChildNode, we could do something like export class ChildClass extends TreeSitterNode { /* ... */ } and add a property for the field name there. In theory, that should provide faster property lookup than the current solution because the prototype chain would not need to be walked for every access. I passed on this because my intuition is that the current solution makes the better tradeoff: it avoids duplicate allocations (e.g. of the start object, _cachedText, etc.).
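The tradeoff can be illustrated with a minimal sketch. It assumes the wrapper delegates to the wrapped node through the prototype chain, which is what the lookup-cost remark above suggests; the `TreeSitterNode` below is a simplified stand-in and `wrapWithFieldName` is a hypothetical helper, not the class added in this PR.

```javascript
// Simplified, hypothetical stand-in for the real TreeSitterNode.
class TreeSitterNode {
  constructor(id, type) {
    this.id = id;
    this.type = type;
    this.start = { line: 1, col: 1 }; // allocated once per node instance
  }
}

// Wrapper approach (assumed shape): a new object whose prototype is the
// existing node. Only `fieldName` is stored on the wrapper; everything else
// is found by walking one step up the prototype chain.
function wrapWithFieldName(node, fieldName) {
  return Object.create(node, {
    fieldName: { value: fieldName, enumerable: true },
  });
}

// Alternative approach: a subclass instance that owns all of its properties.
// Lookups hit own properties directly, but `start` is allocated again.
class ChildClass extends TreeSitterNode {
  constructor(id, type, fieldName) {
    super(id, type);
    this.fieldName = fieldName;
  }
}

const node = new TreeSitterNode(7, "identifier");
const wrapped = wrapWithFieldName(node, "name");
const subclassed = new ChildClass(7, "identifier", "name");
```

In this sketch `wrapped.start` is the same object as `node.start`, while `subclassed.start` is a separate allocation; that sharing is the saving the wrapper buys in exchange for the extra prototype hop.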
What the reviewer should know
This provides exactly the same functionality as the stella "fieldName".