Open matklad opened 1 year ago
In Djot, building a natural lossless tree would be a bit challenging. The "range of parent is disjoint union of children" invariant won't work for indented constructs like lists or blockquotes:
> ```
> fn main() {
> "hello"
> }
> ```
I think we can invent some representation which would capture this, but it won't be trivial.
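To make the problem concrete, here is a minimal Rust sketch. It assumes nodes are stored as (tag, start offset, end offset, children); `Node`, `tiles`, and `code_block_in_quote` are hypothetical names for illustration and are not taken from any Djot implementation. It also assumes the `> ` markers belong to the blockquote rather than to the code block. Under those assumptions, the children of a blockquoted code block leave gaps, so they cannot tile the parent's range.

```rust
/// Hypothetical lossless-tree node: a tag plus byte offsets into the source.
#[allow(dead_code)] // `tag` is only here for illustration
#[derive(Debug)]
struct Node {
    tag: &'static str,
    start: usize,
    end: usize,
    children: Vec<Node>,
}

/// Returns true if the children's ranges are contiguous, non-overlapping,
/// and together cover exactly `start..end`. Leaves pass trivially.
fn tiles(node: &Node) -> bool {
    if node.children.is_empty() {
        return true;
    }
    let mut pos = node.start;
    for child in &node.children {
        if child.start != pos || child.end < child.start {
            return false;
        }
        pos = child.end;
    }
    pos == node.end
}

/// Builds a code-block node with one child per line. A leading "> " on a
/// line is skipped (assumption: the marker belongs to the blockquote).
fn code_block_in_quote(src: &str) -> Node {
    let mut children = Vec::new();
    let mut offset = 0;
    for line in src.split_inclusive('\n') {
        let marker = if line.starts_with("> ") { 2 } else { 0 };
        children.push(Node {
            tag: "line",
            start: offset + marker,
            end: offset + line.len(),
            children: Vec::new(),
        });
        offset += line.len();
    }
    // The block starts where its first line's content starts.
    let start = children.first().map_or(0, |c| c.start);
    Node { tag: "code_block", start, end: src.len(), children }
}
```

Calling `tiles` on the node built from the blockquoted example above returns `false`, because every `> ` after the first line falls between two children. For the same code block without the blockquote it returns `true`.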
In general, my gut feeling is that our guiding principle should be this:
At the same time, implementations can provide a lossless "stream of matches" which can be used for all purposes where the AST is not enough. Matches are not part of the spec, so implementations can approach the problem differently.
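As a rough illustration of what a lossless stream of matches could look like, here is a Rust sketch. It is not any implementation's actual API: `Match`, `scan_quote_lines`, `reconstruct`, and the tag names `quote_marker` and `text` are all hypothetical. The idea is a flat list of (tag, start, end) entries that covers every byte of the input, markers included, so the source can be rebuilt from the stream alone.

```rust
/// Hypothetical match: a tag plus a byte range into the source.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Match {
    tag: &'static str,
    start: usize,
    end: usize,
}

/// Toy matcher that only understands blockquote markers: emits a
/// `quote_marker` match for a leading "> " and a `text` match for
/// the rest of each line (including its newline).
fn scan_quote_lines(src: &str) -> Vec<Match> {
    let mut out = Vec::new();
    let mut offset = 0;
    for line in src.split_inclusive('\n') {
        let mut start = offset;
        if line.starts_with("> ") {
            out.push(Match { tag: "quote_marker", start, end: start + 2 });
            start += 2;
        }
        let end = offset + line.len();
        if start < end {
            out.push(Match { tag: "text", start, end });
        }
        offset = end;
    }
    out
}

/// Rebuilds the input by concatenating the matched slices. Returns `None`
/// if the matches leave a gap, overlap, or do not reach the end of `src`.
fn reconstruct(src: &str, matches: &[Match]) -> Option<String> {
    let mut pos = 0;
    let mut out = String::new();
    for m in matches {
        if m.start != pos || m.end < m.start {
            return None;
        }
        // `get` returns None on out-of-range or non-char-boundary slices.
        out.push_str(src.get(m.start..m.end)?);
        pos = m.end;
    }
    if pos == src.len() { Some(out) } else { None }
}
```

Because the stream is flat, the `> ` markers can be ordinary entries, which sidesteps the nesting problem that a tree with tiling children runs into.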
> At the same time, implementations can provide a lossless "stream of matches" which can be used for all purposes where AST is not enough.
Ha yeh, I was just about to ask why you feel this would not be sufficient
This is called a CST btw. Concrete syntax tree vs Abstract syntax tree
Inspired by https://github.com/jgm/djot/issues/105#issuecomment-1316110154
There's such a thing as a lossless syntax tree. Basically, it's a tree with the constraint that it's possible to exactly recover the input string from it, `print . parse = id`. We might, or might not, want to use it.

Benefits of full-fidelity trees:
`(tag, start offset, end offset, children)` with the invariant that the range of the node is equal to the disjoint union of its children's ranges. There's no actual data stored in the tree besides offsets; it's all recoverable from the original source.

Benefits of AST: