Closed by modulovalue 2 years ago
The event-based parser (which is now used for all parsing operations) has an option to keep track of the source locations of individual events. While this does not track the exact position of each token, it gives enough information for things like error reporting, printing the location of elements, or even some basic syntax highlighting.
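To illustrate what per-event location tracking looks like in general, here is a minimal sketch using Python's standard-library expat bindings. It is an analogy, not the API of the package discussed in this thread, and the helper name `events_with_locations` is made up for the example. Each start or end event is recorded together with the line and column the parser reports for it, which is event-level rather than token-level information.

```python
import xml.parsers.expat


def events_with_locations(text):
    """Collect (kind, name, line, column) for element start/end events.

    Hypothetical helper for illustration only. Lines are 1-based and
    columns are 0-based, as reported by expat.
    """
    parser = xml.parsers.expat.ParserCreate()
    events = []

    def on_start(name, attrs):
        # Inside a handler, the parser position refers to the current event.
        events.append(
            ("start", name, parser.CurrentLineNumber, parser.CurrentColumnNumber)
        )

    def on_end(name):
        events.append(
            ("end", name, parser.CurrentLineNumber, parser.CurrentColumnNumber)
        )

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(text, True)  # True marks this as the final chunk of input
    return events


if __name__ == "__main__":
    for event in events_with_locations("<a>\n  <b/>\n</a>"):
        print(event)
```

Note that nothing here records where an attribute's equals sign or a closing angle bracket sits; that is the gap between event locations and a full concrete syntax tree.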
Speed and memory consumption are a general concern. There are users who want to be able to process GBs of XML data on mobile devices. So yeah, I am very careful about adding features that are not generally useful but that come at a high cost for everybody.
I played around with propagating the location information from events to the DOM nodes, which is relatively easy to do in the current setup. However, I didn't pursue the idea further due to the lack of a strong use case and the question of what would happen to the location data if the DOM were mutated.
Thank you for the response. Given the requirement that GBs of data need to be parsed, forcing all users to parse into a CST sounds like it would definitely introduce an unacceptable amount of overhead.
Hello Lukas,
I was wondering what your opinion is on adding support for parsing XML documents into a concrete syntax tree, i.e., an AST that contains location information such as the source location of equals signs in attributes and other syntactical elements.
Is this perhaps on your TODO list? If not, would you ever accept any PRs for that, even if it made use cases that don't need a CST slower, or should a to-CST-parser exist independently of the to-AST-parser so as not to make the to-AST-parser slower?