Lexical Preserving Unparser

miho / VMF-Text

Powerful Grammar-based Language Modeling Framework

Apache License 2.0


Lexical Preserving Unparser #2

Open miho opened 6 years ago

miho commented 6 years ago

Even though the current unparser performs well and is quite fast, it has some drawbacks. The unparser generated from the grammar is not lexically preserving: whitespace and other skipped characters that are not explicitly part of rules aren't fully preserved. Ideally, the preserving unparser shouldn't add much overhead.
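To illustrate what "preserving" means here, the following is a minimal Java sketch, not VMF-Text's actual implementation. It assumes a toy whitespace-based tokenizer; the class `PreservingRoundTrip`, the `Token` type, and its `hiddenBefore` field are hypothetical names made up for this example. Each token keeps the skipped text that preceded it, so the unparser can replay it at the cost of one extra string per token.

```java
import java.util.ArrayList;
import java.util.List;

public class PreservingRoundTrip {

    // Hypothetical token type: the visible text plus the skipped
    // characters (here: whitespace only) that preceded it in the source.
    static final class Token {
        final String hiddenBefore;
        final String text;

        Token(String hiddenBefore, String text) {
            this.hiddenBefore = hiddenBefore;
            this.text = text;
        }
    }

    // Toy tokenizer: splits on whitespace but keeps the whitespace runs.
    // Trailing whitespace ends up in a final token with empty text.
    static List<Token> tokenize(String src) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            int hiddenStart = i;
            while (i < src.length() && Character.isWhitespace(src.charAt(i))) i++;
            String hidden = src.substring(hiddenStart, i);
            int textStart = i;
            while (i < src.length() && !Character.isWhitespace(src.charAt(i))) i++;
            tokens.add(new Token(hidden, src.substring(textStart, i)));
        }
        return tokens;
    }

    // Preserving unparser: replays the skipped text before each token.
    static String unparsePreserving(List<Token> tokens) {
        StringBuilder sb = new StringBuilder();
        for (Token t : tokens) sb.append(t.hiddenBefore).append(t.text);
        return sb.toString();
    }

    // Non-preserving unparser: discards skipped text and uses single spaces.
    static String unparseNormalizing(List<Token> tokens) {
        StringBuilder sb = new StringBuilder();
        for (Token t : tokens) {
            if (t.text.isEmpty()) continue;
            if (sb.length() > 0) sb.append(' ');
            sb.append(t.text);
        }
        return sb.toString();
    }
}
```

With this sketch, `unparsePreserving(tokenize(src))` reproduces `src` exactly, while `unparseNormalizing` only reproduces the token sequence.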

miho commented 5 years ago

There's still one case that isn't covered yet: unnamed optionals that are independent of labeled elements/rules. We need to remember whether they were parsed; otherwise the unparsed text won't be identical to the source text.
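A small Java sketch of the problem, under stated assumptions: take a made-up ANTLR-style rule such as `decl : 'var' name=ID ';'? ;`, where the trailing `';'?` is an unnamed optional that no labeled element depends on. The class `UnnamedOptionalSketch`, the `Decl` node, and the `semicolonParsed` flag are hypothetical and only illustrate the idea of remembering the parse decision.

```java
public class UnnamedOptionalSketch {

    // Hypothetical model node for: decl : 'var' name=ID ';'? ;
    static final class Decl {
        final String name;
        // Remembers whether the unnamed optional ';' was present in the source.
        final boolean semicolonParsed;

        Decl(String name, boolean semicolonParsed) {
            this.name = name;
            this.semicolonParsed = semicolonParsed;
        }
    }

    // Toy parser for inputs of the form "var <name>" or "var <name>;".
    static Decl parse(String src) {
        boolean semi = src.endsWith(";");
        String body = semi ? src.substring(0, src.length() - 1) : src;
        if (!body.startsWith("var ")) {
            throw new IllegalArgumentException("expected 'var <name>'");
        }
        return new Decl(body.substring(4), semi);
    }

    // Ignores the flag: always emits the optional, so "var x" becomes "var x;".
    static String unparseLossy(Decl d) {
        return "var " + d.name + ";";
    }

    // Uses the flag: emits the optional only if it was actually parsed.
    static String unparsePreserving(Decl d) {
        return "var " + d.name + (d.semicolonParsed ? ";" : "");
    }
}
```

Without the flag, `"var x"` and `"var x;"` produce the same model, so no unparser can reproduce both inputs exactly.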

miho commented 5 years ago

To detect unnamed optionals during unparsing, we need to provide additional rule info to the formatter, i.e., enhance the rule-info object; see #5.
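One possible shape for such an enhanced rule-info object, sketched in Java. This is an assumption about how it could look, not the API planned in #5: `RuleInfo`, `wasOptionalMatched`, the `Formatter` hook, and `unparseDecl` are all hypothetical names, and the rule `decl : 'var' name=ID ';'? ;` is a made-up example.

```java
import java.util.BitSet;

public class RuleInfoSketch {

    // Hypothetical rule-info object: besides the rule name it records
    // which unnamed optionals (by position in the rule) were matched.
    static final class RuleInfo {
        private final String ruleName;
        private final BitSet matchedOptionals = new BitSet();

        private RuleInfo(String ruleName) {
            this.ruleName = ruleName;
        }

        // matched[i] == true means the i-th unnamed optional was parsed.
        static RuleInfo of(String ruleName, boolean... matched) {
            RuleInfo info = new RuleInfo(ruleName);
            for (int i = 0; i < matched.length; i++) {
                info.matchedOptionals.set(i, matched[i]);
            }
            return info;
        }

        String ruleName() {
            return ruleName;
        }

        boolean wasOptionalMatched(int index) {
            return matchedOptionals.get(index);
        }
    }

    // Hypothetical formatter hook, called by the unparser at each unnamed optional.
    interface Formatter {
        void optional(RuleInfo info, int index, String literal, StringBuilder out);
    }

    // Emits the optional literal only if the parser recorded it as matched.
    static final Formatter PRESERVING = (info, index, literal, out) -> {
        if (info.wasOptionalMatched(index)) out.append(literal);
    };

    // Toy unparser for: decl : 'var' name=ID ';'? ;
    static String unparseDecl(String name, RuleInfo info, Formatter formatter) {
        StringBuilder out = new StringBuilder("var ").append(name);
        formatter.optional(info, 0, ";", out);
        return out.toString();
    }
}
```

Storing the decisions in a bit set keeps the per-node overhead small, and a formatter that prefers normalized output could simply ignore the recorded flags.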