Towards a "Token File" abstraction

kjosib / booze-tools

Booze Tools will become the complete programming-language development workbench, all written in Python 3.9 (for now).

MIT License


In days of multi-pass yore, a lexical analysis phase would operate completely independently of the parse and produce a disk-file containing token information (including all diagnostic data for decent error reporting); this file would then feed into a parser to generate an AST file.

Today the intermediate disk-file is neither necessary nor advisable: RAM is spacious. However, the concept points the way to a good style of writing scanner actions: they should enqueue the tokens they find, and the supporting infrastructure should handle everything about location tracking. This facilitates indent-grammars, where a drop in leading whitespace can produce several zero-width outdent tokens at once. But instead of a true queue, the framework might reasonably retain all the token data in an array for later reference by sequence number. That and a dollar will buy you a cup of error reporting.
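
To make the idea concrete, here is a minimal sketch of what such an array-backed token store might look like. It is not existing Booze Tools code: the names `Token`, `TokenFile`, `emit`, `location`, and `lexeme` are all hypothetical, and it assumes tokens are recorded as character offsets into a source string held entirely in memory.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    kind: str   # terminal symbol, e.g. "NAME" or "OUTDENT" (illustrative names)
    start: int  # offset of the first character in the source text
    end: int    # offset just past the last character; equals start if zero-width


class TokenFile:
    """Hypothetical in-memory stand-in for the old on-disk token file."""

    def __init__(self, text: str):
        self.text = text
        self.tokens: list[Token] = []
        # Offsets at which each line begins, so offsets can become line/column.
        self._line_starts = [0] + [i + 1 for i, ch in enumerate(text) if ch == "\n"]

    def emit(self, kind: str, start: int, end: int) -> int:
        """Called by scanner actions; returns the new token's sequence number."""
        self.tokens.append(Token(kind, start, end))
        return len(self.tokens) - 1

    def location(self, seq: int) -> tuple[int, int]:
        """One-based (line, column) of the token with the given sequence number."""
        start = self.tokens[seq].start
        line = bisect_right(self._line_starts, start)
        return line, start - self._line_starts[line - 1] + 1

    def lexeme(self, seq: int) -> str:
        """Source text the token covers; empty for zero-width tokens."""
        token = self.tokens[seq]
        return self.text[token.start:token.end]


# A scanner action only reports what it found and where.
source = "if x:\n    y\nz\n"
token_file = TokenFile(source)
y_seq = token_file.emit("NAME", 10, 11)        # the "y" on line 2
outdent_seq = token_file.emit("OUTDENT", 12, 12)  # zero-width, before "z"
z_seq = token_file.emit("NAME", 12, 13)        # the "z" on line 3
```

With this shape, a parser or error reporter needs to carry only a sequence number: `token_file.location(outdent_seq)` recovers line 3, column 1 for the zero-width outdent, and `token_file.lexeme(z_seq)` recovers the text `"z"`, without any scanner action doing location arithmetic itself.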
