Closed: aldrichtr closed this issue 2 years ago
See my notes in #12 and #14.
This issue is still valid as a feature, because ConvertFrom-OrgMode should collect the content for processing by other parsers, but mainly the objects are created as the lines are parsed.
Input sections
ConvertFrom-OrgMode currently sees input as a stream of "lines" of text. Each line is evaluated against a regex and acted on "in place". To better parse the input, the function should act more like a lexer: input should be separated into 'tokens' that are then processed (parsed) individually.
Lexer functionality
A lexer, by definition, takes input and creates tokens. A token is a chunk of text from the original input plus a "tag" that identifies the token's type. For orgmode text, the type will be an org class (element, object, etc.) such as 'headline' ...
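To make the token idea concrete, here is a minimal, language-agnostic sketch in Python. It is not the ConvertFrom-OrgMode implementation. The `Token` type, the `tokenize` function, and every token type other than 'headline' are hypothetical names chosen for illustration, and the regexes cover only a small slice of Org syntax.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    type: str  # the "tag" identifying the kind of token, e.g. 'headline'
    text: str  # the chunk of original input this token covers


# Hypothetical rule table. Only 'headline' is named in the issue;
# 'keyword', 'blank' and 'text' are assumptions made for this sketch.
RULES = [
    ("headline", re.compile(r"^\*+ ")),    # one or more stars, then a space
    ("keyword", re.compile(r"^#\+\w+:")),  # e.g. "#+TITLE: ..."
    ("blank", re.compile(r"^\s*$")),       # empty or whitespace-only line
]


def tokenize(text: str) -> Iterator[Token]:
    """Yield one Token per input line, tagged by the first matching rule."""
    for line in text.splitlines():
        for type_, pattern in RULES:
            if pattern.match(line):
                yield Token(type_, line)
                break
        else:
            # No rule matched: fall back to a generic 'text' token.
            yield Token("text", line)
```

A caller would then hand each token to a type-specific parser instead of acting on the raw line, for example `for tok in tokenize(buffer): ...` with a dispatch on `tok.type`.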
Orgmode buffer tokens
ConvertFrom-OrgMode should tokenize the input by: