Closed twilco closed 5 years ago
atomicity would pervade to any rule that uses these rules
I think non-atomic rules should help with that. If I'm counting correctly, there are only four rules in the beancount parser that have explicit whitespace (the `INDENT` token): `empty_line`, `posting`, `key_value_line` and `posting_or_kv_list`. I think the last three can be combined into one. For instance, the `posting_or_kv_list` and `posting` rules could look something like this (simplified):
posting_or_kv_list = @{
(indent ~ (key_value_line | posting | tag_links))*
}
posting = !{ txn_flag? ~ account ~ incomplete_amount ~ cost_spec? ~ ... ~ eol }
That is, lines that must begin with indentation are matched by an atomic rule with a non-atomic inner rule, instead of making e.g. `posting` itself non-atomic.
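To make the pattern concrete, here is a minimal, self-contained sketch of an atomic outer rule wrapping non-atomic inner rules. Only `posting_or_kv_list`, `posting` and `key_value_line` are names from the comment above. All other rules (`indent`, `eol`, `account`, `number`, `currency`, `key`, `value`) are hypothetical stand-ins that are much simpler than whatever the real beancount grammar needs, and I have not run this against the actual parser.

```pest
// Sketch only: the token rules below are hypothetical simplifications,
// not the real beancount grammar.

// Implicit whitespace: pest skips this between `~` operands (and between
// repetitions) in non-atomic rules.
WHITESPACE = _{ " " | "\t" }

indent   = @{ (" " | "\t")+ }
eol      = _{ NEWLINE }
account  = @{ ASCII_ALPHA_UPPER ~ ASCII_ALPHANUMERIC* ~ (":" ~ ASCII_ALPHANUMERIC+)+ }
number   = @{ "-"? ~ ASCII_DIGIT+ ~ ("." ~ ASCII_DIGIT+)? }
currency = @{ ASCII_ALPHA_UPPER+ }
key      = @{ ASCII_ALPHA_LOWER ~ (ASCII_ALPHANUMERIC | "-" | "_")* }
value    = @{ (!NEWLINE ~ ANY)+ }

// Atomic outer rule: the leading indentation is explicit and required,
// and no implicit whitespace is inserted at this level.
posting_or_kv_list = @{ (indent ~ (key_value_line | posting))* }

// Non-atomic inner rules: `!` stops the atomic cascade, so implicit
// whitespace applies again between the tokens of a single line.
posting        = !{ account ~ number ~ currency ~ eol }
key_value_line = !{ key ~ ":" ~ value ~ eol }
```

Under these assumptions, an indented line such as `Assets:Cash 10 USD` and the same line written with `10USD` should both go through `posting`, because the whitespace between `number` and `currency` is optional there. One thing worth checking: pest documents `@` rules as suppressing token pairs for the rules they call, while `$` (compound-atomic) keeps them, so `${ ... }` may turn out to be the better fit for the outer rule if the inner pairs are needed.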
In #5 we discovered that we are not properly parsing tokens with no space between them. For example, this works just fine in `bean-check`:

This results in an error in our parser. Let's fix that.
I looked into using Pest's implicit whitespace to solve this problem, but since so many of our rules are whitespace-sensitive (required indentation in postings, key-value lists, etc.), atomicity would pervade to any rule that uses these rules. Making a rule atomic means we have to manually specify the whitespace, nullifying the benefit we get from implicit whitespace.
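As a rough illustration of that cascade, consider the toy grammar below. It is hypothetical and not taken from our parser; the rule names `number`, `currency`, `amount`, `indented_amount` and `indented_amount_manual` are made up for the example.

```pest
// Hypothetical toy grammar, not part of the beancount parser.
WHITESPACE = _{ " " | "\t" }

number   = @{ ASCII_DIGIT+ }
currency = @{ ASCII_ALPHA_UPPER+ }

// Normal rule: `~` allows optional implicit whitespace, so on its own
// this matches "10USD", "10 USD" and "10   USD".
amount = { number ~ currency }

// Atomic rule: implicit whitespace is switched off here and in every
// rule it calls, so `amount` only matches "10USD" when reached this way.
indented_amount = @{ " "+ ~ amount }

// Staying atomic while still accepting spaces means writing them out.
indented_amount_manual = @{ " "+ ~ number ~ (" " | "\t")* ~ currency }
```

The last rule is essentially our existing manual whitespacing scheme: it works, but every sequence inside the atomic rule has to repeat the whitespace by hand.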
This is my first foray into Pest, so maybe I'm missing something here. Explore implicit whitespacing as a solution to this problem, and otherwise use our existing manual whitespacing scheme to support tokens that have no spaces between them.