Instead of looking ahead a fixed number of lines, the lexer should classify every line into exactly one of the following categories:
Question text
Editor's Notes
Question start (\d+\.?\s)
Answer start (ANS(WER)?(:|\.))
Part start ((\[|\()\d+(\]|\)))
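A minimal sketch of such a line classifier, in Python. It assumes the patterns above are meant to match a literal dot and literal brackets, and that they are anchored at the start of the stripped line. The names LineKind, classify and lex are hypothetical, and so is the Editor's Notes pattern, since no pattern is given for that category.

```python
import re
from enum import Enum, auto


class LineKind(Enum):
    QUESTION_START = auto()
    ANSWER_START = auto()
    PART_START = auto()
    EDITORS_NOTE = auto()
    QUESTION_TEXT = auto()


# Patterns are tried in order; the first match wins.
PATTERNS = [
    (LineKind.QUESTION_START, re.compile(r"\d+\.?\s")),
    (LineKind.ANSWER_START, re.compile(r"ANS(WER)?(:|\.)")),
    (LineKind.PART_START, re.compile(r"(\[|\()\d+(\]|\))")),
    # Assumption: the note gives no pattern for editor's notes, so this
    # one is purely illustrative.
    (LineKind.EDITORS_NOTE, re.compile(r"editor'?s? note", re.IGNORECASE)),
]


def classify(line):
    """Return the category of a single line; question text is the fallback."""
    stripped = line.strip()
    for kind, pattern in PATTERNS:
        # re.match anchors at the start of the string only.
        if pattern.match(stripped):
            return kind
    return LineKind.QUESTION_TEXT


def lex(text):
    """Classify every non-blank line, keeping the stripped line text."""
    return [
        (classify(line), line.strip())
        for line in text.splitlines()
        if line.strip()
    ]
```

Note that a line of question text that happens to begin with a number (for example "1984 is a novel") is classified as a Question start here. The lexer cannot resolve that on its own; it is the kind of case the parser's grammar has to catch.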
The parser can then use a real grammar to decide which tokens it expects next. Lines of question text can always be followed by more question text. If we find that, for example, a Question start plus Question text is not followed by an Answer start, we can reject that question and continue with the next one.
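One way to sketch the "reject and continue" idea is a small table of allowed successors per line category, checked one question at a time. The grammar below is an assumption made for illustration, not a specification from this note: the table ALLOWED_NEXT, the string category names, and the functions split_questions, is_valid and parse are all hypothetical. In particular, it assumes every question must end on an Answer start or an Editor's Note, and that answers do not continue onto further lines.

```python
# Hypothetical grammar: for each line category, the categories that may
# follow it. "START" and "END" are sentinels, not real line categories.
ALLOWED_NEXT = {
    "START": {"QUESTION_START"},
    "QUESTION_START": {"QUESTION_TEXT", "ANSWER_START", "PART_START"},
    "QUESTION_TEXT": {"QUESTION_TEXT", "ANSWER_START", "PART_START"},
    "PART_START": {"QUESTION_TEXT", "ANSWER_START"},
    "ANSWER_START": {"PART_START", "EDITORS_NOTE", "END"},
    "EDITORS_NOTE": {"PART_START", "END"},
}


def split_questions(kinds):
    """Group categories into runs, starting a new run at each QUESTION_START."""
    groups, current = [], []
    for kind in kinds:
        if kind == "QUESTION_START" and current:
            groups.append(current)
            current = []
        current.append(kind)
    if current:
        groups.append(current)
    return groups


def is_valid(group):
    """Check every transition in one question against ALLOWED_NEXT."""
    previous = "START"
    for kind in group + ["END"]:
        if kind not in ALLOWED_NEXT.get(previous, set()):
            return False
        previous = kind
    return True


def parse(kinds):
    """Return (accepted, rejected) questions; a bad one never stops the run."""
    accepted, rejected = [], []
    for group in split_questions(kinds):
        (accepted if is_valid(group) else rejected).append(group)
    return accepted, rejected
```

For example, a Question start plus Question text that runs straight into the next Question start is rejected, while the questions around it are still accepted:

```python
kinds = [
    "QUESTION_START", "QUESTION_TEXT", "ANSWER_START",
    "QUESTION_START", "QUESTION_TEXT",  # no answer: rejected
    "QUESTION_START", "QUESTION_TEXT", "PART_START", "QUESTION_TEXT", "ANSWER_START",
]
accepted, rejected = parse(kinds)
```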