picoe / Eto.Parse

Recursive descent LL(k) parser for .NET with Fluent API, BNF, EBNF and Gold Grammars
MIT License

Parsing in streaming-like way #17

Closed tpluscode closed 9 years ago

tpluscode commented 9 years ago

Hi. As my idea for solving #16, I thought I could listen to the Matched event of the line break terminal and keep track of the current line and the position of the last line end in the stream. Unfortunately, I learned that events are fired only after the whole text has been parsed. Is there an easy way to monitor the parsing process as it goes?

I was thinking about this also because that way it wouldn't be necessary to wait until all the text has been parsed before handling the matches. Or is that not such a good idea?

cwensley commented 9 years ago

To be involved in the parsing cycle, you would derive from the Parser (or a subclass) and override the InnerParse() method. It is done this way since adding an event for the parser at this stage would considerably degrade performance.

However, the reason you typically wouldn't want to handle matches while parsing/tokenizing is that a group can still fail at its end, after some of its children have already matched. For example, to do as you suggest, you could implement it this way:

using System;
using Eto.Parse;
using Eto.Parse.Parsers;

namespace TestEolParsing
{
    static class Program
    {
        class MyEolTerminal : EolTerminal
        {
            public int LineCount {get;set;}
            protected override int InnerParse(ParseArgs args)
            {
                var ret = base.InnerParse(args);
                if (ret > 0)
                {
                    // found a newline! (counted even if the enclosing group fails later)
                    LineCount++;
                }
                return ret;
            }
        }

        public static void Main(string[] args)
        {
            var eol = new MyEolTerminal();
            var grammar = new Grammar(+("some text" & ((eol & "second text") | (eol & "third text"))));

            var match = grammar.Match(@"some text
second textsome text
third textsome text
text that doesn't match");

            Console.WriteLine("Line Count: {0}, Result: {1}", eol.LineCount, match.ErrorMessage);

            return;
        }
    }
}

The problem is that this reports a line count of 5, whereas the error is actually on line 3. This is because the eol is successfully parsed multiple times inside groups that then fail (once in the "second text" group and twice on the last line), so those matches shouldn't be counted.

Therefore, it is still best to handle the matches after the parsing stage to ensure you are only counting valid matches.
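
For the line-tracking use case, one option is to compute the line from a character offset once parsing has finished. The following is only a minimal sketch: LineInfo and FromIndex are hypothetical names, not part of Eto.Parse, and it assumes you can obtain a 0-based character offset for the match or error you care about.

```csharp
using System;

// Hypothetical helper, not part of Eto.Parse.
static class LineInfo
{
    // Returns the 1-based line and column of a 0-based character index.
    // Only '\n' is treated as a line terminator, so for "\r\n" input the
    // '\r' is counted as the last column of its line.
    public static (int Line, int Column) FromIndex(string text, int index)
    {
        if (text == null)
            throw new ArgumentNullException(nameof(text));
        if (index < 0 || index > text.Length)
            throw new ArgumentOutOfRangeException(nameof(index));

        int line = 1;
        int lineStart = 0;

        // Count only the newlines that occur before the index.
        for (int i = 0; i < index; i++)
        {
            if (text[i] == '\n')
            {
                line++;
                lineStart = i + 1;
            }
        }

        return (line, index - lineStart + 1);
    }
}
```

Assuming the result of grammar.Match() exposes the failure offset (for instance through an ErrorIndex property; check this against the version you use), you could call LineInfo.FromIndex(input, match.ErrorIndex) after a failed parse. Because the offset comes from the final result, eol matches that were discarded during backtracking don't affect the count.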