benjamin-hodgson / Pidgin

A lightweight and fast parsing library for C#.
https://www.benjamin.pizza/Pidgin/
MIT License

Parsing scopes #130

Closed: Azurelol closed this issue 2 years ago

Azurelol commented 2 years ago

Hello. How could I write a parser that parses the text within one set of scopes?

    [TestCase("{ foobar }", " foobar ")]
    [TestCase("{}", "")]
    [TestCase("{{}}", "{}")]
    [TestCase("{{{}}}", "{{}}")]
    public void ParsesTextInScope(string text, string expected)
    {
        var parse = GroovyParser.ScopeText.Parse(text);
        AssertParse(parse);
        Assert.AreEqual(expected, parse.Value);
    }

I have been trying variations of:

    public static readonly Parser<char, string> ScopeText =
        from begin in LBrace
        from text in Any.Until(Lookahead(RBrace))
        from end in RBrace
        select string.Concat(text);

What I want to do is parse all the text within a set of braces, including any nested braces. So I have been trying to write a parser that consumes all text except the last right brace, but have not managed to get it working.

Azurelol commented 2 years ago

After some work, I was able to solve the initial test cases with this:

    public static readonly Parser<char, string> ScopeText =
        from begin in Whitespaces.Before(LBrace)
        from before in Try(AnyCharExcept('{', '}').ManyString())
        from scopes in Try(ScopeText).Separated(Whitespaces)
        from after in Try(Any.Until(Lookahead(RBrace)))
        from end in RBrace.Before(Whitespaces)
        select $"{before}{scopes.Select(s => ($"{{{s}}}")).Join("")}{string.Concat(after)}";

However, it fails for these two:

    [TestCase("{ bar { } }", " bar { } ")]
    [TestCase("{ foo { bar } }", " foo { bar } ")]

With this:


    Expected string length 9 but was 8. Strings differ at index 8.
    Expected: " bar { } "
    But was:  " bar { }"

benjamin-hodgson commented 2 years ago

Try putting the Whitespaces parsers outside the recursive parser.
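
One way to read this suggestion, as a minimal sketch and not necessarily what the maintainer had in mind: keep the recursive parser responsible only for balanced braces and their raw contents, and skip surrounding whitespace once, around the outermost scope. The names `ScopeSketch` and `Content` are hypothetical, and the sketch assumes `Rec`, `OneOf`, `AnyCharExcept`, `Between`, `Many`, and `SkipWhitespaces` behave as in recent Pidgin releases.

```csharp
using System;
using Pidgin;
using static Pidgin.Parser;

// Hypothetical class name; not part of Pidgin or of the original GroovyParser.
public static class ScopeSketch
{
    private static readonly Parser<char, char> LBrace = Char('{');
    private static readonly Parser<char, char> RBrace = Char('}');

    // Everything between one pair of braces: a sequence of pieces, where each
    // piece is either a run of non-brace characters (whitespace included) or a
    // nested scope, which is re-wrapped in braces so the text is preserved.
    // No whitespace is skipped in here, so inner spacing survives untouched.
    private static readonly Parser<char, string> Content =
        Rec(() =>
            OneOf(
                AnyCharExcept('{', '}').AtLeastOnceString(),
                Content.Between(LBrace, RBrace).Select(inner => "{" + inner + "}")
            )
            .Many()
            .Select(pieces => string.Concat(pieces)));

    // Whitespace is skipped only here, outside the recursive parser.
    public static readonly Parser<char, string> ScopeText =
        SkipWhitespaces
            .Then(Content.Between(LBrace, RBrace))
            .Before(SkipWhitespaces);
}
```

With this shape, `ScopeSketch.ScopeText.ParseOrThrow("{ bar { } }")` should yield `" bar { } "`, because the trailing space before the final brace is consumed by `Content` as ordinary text and not by a whitespace-skipping parser attached to the nested scope.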