itod / pegkit

'Parsing Expression Grammar' toolkit for Cocoa/Objective-C
MIT License

'{ MATCHES(@"\\n", LS(1)) }? S' will generate '[self matchS:NO];' instead of '[self matchWhitespace:NO];' #14

joemcbride closed this issue 10 years ago

joemcbride commented 10 years ago

This sample grammar demonstrates the issue where ParseGenApp creates invalid code. My eol rule is an attempt to get PEGKit to match newline characters. Maybe there's an easier way?

  program
  @before {
    PKTokenizer *t = self.tokenizer;

    // whitespace
    self.silentlyConsumesWhitespace = NO;
    t.whitespaceState.reportsWhitespaceTokens = YES;
  //  self.assembly.preservesWhitespaceTokens = YES;

    [t.symbolState add:@"\\n"];

    // setup comments
    t.commentState.reportsCommentTokens = YES;
    [t.commentState addSingleLineStartMarker:@"//"];
    [t.commentState addMultiLineStartMarker:@"/*" endMarker:@"*/"];
  }
    = eol;

  eol
    = { MATCHES(@"\\n", LS(1)) }? S
    ;

itod commented 10 years ago

Thx!

Fixed in c5ff94b7d194bd90636fd525744025a5067447e4

Here's my test case grammar, it should be instructive:

@before {
    PKTokenizer *t = self.tokenizer;

    // whitespace
    self.silentlyConsumesWhitespace = NO;
    t.whitespaceState.reportsWhitespaceTokens = YES;

    // NOTE: matched `S` (i.e. whitespace) tokens will never be preserved by this parser's assembly, unless you turn on `preservesWhitespaceTokens` below.
    // So by default, it is as if all `S` references were actually defined as `S!`. Not sure I still like this default, but that's how it is for now.
    //self.assembly.preservesWhitespaceTokens = YES;
}

lines = line+;
line  = ~eol* eol; // Note the unary `~` (Not) operator: this rule means "zero or more NON-eol tokens, followed by a single eol token"
eol   = { EQ(@"\\n", LS(1)) }? S;
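
To make the shape of `lines = line+; line = ~eol* eol;` concrete, here is a minimal sketch in plain C (which also compiles as Objective-C). It is not PEGKit's generated code. The names `is_eol` and `count_lines` are hypothetical, and it assumes tokens arrive as C strings with whitespace tokens reported, as configured in the `@before` block above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the `eol` rule's predicate: true when the
   token's string value is exactly one newline character. */
static int is_eol(const char *tok) {
    return strcmp(tok, "\n") == 0;
}

/* Models `lines = line+; line = ~eol* eol;` over an array of tokens.
   Returns the number of lines matched, or -1 when the input does not
   match (no tokens at all, or trailing tokens with no closing eol). */
static int count_lines(const char *const *toks, size_t n) {
    size_t i = 0;
    int lines = 0;

    assert(toks != NULL || n == 0);

    while (i < n) {
        /* ~eol* : consume zero or more tokens that are NOT eol */
        while (i < n && !is_eol(toks[i])) {
            i++;
        }
        /* eol : exactly one eol token must close the line */
        if (i == n) {
            return -1;
        }
        i++;
        lines++;
    }
    /* line+ : at least one complete line is required */
    return lines > 0 ? lines : -1;
}
```

Calling `count_lines` on the tokens `"a"`, `" "`, `"b"`, `"\n"`, `"\n"`, `"c"`, `"\n"` counts three lines, because a bare newline is a valid `line` with zero non-eol tokens. Input that ends without a newline is rejected, since every `line` must close with an `eol`.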

joemcbride commented 10 years ago

Thank you!