dlang-community / Pegged

A Parsing Expression Grammar (PEG) module, using the D programming language.

The example from the README doesn't work #54

Closed Valloric closed 12 years ago

Valloric commented 12 years ago

The example from the README file does not work.

#!/usr/bin/env rdmd

import std.stdio;
import pegged.grammar;

mixin( grammar( `
Arithmetic:
    Expr     <  Factor AddExpr*
    AddExpr  <  ^('+'/'-') Factor
    Factor   <  Primary MulExpr*
    MulExpr  <  ^('*'/'/') Primary
    Primary  <  '(' Expr ')' / Number / Variable / ^'-' Primary

    Number   <~ [0-9]+
    Variable <- identifier
` ) );

void main() {
  auto tree = Arithmetic(" 0 + 123 - 456 ");
  writeln( tree ); 
  writeln( tree.matches ); // prints ["0"]
  writeln( tree.matches == ["0", "+", "123", "-", "456"] ); // prints false
}

I recommend adding more unit tests to the project so this kind of breakage doesn't happen in the future. At the very least, the examples from the docs should always work. Seeing the main example break doesn't instill a lot of confidence in the project.

This is with dmd v2.060 on Mac OS 10.8.

PhilippeSigaud commented 12 years ago

On Sun, Sep 30, 2012 at 12:59 AM, Val Markovic notifications@github.com wrote:

I recommend adding more unit tests to the project so this kind of breakage doesn't happen in the future. At the very least, the examples from the docs should always work.

Aww shucks. The grammars in the pegged/examples/ directory, including arithmetic.d, do have unit tests. For the docs, I think a branch merge replaced the example with a buggy version. I'm not sure how to keep a Markdown document (as required by GitHub) and the D code in sync.

Anyway, I corrected the main docs (just pushed it), and the wiki. Thanks for the heads-up!

Valloric commented 12 years ago

From what I can tell, you fixed the PEGGED.md file, but not the README.md file that is in the top-level directory. :)

I'd recommend having all the example code in the docs be self-contained programs that a script can then extract and run as part of your testing step. Possibly even better: have a directory with all the example code as separate programs and then the docs have placeholders for the example code; during a "doc compilation" step, the final, processed docs have the example code (or only the interesting parts of them) pasted-in. Naturally, the compilation step also compiles and tests the examples.

This way the docs are never out of sync and the examples are always correct. I believe Andrei Alexandrescu created a similar scheme for the TDPL book.
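The first idea above (a script that pulls self-contained programs out of the docs) could look roughly like the following. This is only a sketch under assumptions: it assumes the examples sit in Markdown fences tagged `d`, and the function name `extractExamples` is made up for illustration. A real test step would still have to write each extracted program to a file and compile it, for example with rdmd.

```d
import std.array : join, replicate;
import std.string : lineSplitter, strip;

/// Hypothetical helper: returns the body of every fenced block tagged `d`.
string[] extractExamples(string markdown)
{
    // The fence marker is built at run time so this listing
    // never contains a literal fence itself.
    string fence = "`".replicate(3);

    string[] examples;
    string[] current;
    bool inside = false;

    foreach (line; markdown.lineSplitter)
    {
        string trimmed = line.strip;
        if (!inside && trimmed == fence ~ "d")
        {
            // Opening fence of a D example: start collecting.
            inside = true;
            current = null;
        }
        else if (inside && trimmed == fence)
        {
            // Closing fence: store the collected lines as one example.
            inside = false;
            examples ~= current.join("\n");
        }
        else if (inside)
        {
            current ~= line;
        }
    }
    return examples;
}
```

Blocks tagged with another language, and prose between blocks, are skipped.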

Valloric commented 12 years ago

Just tried out the new grammar:

#!/usr/bin/env rdmd

import std.stdio;
import pegged.grammar;

mixin( grammar( `
Arithmetic:
    Term     < Factor (Add / Sub)*
    Add      < "+" Factor
    Sub      < "-" Factor
    Factor   < Primary (Mul / Div)*
    Mul      < "*" Primary
    Div      < "/" Primary
    Primary  < Parens / Neg / Number / Variable
    Parens   < "(" Term ")"
    Neg      < "-" Primary
    Number   < ~([0-9]+)
    Variable <- identifier
` ) );

void main() {
  auto tree = Arithmetic(" 0 + 123 - 456 ");
  writeln( tree.matches ); // success

  tree = Arithmetic(" -0 + 123 - 456 ");
  writeln( tree.matches ); // success

  tree = Arithmetic(" +0 + 123 - 456 ");
  writeln( tree.matches ); // fail
}

It fails for a leading plus sign, which is perfectly valid arithmetic. Here's a bugfix:

mixin( grammar( `
Arithmetic:
    Term     < Factor (Add / Sub)*
    Add      < "+" Factor
    Sub      < "-" Factor
    Factor   < Primary (Mul / Div)*
    Mul      < "*" Primary
    Div      < "/" Primary
    Primary  < Parens / Neg / Pos / Number / Variable
    Parens   < "(" Term ")"
    Neg      < "-" Primary
    # New rule
    Pos      < "+" Primary
    Number   < ~([0-9]+)
    Variable <- identifier
` ) );

PhilippeSigaud commented 12 years ago

On Sun, Sep 30, 2012 at 7:06 PM, Val Markovic notifications@github.com wrote:

From what I can tell, you fixed the PEGGED.md file, but not the README.md file that is in the top-level directory. :)

That's what happens when I try to code after a family lunch :)

I'd recommend having all the example code in the docs be self-contained programs that a script can then extract and run as part of your testing step. Possibly even better: have a directory with all the example code as separate programs and then the docs have placeholders for the example code; during a "doc compilation" step, the final, processed docs have the example code (or only the interesting parts of them) pasted-in. Naturally, the compilation step also compiles and tests the examples.

This way the docs are never out of sync and the examples are always correct. I believe Andrei Alexandrescu created a similar scheme for the TDPL book.

That's what I did for a template tutorial for the D programming language (https://github.com/PhilippeSigaud/D-templates-tutorial). Every example in the text is a self-contained module (possibly importing other modules), all extracted by a D script and tested for compilation. So yes, I know how to do that. It's just that it's a bit heavy for the reader, what with all the module-level expressions. Andrei hid a bit of scaffolding by using invisible LaTeX code.

It's a possibility for Pegged, though.

Oh right, I'll put an issue on this.

PhilippeSigaud commented 12 years ago

It fails for a leading plus sign, which is perfectly valid arithmetic. Here's a bugfix:

Thanks, I'll use that. The next step will be to code power expressions with a correct right-association :-)
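To make "correct right-association" concrete: 2^3^2 should be read as 2^(3^2) = 512, not (2^3)^2 = 64. In PEG terms the usual shape is a rule that recurses on its right-hand side, something like Power <- Number ('^' Power)?. The sketch below is not Pegged code. It is a small hand-written recursive-descent evaluator, with the hypothetical name `PowerParser`, that handles only digits and '^' and mirrors that rule shape.

```d
import std.ascii : isDigit;
import std.math : pow;

/// Hypothetical, minimal evaluator for inputs such as "2^3^2".
struct PowerParser
{
    string input;
    size_t pos;

    // Number <- [0-9]+
    long parseNumber()
    {
        long value = 0;
        while (pos < input.length && isDigit(input[pos]))
        {
            value = value * 10 + (input[pos] - '0');
            ++pos;
        }
        return value;
    }

    // Power <- Number ('^' Power)?
    long parsePower()
    {
        long base = parseNumber();
        if (pos < input.length && input[pos] == '^')
        {
            ++pos;
            // Recursing here, on the right operand, is what makes
            // the operator right-associative.
            long exponent = parsePower();
            return pow(base, exponent);
        }
        return base;
    }
}
```

With `auto p = PowerParser("2^3^2");`, calling `p.parsePower()` evaluates the exponent chain from the right and returns 512.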

Valloric commented 12 years ago

That's what I did for a template tutorial for the D programming language (https://github.com/PhilippeSigaud/D-templates-tutorial). Every example in the text is a self-contained module (possibly importing other modules), all extracted by a D script and tested for compilation. So yes, I know how to do that. It's just that it's a bit heavy for the reader, what with all the module-level expressions. Andrei hid a bit of scaffolding by using invisible LaTeX code.

That's why I recommend going the placeholders-for-examples route. Have the examples as full separate programs, with the interesting bits that you'd want to present to the reader marked with, say, special comments. You have the boilerplate, then // EXAMPLE START, then the relevant part of the code, then // EXAMPLE END, then more boilerplate and unit tests. Then only the code between the magic comments gets pasted into the final docs, the boilerplate does not.

Another benefit of this approach is that anyone can just go into the examples dir, play around with the actual running code from the docs, and see all the boilerplate too (which is a benefit when you're trying to build a working program from the code in the docs).
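A minimal sketch of the marker idea, assuming the markers are exactly the comments `// EXAMPLE START` and `// EXAMPLE END`, each on its own line. The function name `extractSnippet` is hypothetical, and only the first marked region of a file is returned.

```d
import std.array : join;
import std.string : lineSplitter, strip;

/// Hypothetical helper: returns the lines between the two marker comments.
/// Yields an empty string when no start marker is found.
string extractSnippet(string source)
{
    string[] kept;
    bool inside = false;

    foreach (line; source.lineSplitter)
    {
        string trimmed = line.strip;
        if (trimmed == "// EXAMPLE START")
            inside = true;          // begin copying after this line
        else if (trimmed == "// EXAMPLE END")
            break;                  // stop at the end marker
        else if (inside)
            kept ~= line;           // keep original indentation
    }
    return kept.join("\n");
}
```

Everything outside the markers (imports, `main`, unit tests) stays in the example file, where it still gets compiled and tested, but never reaches the docs.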

PhilippeSigaud commented 12 years ago

That's why I recommend going the placeholders-for-examples route. Have the examples as full separate programs, with the interesting bits that you'd want to present to the reader marked with, say, special comments. You have the boilerplate, then // EXAMPLE START, then the relevant part of the code, then // EXAMPLE END, then more boilerplate and unit tests. Then only the code between the magic comments gets pasted into the final docs, the boilerplate does not.

I see, interesting. I think I'll do that.

That also means putting 'raw' docs somewhere, and 'filled' docs elsewhere, because I'd like to give Markdown files to users, in case they want to generate HTML files and such.
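The 'raw' to 'filled' step could be sketched as below. The placeholder syntax `{{example:NAME}}` and the function name `fillDocs` are assumptions made up for illustration; nothing in Pegged defines them. A placeholder must sit alone on its line, and unknown names are left untouched so that a missing example is easy to spot in the output.

```d
import std.algorithm.searching : endsWith, startsWith;
import std.array : join;
import std.string : lineSplitter, strip;

/// Hypothetical helper: replaces each placeholder line in `rawDoc`
/// with the snippet registered under the same name.
string fillDocs(string rawDoc, string[string] snippets)
{
    enum open = "{{example:";
    enum close = "}}";

    string[] output;
    foreach (line; rawDoc.lineSplitter)
    {
        string trimmed = line.strip;
        if (trimmed.startsWith(open) && trimmed.endsWith(close))
        {
            string name = trimmed[open.length .. $ - close.length];
            if (auto snippet = name in snippets)
            {
                output ~= *snippet;   // paste the example text
                continue;
            }
        }
        output ~= line;               // ordinary line, or unknown placeholder
    }
    return output.join("\n");
}
```

The raw Markdown stays the file people edit, and the filled Markdown is generated output that can still be handed to users or converted to HTML.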

Valloric commented 12 years ago

Yes, it would be ideal if you could have both 'raw' markdown files along with the processed ones in the repo.
