pest-parser / ast

csv example fails #33

Open dave-doty opened 1 month ago

dave-doty commented 1 month ago

I tried the csv example in the examples directory. The only thing I changed was to inline the grammar and the CSV file content directly into the csv.rs file; otherwise the code below is identical to what's in the examples directory:

#[macro_use]
extern crate pest_derive;
extern crate from_pest;
#[macro_use]
extern crate pest_ast;
extern crate pest;

mod csv {
    #[derive(Parser)]
    #[grammar_inline = r#"
field = { (ASCII_DIGIT | "." | "-")+ }
record = { field ~ ("," ~ field)* }
file = { SOI ~ (record ~ ("\r\n" | "\n"))* ~ EOI }"#]
    pub struct Parser;
}

mod ast {
    use super::csv::Rule;
    use pest::Span;

    fn span_into_str(span: Span) -> &str {
        span.as_str()
    }

    #[derive(Debug, FromPest)]
    #[pest_ast(rule(Rule::field))]
    pub struct Field {
        #[pest_ast(outer(with(span_into_str), with(str::parse), with(Result::unwrap)))]
        pub value: f64,
    }

    #[derive(Debug, FromPest)]
    #[pest_ast(rule(Rule::record))]
    pub struct Record {
        pub fields: Vec<Field>,
    }

    #[derive(Debug, FromPest)]
    #[pest_ast(rule(Rule::file))]
    pub struct File {
        pub records: Vec<Record>,
        _eoi: Eoi,
    }

    #[derive(Debug, FromPest)]
    #[pest_ast(rule(Rule::EOI))]
    struct Eoi;
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    use crate::ast::File;
    use from_pest::FromPest;
    use pest::Parser;

    let source = "\
65279,1179403647,1463895090
3.1415927,2.7182817,1.618034
-40,-273.15
13,42
65537";
    let mut parse_tree = csv::Parser::parse(csv::Rule::file, &source)?;
    println!("parse tree = {:#?}", parse_tree);
    let syntax_tree: File = File::from_pest(&mut parse_tree).expect("infallible");
    println!("syntax tree = {:#?}", syntax_tree);
    println!();

    let mut field_sum = 0.0;
    let mut record_count = 0;

    for record in syntax_tree.records {
        record_count += 1;
        for field in record.fields {
            field_sum += field.value;
        }
    }

    println!("Sum of fields: {}", field_sum);
    println!("Number of records: {}", record_count);

    Ok(())
}

Running it results in this error:

Error: Error { variant: ParsingError { positives: [EOI], negatives: [] }, location: Pos(75), line_col: Pos((5, 1)), path: None, line: "65537", continued_line: None }

dave-doty commented 1 month ago

I tried reverting to Pest 2.5 as in this project's Cargo.toml, but that has the same error. I'm curious if anyone has actually gotten this example to run successfully.

dave-doty commented 1 month ago

I got it to work by adding an extra newline after the final CSV record:

let source = "\
65279,1179403647,1463895090
3.1415927,2.7182817,1.618034
-40,-273.15
13,42
65537
";

But there is no terminating newline in the example CSV file, so it seems like either that file should be fixed or the grammar should be changed. Requiring a newline at the end of the file also just seems like an error in the parser.
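
The reported position can be reproduced without pest. What follows is a minimal, standard-library-only sketch that mimics how the `file` rule consumes its input. It is not pest's implementation, and the function names (`match_field`, `match_record`, `match_newline`, `match_file_strict`, `match_file_relaxed`) are hypothetical. The relaxed variant is one possible fix, assumed here for illustration and not necessarily what the project would adopt: newlines separate records and a final newline is optional.

```rust
/// Mimics `field = { (ASCII_DIGIT | "." | "-")+ }`; returns the end offset on success.
fn match_field(input: &[u8], pos: usize) -> Option<usize> {
    let mut end = pos;
    while end < input.len() && matches!(input[end], b'0'..=b'9' | b'.' | b'-') {
        end += 1;
    }
    if end > pos { Some(end) } else { None }
}

/// Mimics `record = { field ~ ("," ~ field)* }`.
fn match_record(input: &[u8], pos: usize) -> Option<usize> {
    let mut end = match_field(input, pos)?;
    while end < input.len() && input[end] == b',' {
        match match_field(input, end + 1) {
            Some(next) => end = next,
            None => break, // backtrack: leave the comma unconsumed
        }
    }
    Some(end)
}

/// Mimics `"\r\n" | "\n"`.
fn match_newline(input: &[u8], pos: usize) -> Option<usize> {
    if input[pos..].starts_with(b"\r\n") {
        Some(pos + 2)
    } else if input[pos..].starts_with(b"\n") {
        Some(pos + 1)
    } else {
        None
    }
}

/// Original rule: `file = { SOI ~ (record ~ ("\r\n" | "\n"))* ~ EOI }`.
/// Returns the record count, or the offset at which EOI was expected.
fn match_file_strict(source: &str) -> Result<usize, usize> {
    let input = source.as_bytes();
    let (mut pos, mut records) = (0, 0);
    // Each repetition needs a record AND a newline; otherwise it is undone entirely.
    while let Some(end) = match_record(input, pos).and_then(|e| match_newline(input, e)) {
        pos = end;
        records += 1;
    }
    if pos == input.len() { Ok(records) } else { Err(pos) }
}

/// Assumed fix (hypothetical):
/// `file = { SOI ~ (record ~ (("\r\n" | "\n") ~ record)*
///           ~ ("\r\n" | "\n")?)? ~ EOI }`
fn match_file_relaxed(source: &str) -> Result<usize, usize> {
    let input = source.as_bytes();
    let (mut pos, mut records) = (0, 0);
    if let Some(end) = match_record(input, pos) {
        pos = end;
        records += 1;
        // Newline-separated follow-up records.
        while let Some(end) = match_newline(input, pos).and_then(|e| match_record(input, e)) {
            pos = end;
            records += 1;
        }
        // Optional trailing newline.
        if let Some(end) = match_newline(input, pos) {
            pos = end;
        }
    }
    if pos == input.len() { Ok(records) } else { Err(pos) }
}

fn main() {
    // Same data as the example, without a terminating newline.
    let source = "65279,1179403647,1463895090\n3.1415927,2.7182817,1.618034\n-40,-273.15\n13,42\n65537";
    println!("strict:  {:?}", match_file_strict(source)); // prints "strict:  Err(75)"
    println!("relaxed: {:?}", match_file_relaxed(source)); // prints "relaxed: Ok(5)"
}
```

The strict matcher stops at offset 75, the start of the line `65537`, which matches `Pos(75)` in the error above: the last repetition matches the record but not the newline, so it is rolled back and EOI is expected at the start of that line.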

It would also be nice if the example showed how to ignore whitespace, since that's such a common thing for so many file formats.
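
For reference, pest grammars usually ignore whitespace by defining the special `WHITESPACE` rule, for example `WHITESPACE = _{ " " | "\t" }`, which pest then skips implicitly between the elements of `~` sequences and repetitions in non-atomic rules. With that rule in place, `field` would typically be made atomic (`@{ ... }`) so that spaces cannot appear inside a number. The sketch below is a standard-library-only imitation of that behavior rather than pest code, and `skip_ws`, `read_field` and `parse_records` are hypothetical names.

```rust
/// Skips spaces and tabs, standing in for pest's implicit WHITESPACE handling.
fn skip_ws(input: &[u8], mut pos: usize) -> usize {
    while pos < input.len() && (input[pos] == b' ' || input[pos] == b'\t') {
        pos += 1;
    }
    pos
}

/// Reads one field (digits, '.', '-') with no whitespace allowed inside it,
/// which is what an atomic `field` rule would enforce.
fn read_field(input: &[u8], pos: usize) -> Option<(f64, usize)> {
    let mut end = pos;
    while end < input.len() && matches!(input[end], b'0'..=b'9' | b'.' | b'-') {
        end += 1;
    }
    let text = std::str::from_utf8(&input[pos..end]).ok()?;
    text.parse().ok().map(|value| (value, end))
}

/// Parses CSV records, skipping spaces and tabs around fields and commas.
/// Returns `None` on any syntax error.
fn parse_records(source: &str) -> Option<Vec<Vec<f64>>> {
    let input = source.as_bytes();
    let mut records = Vec::new();
    let mut pos = skip_ws(input, 0);
    while pos < input.len() {
        let mut fields = Vec::new();
        loop {
            let (value, end) = read_field(input, pos)?;
            fields.push(value);
            pos = skip_ws(input, end);
            if pos < input.len() && input[pos] == b',' {
                pos = skip_ws(input, pos + 1);
            } else {
                break;
            }
        }
        records.push(fields);
        // A record ends at a newline or at the end of the input.
        if input[pos..].starts_with(b"\r\n") {
            pos += 2;
        } else if input[pos..].starts_with(b"\n") {
            pos += 1;
        } else if pos != input.len() {
            return None;
        }
        pos = skip_ws(input, pos);
    }
    Some(records)
}

fn main() {
    // Spaces and tabs around fields are ignored; a space inside a record
    // without a comma is still an error.
    println!("{:?}", parse_records(" 1 , 2.5\t\n-3,4 "));
    println!("{:?}", parse_records("1 2"));
}
```

Newlines are deliberately left out of the skipped characters, because they still terminate records.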

tomtau commented 1 month ago

I think that example https://github.com/pest-parser/ast/blob/master/examples/csv.pest was taken from the book https://github.com/pest-parser/book/blob/master/src/examples/csv.md, where it exhibits the same behavior. So if it's fixed, it'd be good to fix it in both places.