m4rw3r / chomp

A fast monadic-style parser combinator designed to work on stable Rust.
Apache License 2.0

How do I examine success/fail? #42

Closed MarkSwanson closed 8 years ago

MarkSwanson commented 8 years ago
let parse_result = parse!{i;
..
};

// I now have to execute some Rust code to see what parser I should call next.
let input2: Input<'i, u8> = match parse_result.into_inner() {
    // stuck here.
    // Ok(o) => o,
    // Err(e) => return parse_result
};

I'm a bit lost walking through the types. I simply want to continue with an Input, or return the parse_result.
Any help would be appreciated.
Thanks!

m4rw3r commented 8 years ago

The best way is to use ParseResult::bind to chain another action:

let parse_result = parse_something(i);

parse_result.bind(|input, value|
    // This will be executed in the case we have successfully parsed so far
    match value {
        Type::Escape => parse_escape(input),
        Type::Quotes => parse_quoted(input),
        // All branches need to return some kind of ParseResult, here we error
        _ => input.err(ErrorValue),
    })

This provides the input state (without you having to worry about more than passing it on) and the success value on the success branch. The error branch can be reached by using ParseResult::map_err.
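
To make the shape of bind and map_err concrete, here is a minimal toy model. It does not use chomp at all: `Toy`, `kind`, and `dispatch` are hypothetical names invented for this sketch, and the real ParseResult is an opaque type whose internals may differ. The sketch only shows that the closure given to bind runs on the success branch with the remaining input and the value, while map_err touches only the error branch.

```rust
// Toy stand-in for a parse result: NOT chomp's real type, only the same shape.
// Success carries the remaining input plus a value; failure carries an error.
#[derive(Debug, PartialEq)]
enum Toy<'a, T, E> {
    Data(&'a [u8], T),
    Error(&'a [u8], E),
}

impl<'a, T, E> Toy<'a, T, E> {
    // Run `f` only on success, handing it the remaining input and the value.
    fn bind<U, F>(self, f: F) -> Toy<'a, U, E>
    where
        F: FnOnce(&'a [u8], T) -> Toy<'a, U, E>,
    {
        match self {
            Toy::Data(rest, value) => f(rest, value),
            Toy::Error(rest, e) => Toy::Error(rest, e),
        }
    }

    // Run `f` only on failure, leaving a success untouched.
    fn map_err<V, F>(self, f: F) -> Toy<'a, T, V>
    where
        F: FnOnce(E) -> V,
    {
        match self {
            Toy::Data(rest, value) => Toy::Data(rest, value),
            Toy::Error(rest, e) => Toy::Error(rest, f(e)),
        }
    }
}

// Hypothetical first parser: yields the leading byte and the rest of the input.
fn kind(input: &[u8]) -> Toy<'_, u8, &'static str> {
    match input.split_first() {
        Some((&b, rest)) => Toy::Data(rest, b),
        None => Toy::Error(input, "unexpected end of input"),
    }
}

// Choose the next step from the parsed value; every branch returns a `Toy`.
fn dispatch(input: &[u8]) -> Toy<'_, &'static str, String> {
    kind(input)
        .bind(|rest, b| match b {
            b'\\' => Toy::Data(rest, "escape"),
            b'"' => Toy::Data(rest, "quotes"),
            _ => Toy::Error(rest, "unknown start byte"),
        })
        .map_err(|e| format!("parse failed: {}", e))
}
```

Because both methods consume `self` and return a new result, the caller never has to unpack the state by hand, which is the situation the original question got stuck in.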

m4rw3r commented 8 years ago

http://m4rw3r.github.io/chomp/chomp/macro.parse!.html#inline is also an alternative if you want to embed the Rust code in a parse! macro. In that case you do not need to split it into blocks.

MarkSwanson commented 8 years ago

I use inline a fair bit, but when I tried to explicitly return an err I kept getting macro errors :-)

I walked backwards through the chomp source but had trouble constructing an appropriate error because the compiler couldn't use '_'. I thought I was explicitly defining the types everywhere, but I ran out of time and appealed to authority :-)

Will try bind() and map_err() shortly.

m4rw3r commented 8 years ago

If there are any specific macro errors, please copy them here and I can determine whether it is a bug or not. If it isn't, I'll try my best to explain it, since rustc is not very good at explaining macro errors, especially in incrementally parsing macros.

The parse! macro tries to avoid type declarations; instead, everything is built using ParseResult::bind and combinators::or (these provide the building blocks for inference) as well as Input::ret and Input::err. The reason for this is twofold: first, it is somewhat hard to propagate types inside macros; second, it would be impossible to use lifetimes in any return type (both value and error) if the parse! macro declared lifetimes, because of hygiene.

The _ type error is most likely caused by a value or error type which cannot be inferred. E.g., if you only return an error, you still need to declare the value type in most cases; the reverse goes for the value, even more so since ParseResult::bind automatically converts the error type to the required error using From (which is why declaring the return type of a parser is a good idea and sometimes required). It is a bit sad that the Rust team decided to remove type-parameter defaults from functions and methods; they would help a bit with the inference.
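
The same inference gap can be reproduced with the standard library's Result, which may make it easier to see in isolation. This is only an analogy, not chomp code: `TheError` and the three functions are hypothetical names for this sketch. An expression that builds only the error side says nothing about the value type (and vice versa), so either the surrounding signature or an explicit annotation has to supply the missing half.

```rust
#[derive(Debug, PartialEq)]
struct TheError;

// A declared return type fixes both type parameters, so nothing needs
// annotating even though only the error side is ever constructed.
fn always_fails() -> Result<u32, TheError> {
    Err(TheError)
}

fn failure_is_err() -> bool {
    // `let r = Err(TheError);` would be rejected here (E0282, type
    // annotations needed): nothing tells the compiler the value type.
    // Naming only the missing side is enough; `_` is inferred.
    let r = Err::<u32, _>(TheError);
    r.is_err()
}

fn success_value() -> u32 {
    // The reverse case: the value is given, the error type must be named.
    let r = Ok::<_, TheError>(7);
    r.unwrap_or(0)
}
```

Declaring the full return type, as in `always_fails`, is usually the least noisy option, which matches the advice above about declaring the return type of a parser.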

Inference works in some cases, but sometimes you need to help out a bit:

// The `parse!` macro has a notation to allow this using the @ symbol:
parse!{i; ret @ ReturnType, ErrorType: ReturnType::Value }
// Usually it is not needed to specify the type of what you are returning, but only what
// you are not returning:
parse!{i; err @ MyType, _: TheError }

// Similar things can be done outside of the `parse!` macro:
Input::ret::<ReturnType, ErrorType>(value)
Input::err::<MyType, _>(TheError)

MarkSwanson commented 8 years ago

This is indeed where I am stuck (inferring the error type). rustc isn't helping because it is suggesting I use structs that I can't actually use because they are private.

I did read the section on err @ ... but I don't grok the syntax, so maybe everything below is because of that. I don't know how MyType and TheError are used. err @ parsers::Error , _: (); // ?

Maybe simpler: I still don't know what to do given a simple parser fn:

pub fn parse_return<'i, 'a>(i: Input<'i, u8>) -> U8Result<'i, ()> {
    let parse_result = parse!{i;
            take_while(is_whitespace);
            string(b"return");
        let sub_fn = take_till(is_left_paren);
            take(1);
        input -> {
            // do something
            input.ret(())
        }
    };
}

All I want to do now is return an error if parse_result != success. How would I do that? I can match parse_result.into_inner() well enough, but I still can't create a U8Result as an error.

Alternatively, I attempted to use:

    parse_result.map_err(|e| {
        // <chomp macros>:5:60: 5:65 error: the trait `core::convert::From<chomp::parsers::Error<u8>>` is not implemented for the type `()` [E0277]
        return e
    }).bind(|input, value| {
        input.ret(())
    });

}

I see that the error type for U8Result is:

pub type U8Result<'a, T>        = ParseResult<'a, u8, T, parsers::Error<u8>>;

However, I don't see how to create one, or how to set an existing U8Result to have a parsers::Error. I can't match on parse_result.into_inner() because that takes ownership and I have further problems with creating the resulting ParseResult with private ...

I tried just embedding / nesting everything inside of an input -> {} section but I still have to short-circuit return errors inside the nested parse!{} and I can't seem to make that work.

                    let _r2 = parse!{i; ... };
                    let parse_response = match _r2.into_inner() {
                        State::Data(i, t) => (),
                        _ => return i.err(TODO) // how? needs to return chomp::parsers::Error<u8>
//  expected `chomp::parse_result::ParseResult<'i, u8, (), chomp::parsers::Error<u8>>`,
//    found `chomp::parse_result::ParseResult<'_, u8, (), ()>`
                    };

I may be repeating myself :-)

MarkSwanson commented 8 years ago

Ok, so after posting that I moved things around and have it working (I think). I think my main problem was parsing macro errors. I was having trouble walking backwards and figuring out the types. When I'm sure I have it working I'll try to circle back and provide a useful summary.

m4rw3r commented 8 years ago

The standard error type of Chomp parsers is chomp::Error<I>, and there is a way to use that directly:

use chomp::Error;
input.err(Error::new())  // Creates and wraps an "Unexpected" error

This is useful when building a parser as it neatly fits the required type. However it does not carry much information about what actually was expected (it is more or less just expected X, or did not expect what it got).

There is a reason for the types to be somewhat opaque (see http://m4rw3r.github.io/parser-combinators-road-chomp-0-1/#input-and-parseresult-are-opaque-linear-types ). It prevents a lot of accidental errors and makes for simpler code (most of the time). It can be a bit cumbersome to use at first though.

In general though I recommend something more like this:

enum MyError<'i> {
    // more verbose error types here
    InvalidCommand(&'i [u8]),
    InvalidUTF8(std::str::Utf8Error),
    // propagate chomp primitive parse errors:
    ParseError(chomp::Error<u8>),
}

impl<'i> From<chomp::Error<u8>> for MyError<'i> {
    fn from(e: chomp::Error<u8>) -> Self {
        MyError::ParseError(e)
    }
}

The above will enable you to use the more specific error instead of relying only on the fairly blunt errors which Chomp provides. You will use a type like ParseResult<u8, _, MyError> for your parsers, and Chomp's ParseResult::bind will do most of the conversions; a manual From::from call should rarely be required.

// usually lifetime annotations are not required, elision goes a long way
pub fn parse_return(i: Input<u8>) -> ParseResult<u8, (), MyError> {
    parse!{i;
            take_while(is_whitespace);
            string(b"return");
        let sub_fn = take_till(is_left_paren);
            token(b'(');
        // up until here the type is ParseResult<u8, ..., chomp::Error<u8>>
        // but since we bind with an expression which results in an error of `MyError`
        // bind will convert the `chomp::Error` to this type if a suitable From
        // implementation exists
        // same will happen anyway since the return type requires a `MyError`
        input -> match sub_fn {
            // these functions are `Fn(Input<u8>) -> ParseResult<u8, (), MyError>`
            b"test"  => parse_test(input),
            b"bench" => parse_benchmark(input),
            _        => input.err(MyError::InvalidCommand(sub_fn)),
        }
    }
}
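
The error-conversion idea can also be tried without chomp. The sketch below is an analogy using the standard library's Result and the `?` operator, which calls From::from on the error in much the same spirit as the conversion described above. `PrimitiveError`, `take_prefix`, and `parse_command` are hypothetical names for this sketch, and `PrimitiveError` merely stands in for a library-provided error type.

```rust
use std::str::Utf8Error;

// Stand-in for a library's low-level error; hypothetical, not chomp's type.
#[derive(Debug, PartialEq)]
struct PrimitiveError;

#[derive(Debug, PartialEq)]
enum MyError<'i> {
    // More specific error carrying the offending slice of input.
    InvalidCommand(&'i [u8]),
    InvalidUtf8(Utf8Error),
    // Propagated low-level parse error.
    ParseError(PrimitiveError),
}

// The lifetime parameter has to appear on the impl as well as on the type.
impl<'i> From<PrimitiveError> for MyError<'i> {
    fn from(e: PrimitiveError) -> Self {
        MyError::ParseError(e)
    }
}

impl<'i> From<Utf8Error> for MyError<'i> {
    fn from(e: Utf8Error) -> Self {
        MyError::InvalidUtf8(e)
    }
}

// Low-level step that only knows about the primitive error.
fn take_prefix<'i>(input: &'i [u8], prefix: &[u8]) -> Result<&'i [u8], PrimitiveError> {
    input.strip_prefix(prefix).ok_or(PrimitiveError)
}

// Higher-level step: its declared error type is `MyError`, so each `?`
// converts the lower-level error through the matching `From` impl.
fn parse_command<'i>(input: &'i [u8]) -> Result<&'i str, MyError<'i>> {
    let rest = take_prefix(input, b"return ")?;
    match rest {
        b"test" | b"bench" => Ok(std::str::from_utf8(rest)?),
        _ => Err(MyError::InvalidCommand(rest)),
    }
}
```

As in the parser above, only the outermost function names `MyError`; the low-level step keeps its own error type and the conversion happens at the boundary.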