Marwes / combine

A parser combinator library for Rust
https://docs.rs/combine/*/combine/
MIT License

take_until_bytes() and partial parsing #327

Closed simpsoneric closed 2 years ago

simpsoneric commented 2 years ago

I'm attempting to write a parser that searches for the start of a sync pattern in a byte stream. If the pattern doesn't exist, I'd like to remove the unmatched prefix from the input.

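use combine::parser::byte::take_until_bytes;
use combine::parser::range::take_until_range;
use combine::Parser;

pub fn parser5() {
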
    let sync = [1u8, 2, 3, 4, 5];
    let data_in = [9, 9, 1, 2, 3, 4];

    {
        let mut din          = combine::stream::PartialStream(&data_in[..]);
        let mut pos          = 0;
        let mut prefix_range = take_until_range(&sync[..]);

        let r  = prefix_range.parse_with_state(&mut din, &mut pos);
        dbg!(r);
        assert_eq!(pos, 2);
    }
    {
        let mut din          = combine::stream::PartialStream(&data_in[..]);
        let mut pos          = 0;
        let mut prefix_bytes = take_until_bytes(&sync[..]);

        let r  = prefix_bytes.parse_with_state(&mut din, &mut pos);
        dbg!(r);
        assert_eq!(pos, 2);
    }
}

I initially tried take_until_bytes(), but that parser expects PartialState = (), as the following compile error shows:

215 |         let r  = prefix_bytes.parse_with_state(&mut din, &mut pos);
    |                                                          ^^^^^^^^ expected `()`, found integer
    |
    = note: expected mutable reference `&mut ()`
               found mutable reference `&mut {integer}`

For more information about this error, try `rustc --explain E0308`.
error: could not compile `parser-tests` due to previous error

Internally, take_until_bytes() uses take_fn(), which should return TakeRange::NotFound(amount to remove), but I can't get this to type check.
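To make the "amount to remove" idea concrete, here is a minimal std-only sketch of the search described above. It is not combine's implementation: `SyncScan` and `scan_for_sync` are hypothetical names introduced for illustration, and the rule for how many bytes to drop (keep the longest tail of the input that is still a prefix of the pattern) is an assumption about the desired behaviour.

```rust
/// Hypothetical result type for this sketch (not combine's own type).
#[derive(Debug, PartialEq)]
enum SyncScan {
    /// The full pattern starts at this offset.
    Found(usize),
    /// No full match yet; this many leading bytes cannot be part of one.
    NotFound(usize),
}

/// Scans `data` for `sync` and reports how much of the prefix is unmatched.
fn scan_for_sync(data: &[u8], sync: &[u8]) -> SyncScan {
    assert!(!sync.is_empty(), "sync pattern must not be empty");

    // Look for a complete match first.
    if let Some(i) = data.windows(sync.len()).position(|w| w == sync) {
        return SyncScan::Found(i);
    }

    // Otherwise keep the longest tail of `data` that is still a prefix of
    // `sync`, because more input could complete it later.
    let max_tail = data.len().min(sync.len() - 1);
    for tail in (1..=max_tail).rev() {
        if data[data.len() - tail..] == sync[..tail] {
            return SyncScan::NotFound(data.len() - tail);
        }
    }

    // Nothing at the end of `data` can start the pattern.
    SyncScan::NotFound(data.len())
}
```

With the input from the example above, `scan_for_sync(&[9, 9, 1, 2, 3, 4], &[1, 2, 3, 4, 5])` gives `SyncScan::NotFound(2)`, which matches the `assert_eq!(pos, 2)` expectation: the two `9` bytes can go, while `1, 2, 3, 4` must stay until more data arrives.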

Questions:

Marwes commented 2 years ago

Fixed the state issue in https://github.com/Marwes/combine/pull/328, but you generally shouldn't call parse_with_state explicitly. Just incorporate the take_until_bytes/range parser before your other parser and call parse as normal (let the state be an implementation detail). To drop the output of take_until_bytes, you can wrap that parser with ignore or .map(|_| ()).

simpsoneric commented 2 years ago

Fantastic, thanks for the quick fix! My test case passes with the above code on your new issue_327 branch.

Thanks for the insight on letting the state be an implementation detail that is handled internally. I am struggling to figure out what data structure I can use as a test source for that approach, though. For example:

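use bytes::{BufMut, BytesMut};
use combine::parser::range::take_until_range;
use combine::stream::PartialStream;
use combine::Parser;

pub fn parser5() {
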
    let sync = [1u8, 2, 3, 4, 5];
    let prefix_data = [9u8, 9];
    let mut buf = BytesMut::with_capacity(64);

    // Add random prefix data
    buf.extend_from_slice(&prefix_data);

    // Add partial sync sequence
    buf.extend_from_slice(&sync[..sync.len()-1]);

    let mut prefix_parser = take_until_range(&sync[..]);
    {
        // Should be able to drop first items from input stream
        let r  = prefix_parser.parse(PartialStream(&buf[..])); // <-- `buf` immutable borrow first occurs here
        dbg!(r);
    }
    {
        // Add last item from the sync sequence
        buf.put_u8(sync[sync.len()-1]); // <-- Error: cannot borrow `buf` as mutable due to the immutable borrow above

        let r  = prefix_parser.parse(PartialStream(&buf[..]));
        dbg!(r);
    }
}

If you don't mind giving some advice on how this should be approached, I'd appreciate it. Otherwise, I'll start digging around the code/docs some more.
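One general way around this kind of borrow conflict, independent of combine, is to make each parse step return only owned values (such as a byte count) so the immutable borrow of the buffer ends before the buffer is mutated again. The sketch below illustrates that pattern using only the standard library. It assumes a plain `Vec<u8>` in place of `BytesMut`, and `SyncFinder`, `feed`, and `locate` are hypothetical names, not part of combine or bytes.

```rust
/// Hypothetical owner of the pattern and the accumulated input.
struct SyncFinder {
    sync: Vec<u8>,
    buf: Vec<u8>,
}

impl SyncFinder {
    fn new(sync: &[u8]) -> Self {
        assert!(!sync.is_empty(), "sync pattern must not be empty");
        SyncFinder { sync: sync.to_vec(), buf: Vec::new() }
    }

    /// Appends `chunk`, drops bytes that cannot start the pattern, and
    /// returns true once the buffer begins with the full pattern.
    fn feed(&mut self, chunk: &[u8]) -> bool {
        self.buf.extend_from_slice(chunk);
        // `locate` returns owned values only, so the immutable borrow of
        // `self.buf` ends here and the buffer can be mutated afterwards.
        let (found, start) = locate(&self.buf, &self.sync);
        self.buf.drain(..start);
        found
    }
}

/// Returns (full match found, number of leading bytes that can be dropped).
fn locate(data: &[u8], sync: &[u8]) -> (bool, usize) {
    if let Some(i) = data.windows(sync.len()).position(|w| w == sync) {
        return (true, i);
    }
    // Keep the longest tail of `data` that is still a prefix of `sync`.
    let max_tail = data.len().min(sync.len() - 1);
    let tail = (1..=max_tail)
        .rev()
        .find(|&t| data[data.len() - t..] == sync[..t])
        .unwrap_or(0);
    (false, data.len() - tail)
}
```

Feeding `[9, 9, 1, 2, 3, 4]` and then `[5]` mirrors the two steps of the test above: the first call drops the two `9` bytes and returns false, and the second call returns true with the buffer holding the complete sync sequence. Whether the same shape fits combine's partial-parsing API depends on how the parser's input lifetime is tied to the buffer, so treat this as an illustration of the borrow scoping only.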