`\r` chars tripping up the parser

cetra3 commented 4 years ago

Just raising this in relation to #11 . The \r bytes appear to trip up the parser in some scenarios still. Raising this to work through it.

cetra3 commented 4 years ago

This has been mostly dealt with the following code:

https://github.com/cetra3/mpart-async/blob/a0c281321365ce2c714f54d23b091d0a45e1c4b8/src/server.rs#L519-L527

It's a cheap optimisation that checks to see if there is any \r\n patterns and if not, find the last \r and fast forward to it. This brings the parsing to around 2gb/s in the worse case.

@koivunej let me know if this assists your benchmarks.

koivunej commented 4 years ago

Yeah I almost hacked a stateful matcher for this but the result was so unreadable I think a some refactoring needs to happen before starting on that path again.

Stateful matcher idea

I think we should use twoway::find_bytes to find "\r\n--$boundary" and only that. If that finds nothing, we should inspect the last "\r\n--$boundary".len() bytes of a buffer from inner stream to find a "partial match" of any prefix of the needle, in code something like: ``` fn find_prefix_of_needle(needle: &[u8], buffer: &[u8]) -> Option { let l = needle.len(); let buffer = &buffer[buffer.len() - l..]; // this likely has off by one let mut matched = 0; for b in buffer { match b { b if b == needle[matched] => matched += 1, // continue matching b if b == needle[0] => matched = 1, // start a new match _ => matched = 0, // start over } } NonZeroUsize::new(matched) } ``` However continuing from previous partial matches and handling the cases when when there is not yet enough bytes available for the suffix (b"\r\n" or b"--" or error) complicated the code a lot, at which point I stopped. My string algortihm-fu is too weak to judge if this should be done with aho-corasick instead of twoway + custom DFA as shown above.

I think the first refactoring could be that any code using the self_mut: &mut Self in the MultistreamParser should be refactored away into a method of (&mut self, cx: &mut std::task::Context<'_>) just to simplify the field access (I was tripping on self vs. self_mut a lot). Perhaps the new needed (sub)states for streaming content are then more apparent, and can be used to reduce the lines of code and repetition needed for this more complicated matching.

Regarding the suggested b"\r" => b"\r\n" change.. I think it's a step in the right direction, but a full solution would need to handle all of the partial matches as elaborated in the "stateful matcher idea".

For a real worst-case benchmark I think any permutation of "\r\n--$boundary\r\n" where a single byte is wrong (for example replaced with b"0") would do. In addition to that, the mismatches should probably happen on the "other side" of the boundary for example, given the boundary of "ABCD":

chunk 1: any preceding bytes...\r\n--ABC chunk 2: C0 any following bytes.

cetra3 commented 4 years ago

You're right, looking for the entire boundary is a pretty decent idea and may end up being fastest generally.

Assuming this is generally what you're talking about:

Find the full needle "\r\n--$boundary"
If the index + needle_len + 2 is out of bounds then pump more bytes
If not check if it's a \r\n or -- & do what needs to be done
If nothing found, reverse find for \r (start of partial match):
- Then directly compare the slices (where len = buffer_len - r_ind) (i.e, &buffer[r_ind..] == &needle[..len]) & pump it
- If a non-match then return bytes & start again

cetra3 commented 4 years ago

Alternative parser is here:

https://github.com/cetra3/mpart-async/blob/5b5b23e1d4098b053b422f3e16a8393c1bb5c25d/src/server.rs#L482-L551

This actually performs worse than the existing code for some reason. I think that looking for the full needle is too heavy.

cetra3 commented 4 years ago

After doing some tests, the new method is actually more consistent across the board.

In the zero byte case it is slower, but I have added in a random byte case and \r case in which it is faster in both cases. I think the random case is probably more true to life than the other tests.

I will push the alternative to master & release as 0.5.0 unless there are any further things you have in mind.

koivunej commented 4 years ago

Sorry for late response, thank you for pushing the update. It's still a good step in the right direction.

cetra3 / mpart-async

`\r` chars tripping up the parser #12