Closed cetra3 closed 4 years ago
This has been mostly dealt with by the following code:
It's a cheap optimisation that checks whether there are any \r\n patterns and, if not, finds the last \r and fast-forwards to it. This brings the parsing to around 2 GB/s in the worst case.
@koivunej let me know if this assists your benchmarks.
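For anyone following along, the fast-forward check described above might look roughly like the sketch below. This is not the code from the crate: the function name `fast_forward_len` and its return type are made up for illustration, and treating a buffer with no \r at all as fully skippable is my own assumption.

```rust
/// Hypothetical helper (not the crate's API): decides how many leading bytes
/// of `buf` can be passed through as body data without running the full
/// boundary match.
///
/// Returns `None` when a "\r\n" pair is present, meaning the caller should
/// fall back to the normal (slower) boundary matching.
fn fast_forward_len(buf: &[u8]) -> Option<usize> {
    // Any "\r\n" could be the start of a boundary, so do not skip anything.
    if buf.windows(2).any(|w| w == b"\r\n") {
        return None;
    }
    // No "\r\n": fast-forward to the last '\r', which could still be the
    // first byte of a boundary that continues in the next chunk.
    match buf.iter().rposition(|&b| b == b'\r') {
        Some(pos) => Some(pos),
        // Assumption: with no '\r' at all, the whole buffer is body data.
        None => Some(buf.len()),
    }
}
```

The caller would emit the first `n` bytes when it gets `Some(n)` and keep the remainder buffered for the next read.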
Yeah, I almost hacked together a stateful matcher for this, but the result was so unreadable that I think some refactoring needs to happen before starting on that path again.
I think the first refactoring could be that any code using the self_mut: &mut Self in the MultistreamParser should be refactored away into a method of (&mut self, cx: &mut std::task::Context<'_>) just to simplify the field access (I was tripping on self vs. self_mut a lot). Perhaps the new needed (sub)states for streaming content are then more apparent, and can be used to reduce the lines of code and repetition needed for this more complicated matching.
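To make the suggested refactoring concrete, here is a minimal sketch of the shape I mean. `ToyParser`, its fields, and `poll_inner` are hypothetical stand-ins, and it implements `std::future::Future` rather than a stream only so that the example needs nothing outside the standard library.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

/// Hypothetical stand-in for the parser; the real struct has different fields.
struct ToyParser {
    remaining: u32,
    polls: u32,
}

impl ToyParser {
    /// All state handling lives here, with plain `self.field` access and no
    /// separate `self_mut` binding to keep track of.
    fn poll_inner(&mut self, cx: &mut Context<'_>) -> Poll<u32> {
        self.polls += 1;
        if self.remaining == 0 {
            return Poll::Ready(self.polls);
        }
        self.remaining -= 1;
        // Ask to be polled again; a real parser would poll its inner stream.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

impl Future for ToyParser {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `ToyParser` only holds `Unpin` fields, so `get_mut` is available
        // and the trait method becomes a one-line delegation.
        self.get_mut().poll_inner(cx)
    }
}
```

This only works as written when the struct is `Unpin`; a parser holding a pinned inner stream would need pin projection instead.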
Regarding the suggested b"\r" => b"\r\n" change: I think it's a step in the right direction, but a full solution would need to handle all of the partial matches, as elaborated in the "stateful matcher idea".
For a real worst-case benchmark I think any permutation of "\r\n--$boundary\r\n" where a single byte is wrong (for example replaced with b"0") would do. In addition to that, the mismatches should probably happen on the "other side" of the boundary. For example, given the boundary "ABCD":
chunk 1: any preceding bytes...\r\n--ABC
chunk 2: C0 any following bytes
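A rough sketch of how such a benchmark input could be generated is below. The function name `near_miss_chunks` and the filler text are made up for illustration, and the sketch assumes the boundary does not already contain b'0' at the chosen position (otherwise the "wrong" byte would not be wrong).

```rust
/// Hypothetical benchmark helper: builds "\r\n--$boundary\r\n" with the byte
/// at `wrong_at` replaced by b'0', then splits the data into two chunks so
/// that the wrong byte is the first byte of the second chunk.
fn near_miss_chunks(boundary: &str, wrong_at: usize) -> (Vec<u8>, Vec<u8>) {
    let mut needle = format!("\r\n--{}\r\n", boundary).into_bytes();
    assert!(wrong_at < needle.len(), "wrong_at must index into the needle");

    // Corrupt exactly one byte of the would-be boundary.
    needle[wrong_at] = b'0';

    // Everything from the corrupted byte onwards goes into the second chunk.
    let tail = needle.split_off(wrong_at);

    let mut first = b"any preceding bytes".to_vec();
    first.extend_from_slice(&needle);

    let mut second = tail;
    second.extend_from_slice(b" any following bytes");

    (first, second)
}
```

Looping `wrong_at` over every index of the needle gives one near-miss pair per position, so the parser has to carry a partial match across the chunk boundary before it can reject it.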
You're right, looking for the entire boundary is a pretty decent idea and may end up being fastest generally.
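Assuming this is generally what you're talking about:
Search the buffer for the full needle "\r\n--$boundary".
If it is found and index + needle_len + 2 is out of bounds, then pump more bytes.
Otherwise check whether the next two bytes are \r\n or -- & do what needs to be done.
If it is not found, look for the last \r (start of partial match):
check whether the rest of the buffer (buffer_len - r_ind) matches the start of the needle (i.e., &buffer[r_ind..] == &needle[..len]) & pump it.

Alternative parser is here: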
This actually performs worse than the existing code for some reason. I think that looking for the full needle is too heavy.
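For reference, the steps listed above (search for the full needle, then fall back to a partial match at the last \r) can be sketched roughly as follows. This is not the linked alternative parser: the `Scan` enum and `scan` function are hypothetical, the search uses a naive `windows` scan from the standard library rather than a dedicated substring searcher, and it assumes the boundary itself contains no \r.

```rust
/// Hypothetical result of scanning one buffer.
#[derive(Debug, PartialEq)]
enum Scan {
    /// Needle at `index`, followed by "\r\n" (another part follows).
    Boundary { index: usize },
    /// Needle at `index`, followed by "--" (end of the multipart body).
    FinalBoundary { index: usize },
    /// The first `len` bytes are body data; keep the rest and pump more bytes.
    Emit { len: usize },
}

fn scan(buffer: &[u8], needle: &[u8]) -> Scan {
    assert!(!needle.is_empty(), "needle is \"\\r\\n--$boundary\"");

    // Step 1: look for the full needle.
    if let Some(index) = buffer.windows(needle.len()).position(|w| w == needle) {
        let after = index + needle.len();
        // Step 2: the two bytes after the needle are not here yet.
        if after + 2 > buffer.len() {
            return Scan::Emit { len: index };
        }
        // Step 3: decide what kind of boundary this is.
        let next = &buffer[after..after + 2];
        return if next == b"\r\n" {
            Scan::Boundary { index }
        } else if next == b"--" {
            Scan::FinalBoundary { index }
        } else {
            // Not a boundary after all: skip past the '\r' that started it.
            Scan::Emit { len: index + 1 }
        };
    }

    // Step 4: no full match; a partial match can only start at the last '\r'.
    if let Some(r_ind) = buffer.iter().rposition(|&b| b == b'\r') {
        let len = buffer.len() - r_ind;
        if len <= needle.len() && buffer[r_ind..] == needle[..len] {
            return Scan::Emit { len: r_ind };
        }
    }
    Scan::Emit { len: buffer.len() }
}
```

With the needle b"\r\n--ABCD", a buffer ending in "\r\n--AB" keeps those six bytes for the next read, while one ending in "\r\n--A0" is emitted in full.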
After doing some tests, the new method is actually more consistent across the board.
In the zero byte case it is slower, but I have added a random byte case and a \r case, and it is faster in both. I think the random case is probably more true to life than the other tests.
I will push the alternative to master & release as 0.5.0 unless there are any further things you have in mind.
Sorry for the late response, and thank you for pushing the update. It's still a good step in the right direction.
Just raising this in relation to #11. The \r bytes still appear to trip up the parser in some scenarios. Raising this to work through it.