staabm closed this 3 years ago
I quickly implemented a second string stream which doesn't use the mb_* functions, but just the default byte-based string functions. This indeed gives us quite a nice speed improvement (even much better than the 2.8% you measured):
This also gives the user more control over which implementation is used. I'm not sure we should risk weird behavior during parsing by automatically switching functions based on the byte length of the input. If we do, we should have proper tests covering it.
I'll be running some experiments on this, to see what the best solution would be.
thanks a lot!
Btw. Just found https://github.com/parsica-php/parsica/pull/28 which implements a similar approach
closing in favor of #28
I would like to have the user explicitly choose which type of stream they provide, to make sure we don't get unexpected behavior within StringStream.
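To make the "explicit choice" idea concrete, here is a rough sketch of what two separate stream types could look like. The names `CharStream`, `ByteStream` and `MultibyteStream` are made up for illustration and are not Parsica's actual API; it also assumes PHP 8 with the mbstring extension and UTF-8 input.

```php
<?php
declare(strict_types=1);

// Hypothetical interface for illustration only, not Parsica's real Stream API.
interface CharStream
{
    /** @return array{0: string, 1: CharStream} the first $n units and the remaining stream */
    public function take(int $n): array;
}

// Byte-based stream: fast, but only correct for single-byte input such as ASCII.
final class ByteStream implements CharStream
{
    public function __construct(private string $input) {}

    public function take(int $n): array
    {
        return [substr($this->input, 0, $n), new self(substr($this->input, $n))];
    }
}

// Multibyte-aware stream: slower, but counts UTF-8 characters instead of bytes.
final class MultibyteStream implements CharStream
{
    public function __construct(private string $input) {}

    public function take(int $n): array
    {
        return [
            mb_substr($this->input, 0, $n, 'UTF-8'),
            new self(mb_substr($this->input, $n, null, 'UTF-8')),
        ];
    }
}
```

With something like this the caller picks `new ByteStream($input)` or `new MultibyteStream($input)` up front, so the stream never switches behavior depending on its content.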
this PR does 2 things
use mb_* functions only when the string being streamed contains multibyte chars. That way we can use the much faster non-mb string functions on non-multibyte strings
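For reference, a minimal sketch of how such a check could work. This is not the code from this PR: the helper names `containsMultibyte` and `firstChars` are hypothetical, and it assumes UTF-8 input and the mbstring extension.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: true when $input contains at least one multibyte UTF-8 character.
function containsMultibyte(string $input): bool
{
    // strlen() counts bytes, mb_strlen() counts characters;
    // the two only differ when some character takes more than one byte.
    return strlen($input) !== mb_strlen($input, 'UTF-8');
}

// Hypothetical helper: take the first $n characters, using the cheaper
// byte-based substr() whenever the input is single-byte only.
function firstChars(string $input, int $n): string
{
    return containsMultibyte($input)
        ? mb_substr($input, 0, $n, 'UTF-8')
        : substr($input, 0, $n);
}
```

The check itself costs one pass over the string, so it would pay off only if it is done once per input rather than on every stream operation.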