parsica-php / parsica

Parsica - PHP Parser Combinators - The easiest way to build robust parsers.
https://parsica-php.github.io/
MIT License

prevent use of mb_* functions on non-multibyte strings - 2.8% perf gain #30

Closed. staabm closed this 3 years ago.

staabm commented 3 years ago

this PR does 2 things

[image: "grafik"]

turanct commented 3 years ago

I quickly implemented a second string stream which doesn't use the mb_* functions at all, just the default byte-based string functions. This indeed gives us quite a nice speed improvement (even better than the 2.8% you measured):

[image: screenshot, 2021-03-13 22:03:11]

This also gives the user more control over which implementation is used. I'm not sure we should risk weird behavior during parsing by automatically switching functions based on the byte length of the input. If we do, we should have proper tests covering it.
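To make the concern concrete, here is a minimal sketch of what a byte-length-based check could look like and where byte functions and mb_* functions diverge. The helper name `isSingleByte` is hypothetical and not taken from this PR or from Parsica, and comparing `strlen` with `mb_strlen` is only an assumption about how such a check might be written.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not part of Parsica): true when every character
// in the input occupies exactly one byte, assuming UTF-8 input.
function isSingleByte(string $input): bool
{
    return strlen($input) === mb_strlen($input, 'UTF-8');
}

var_dump(isSingleByte('abc'));    // bool(true)
var_dump(isSingleByte('héllo'));  // bool(false): "é" is two bytes in UTF-8

// On multibyte input the two function families disagree:
// substr() returns the first byte, mb_substr() the first character.
var_dump(substr('é', 0, 1) === mb_substr('é', 0, 1, 'UTF-8')); // bool(false)
```

A check like this only looks at the input handed to the stream, so tests for automatic switching would need to cover any other place where strings are measured or sliced as well.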

I'll be running some experiments on this, to see what the best solution would be.

thanks a lot!

staabm commented 3 years ago

Btw. Just found https://github.com/parsica-php/parsica/pull/28 which implements a similar approach

turanct commented 3 years ago

closing in favor of #28

I would like the user to explicitly choose which type of stream they provide, so that we don't get unexpected behavior within StringStream.
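A rough sketch of what "explicitly choosing the stream type" could look like: two implementations behind one interface, with the caller picking one. All names here (`CharSource`, `ByteSource`, `MultibyteSource`, `first`, `rest`) are made up for illustration and do not mirror Parsica's actual Stream API or the code in #28.

```php
<?php
declare(strict_types=1);

// Hypothetical interface, for illustration only.
interface CharSource
{
    /** First character of the remaining input, or '' when empty. */
    public function first(): string;

    /** A new source without the first character. */
    public function rest(): CharSource;
}

// Byte-based: fast, but only correct for single-byte input.
final class ByteSource implements CharSource
{
    private string $s;
    public function __construct(string $s) { $this->s = $s; }
    public function first(): string { return (string) substr($this->s, 0, 1); }
    public function rest(): CharSource { return new self((string) substr($this->s, 1)); }
}

// Multibyte-aware: slower, but safe for UTF-8 input.
final class MultibyteSource implements CharSource
{
    private string $s;
    public function __construct(string $s) { $this->s = $s; }
    public function first(): string { return mb_substr($this->s, 0, 1, 'UTF-8'); }
    public function rest(): CharSource { return new self(mb_substr($this->s, 1, null, 'UTF-8')); }
}

$bytes = new ByteSource('é!');
$chars = new MultibyteSource('é!');

var_dump(strlen($bytes->first()));  // int(1): only half of "é"
var_dump($chars->first());          // string(2) "é"
var_dump($chars->rest()->first());  // string(1) "!"
```

With this shape nothing switches silently: picking the byte-based source for multibyte input is a visible decision in the calling code.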