Simple text sentence splitting and counting. Supports at least English, German and Dutch, possibly more. If you find it works well enough for your language, please let me know!
MIT License
PHP Warning: mb_ereg_replace(): mbregex search failure in php_mbereg_replace_exec(): retry-limit-in-match over in /vendor/vanderlee/php-sentence/src/Multibyte.php on line 59 #27
I get the above warning for some of my content. The problem happens in the trim() method:
https://github.com/vanderlee/php-sentence/blob/ed7ce41ef815bd21e61f62b692418740f988451f/src/Multibyte.php#L57-L60
When the warning is thrown, mb_ereg_replace returns false instead of a string, which breaks the rest of the script.
I created a test case to demonstrate the problem: https://github.com/splitbrain-forks/php-sentence/commit/914dd45bf3c77e9c1069e754cf53df911ca66a16
I have no idea why it fails. I'm also not 100% sure why a UTF-8 aware method is needed here. Is it to remove non-ASCII whitespace such as non-breaking or half-width spaces? Would using preg_replace with the /u flag be an alternative?
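To make the question concrete, here is a minimal sketch of what a preg_replace based trim might look like. It is not the library's code: trimUnicode is a hypothetical name, and the character class is my assumption about what the original is meant to strip (ASCII whitespace plus Unicode separators such as the non-breaking space). It needs PHP 7 or later for the type declarations and the ?? operator.

```php
<?php
// Hypothetical helper for illustration only; not part of php-sentence.
function trimUnicode(string $string): string
{
    // \s covers ASCII whitespace; \p{Z} adds Unicode separators such as
    // U+00A0 (no-break space) and U+3000 (ideographic space).
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    $result = preg_replace('/^[\s\p{Z}]+|[\s\p{Z}]+$/u', '', $string);

    // preg_replace() returns null on failure (for example on invalid UTF-8),
    // so fall back to the untouched input instead of passing on a non-string.
    return $result ?? $string;
}
```

Whatever the pattern ends up being, the fallback is the part that matters for this issue: the caller always gets a string back, so a regex engine failure would no longer break the rest of the script.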