Simple text sentence splitting and counting. Supports at least English, German and Dutch, possibly more. If you find it works well enough for your language, please let me know!
MIT License
78 stars, 23 forks
Acronyms at the end of sentence are incorrectly parsed #13
The library has been really useful to us for breaking text into sentences. I've noticed one issue so far: if a sentence ends with an acronym at the very end of the text, everything is fine, but if another sentence follows it, the result is incorrect. It gets even worse if the acronym is capitalized.
Here it works fine:
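// Setup assumed for this snippet: the issue does not show how $sentenceBreaker is created.
$sentenceBreaker = new \Sentence();
$sentences = $sentenceBreaker->split('Let\'s meet at 10:00 a.m..', \Sentence::SPLIT_TRIM);
var_dump($sentences);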
But it fails in this one:
Here it fails with a capitalized acronym: