Closed gacou54 closed 2 years ago
The following logic, in the feature/coleman-liau-index branch, prevents acronyms from being
counted as sentence ends when computing the Coleman-Liau Index. It works for acronyms with multiple dots.
else if (c == '.')
    if (i == length - 1) // This is the end of the text
        nbrOfSentences++;
    // This logic excludes acronyms with two dots (e.g. "The U.S. Office is here.").
    // When a dot is followed by a space (". "), it looks for another dot
    // in the two characters before it.
    else if (text.charAt(i + 1) == ' ')
        if (i != 1 && i != 2)
            if (!text.substring(i - 2, i).contains("."))
                nbrOfSentences++;
Acronyms with a single dot are not handled right now (e.g. "etc."
will be counted as a sentence end).
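For reference, here is a minimal, self-contained sketch of how the branch logic above could be wrapped in a method and tried out. The class name `SentenceCounterSketch`, the method `countSentences`, and the surrounding loop are hypothetical and not taken from the branch. The guard is written as `i >= 3`, which is an assumption: it matches the original `i != 1 && i != 2` check for those indices and additionally skips `i == 0`, where `substring(i - 2, i)` would throw.

```java
// Hypothetical wrapper; only the '.' branch mirrors the snippet from the issue.
final class SentenceCounterSketch {

    // Counts sentence ends, skipping dots that belong to a two-dot acronym.
    static int countSentences(String text) {
        int nbrOfSentences = 0;
        int length = text.length();
        for (int i = 0; i < length; i++) {
            if (text.charAt(i) != '.') {
                continue;
            }
            if (i == length - 1) {
                // A dot at the very end of the text closes a sentence.
                nbrOfSentences++;
            } else if (text.charAt(i + 1) == ' ') {
                // Assumption: i >= 3 replaces (i != 1 && i != 2) and also
                // protects substring(i - 2, i) when i == 0.
                if (i >= 3 && !text.substring(i - 2, i).contains(".")) {
                    nbrOfSentences++;
                }
            }
        }
        return nbrOfSentences;
    }

    public static void main(String[] args) {
        // Two-dot acronym: the dot after "S" is not counted.
        System.out.println(countSentences("The U.S. Office is here."));
        // Single-dot abbreviation: "etc." is still counted as a sentence end.
        System.out.println(countSentences("I like tea, coffee, etc. and juice."));
    }
}
```

With this sketch, the "U.S." sentence yields 1, while the "etc." sentence yields 2, which illustrates the single-dot limitation described above.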
Closing this issue, I consider the solution OK.
Some algorithms count the number of sentences. When acronyms are encountered, they can be wrongly counted as sentence ends. This is because the sentence counter uses a dot followed by a space (". ") as an indicator of a sentence end. For example, the following sentence counts as two.
The U.S. Office is here.
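
The miscount can be reproduced with a small sketch of the naive rule. The class `NaiveSentenceCounter` and its method are hypothetical, and the sketch assumes that a dot at the very end of the text also counts as a sentence end, since that is what makes the example sentence total two.

```java
// Hypothetical illustration of the naive rule, not the project's actual counter.
final class NaiveSentenceCounter {

    // Counts every dot that is followed by a space or that ends the text.
    static int countSentences(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) != '.') {
                continue;
            }
            boolean atEnd = (i == text.length() - 1);
            if (atEnd || text.charAt(i + 1) == ' ') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The dot after "S" is followed by a space, so it is counted too.
        System.out.println(countSentences("The U.S. Office is here."));
    }
}
```

Here the dot after "U" is ignored because it is followed by "S", but the dot after "S" is followed by a space and is counted in addition to the final dot.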