Apostrophes (ʼ) are not parsed correctly. They sometimes appear in pairs to mark quotations, and the second apostrophe usually gets assigned to the following sentence. If there is no following sentence (-> end of chapter), it is assigned its own sentence with length=1. You can find those locations by searching for the regex: "1".*\n\s{3}</
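For anyone who wants to script that search, here is a minimal sketch in Python. It assumes the treebank XML puts each word element on its own line and indents the closing sentence tag by three spaces, which is what the regex above implies. The sample data and the function name find_single_token_lines are hypothetical, and the real files may be laid out differently.

```python
import re

# Regex quoted in the report: a line containing "1" (assumed to be a word
# with id="1") directly followed by a closing tag indented by three spaces.
PATTERN = re.compile(r'"1".*\n\s{3}</')

# Hypothetical sample data; the real treebank layout may differ.
SAMPLE = '''<treebank>
   <sentence id="7">
      <word id="1" form="λέγει"/>
      <word id="2" form="ʼ"/>
   </sentence>
   <sentence id="8">
      <word id="1" form="ʼ"/>
   </sentence>
</treebank>
'''


def find_single_token_lines(text):
    """Return the 1-based line numbers on which the pattern starts."""
    return [text.count("\n", 0, m.start()) + 1 for m in PATTERN.finditer(text)]


if __name__ == "__main__":
    for line_no in find_single_token_lines(SAMPLE):
        print(f"possible length=1 sentence at line {line_no}")
```

This only flags candidate locations. Each hit still has to be checked by hand, since any sentence whose only word has id "1" matches, whether or not that word is an apostrophe.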
I am not familiar with the sentence id schema here - how can we fix a bug that affects sentence segmentation?
This is a bug in the old Perseus segmentation code and something that should be noted as a requirement for the Annotation Service and any tokenization services we use in Perseids.