Closed PhotoNomad0 closed 9 months ago
Notes:
Do an exact compare first and, if there is no exact match, fall back to an inexact compare that treats RIGHT SINGLE QUOTATION MARK as equivalent to MODIFIER LETTER APOSTROPHE (about 10% of mismatches).
Automatically replacing other changes probably carries too much risk unless there is a way to force manual review.
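The exact-then-inexact comparison described in these notes could be sketched roughly as below. This is only an illustration, not the code that shipped in translationCore: the names `foldApostrophes` and `compareWords` are hypothetical, and the sketch assumes the only tolerated difference is U+2019 versus U+02BC.

```javascript
// Hypothetical helpers for illustration; not part of translationCore
// or string-punctuation-tokenizer.
const RIGHT_SINGLE_QUOTATION_MARK = '\u2019';
const MODIFIER_LETTER_APOSTROPHE = '\u02BC';

// Fold both apostrophe forms into one so they compare as equal.
function foldApostrophes(word) {
  return word
    .split(MODIFIER_LETTER_APOSTROPHE)
    .join(RIGHT_SINGLE_QUOTATION_MARK);
}

// Returns 'exact' when the words are identical, 'inexact' when they
// differ only by apostrophe form, and 'none' otherwise.
function compareWords(alignedWord, sourceWord) {
  if (alignedWord === sourceWord) {
    return 'exact';
  }
  if (foldApostrophes(alignedWord) === foldApostrophes(sourceWord)) {
    return 'inexact';
  }
  return 'none';
}
```

Any other difference, such as a real word change, still returns 'none' and so would remain an invalidation for manual review.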
scripture-resource-rcl calls tokenize from string-punctuation-tokenizer for normalization:
    return tokenize({
      text: text,
      greedy: true,
      normalize: true,
    });
which in turn calls normalizer(string), also from string-punctuation-tokenizer.
@elsylambert
Fix is in translationCore 3.4.1 (5ed21f7)
Test by enabling pre-release and then selecting for download:
and then for en:
Start with a fully aligned Titus book: https://git.door43.org/photonomad1/en_migr_tit_book (it should already have the WA GL of en door43-catalog selected, and should be fully aligned with no invalidations).
When you change the GL to en unfoldingWord, all the capitalization changes and punctuation fixes should be migrated automatically.
Launch WA; after migration you should only see invalidations on these three verses:
One of the two καὶ was removed:
Word change:
Word change:
Then change the GL back to en door43-catalog and launch WA again. You should only see invalidations on the same three verses.
Looks good and works as expected (per the test instructions above) in translationCore 3.4.1 (https://github.com/unfoldingWord/translationCore/commit/5ed21f7da0dccdf7118bcf66129942ff68e92762). In the last step, when we change the GL back to en door43-catalog and launch WA again, I see invalidations on the same three verses.
Note:
The other changes were:
καὶ
σωφρονίζωσι to σωφρονίζουσιν
παλινγενεσίας to παλιγγενεσίας
Some of the SR Greek changes seen that break alignments and could be handled automatically are:
κατ’, where U+2019 ’ RIGHT SINGLE QUOTATION MARK was replaced with U+02BC ʼ MODIFIER LETTER APOSTROPHE
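Because the two apostrophe forms look nearly identical on screen, it can help to list the code points that differ between the old and new text. The sketch below is a hypothetical debugging aid (the names `toCodePointLabel` and `codePointDiff` are made up here), and it assumes both strings line up position by position:

```javascript
// Hypothetical debugging helpers; not part of any library named in this issue.

// Format a single character as a U+XXXX label.
function toCodePointLabel(ch) {
  return 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
}

// Compare two strings position by position (by code point, not UTF-16 unit)
// and report where they differ. A missing character is reported as null.
function codePointDiff(oldText, newText) {
  const oldChars = Array.from(oldText);
  const newChars = Array.from(newText);
  const length = Math.max(oldChars.length, newChars.length);
  const diffs = [];
  for (let i = 0; i < length; i++) {
    if (oldChars[i] !== newChars[i]) {
      diffs.push({
        index: i,
        from: oldChars[i] === undefined ? null : toCodePointLabel(oldChars[i]),
        to: newChars[i] === undefined ? null : toCodePointLabel(newChars[i]),
      });
    }
  }
  return diffs;
}
```

For the κατ’ case above, the only reported difference would be the final character, from U+2019 to U+02BC.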