unfoldingWord / translationCore

Repository for the desktop application translationCore
https://www.translationcore.com

On change of original language in WA tool, do automatic alignment mapping for capitalization changes and accent fixes/normalization. #7471

Closed PhotoNomad0 closed 9 months ago

PhotoNomad0 commented 1 year ago

Some of the SR Greek changes seen that break alignments and could be automatically handled are:

PhotoNomad0 commented 1 year ago

Notes:

scripture-resource-rcl calls tokenize from string-punctuation-tokenizer like this for normalization:

  return tokenize({
    text: text,
    greedy: true,
    normalize: true,
  });

which in turn calls normalizer(string); from string-punctuation-tokenizer
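
To illustrate the kind of matching this issue asks for, here is a minimal sketch of a comparison key that ignores capitalization and accent differences. It is not the code in translationCore or string-punctuation-tokenizer. The names matchKey and sameWord are hypothetical, and stripping all combining marks is an assumption about what "accent fixes/normalization" could mean.

```javascript
// Hypothetical helpers for illustration only; not part of translationCore
// or string-punctuation-tokenizer.

// Build a comparison key that ignores case and diacritics.
function matchKey(word) {
  return word
    .normalize('NFD')        // decompose into base letters + combining marks
    .replace(/\p{M}/gu, '')  // drop combining marks (accents, breathings)
    .toLowerCase();          // ignore capitalization
}

// Two words count as "the same" when their keys are equal.
function sameWord(a, b) {
  return matchKey(a) === matchKey(b);
}

console.log(sameWord('Καὶ', 'καὶ')); // capitalization change only
console.log(sameWord('καί', 'καὶ')); // acute vs. grave accent
console.log(sameWord('καὶ', 'δὲ'));  // genuinely different word
```

The first two comparisons match and the third does not. A stricter variant could keep the accents and apply only Unicode normalization plus lowercasing, if accent changes should still invalidate an alignment.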

PhotoNomad0 commented 1 year ago

@elsylambert

Fix is in translationCore 3.4.1 (5ed21f7)

Test by enabling pre-release and then selecting for download:

Screenshot 2023-01-06 at 7 48 50 PM

and then for en:

Screenshot 2023-01-06 at 7 49 23 PM

Start with a fully aligned Titus book https://git.door43.org/photonomad1/en_migr_tit_book (it should already have the WA GL of en door43-catalog selected, and should be fully aligned with no invalidations).

When you change the GL to en unfoldingWord, all the capitalization changes and punctuation fixes should be migrated automatically.

Launch WA. After migration you should only see invalidations on these three verses:

One of the two καὶ was removed: Screenshot 2023-01-10 at 9 51 21 AM

Word change: Screenshot 2023-01-10 at 9 51 40 AM

Word change: Screenshot 2023-01-10 at 9 52 06 AM

Then change the GL back to en door43-catalog and launch WA again. You should only see invalidations on the same three verses.
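
The expected outcome of this test can be sketched as a per-verse check: a verse migrates automatically when its original-language words differ only in case or accents, and is invalidated when a word is removed or replaced. This is an illustrative assumption, not the actual migration logic in translationCore 3.4.1. The names matchKey and canAutoMigrate are hypothetical, the position-by-position comparison is a simplification, and the Greek words are made-up sample data, not taken from Titus.

```javascript
// Hypothetical sketch; names and logic are illustrative assumptions,
// not the translationCore implementation.

// Comparison key that ignores case and diacritics.
function matchKey(word) {
  return word.normalize('NFD').replace(/\p{M}/gu, '').toLowerCase();
}

// True when the new verse has the same words in the same order,
// ignoring case and accent differences; false means the verse
// would need to be flagged as invalidated.
function canAutoMigrate(oldWords, newWords) {
  if (oldWords.length !== newWords.length) return false;
  return oldWords.every((word, i) => matchKey(word) === matchKey(newWords[i]));
}

// Capitalization change only: migrates automatically.
console.log(canAutoMigrate(['Καὶ', 'λόγος'], ['καὶ', 'λόγος']));
// One of two καὶ removed: invalidated.
console.log(canAutoMigrate(['καὶ', 'λόγος', 'καὶ'], ['καὶ', 'λόγος']));
// Word change: invalidated.
console.log(canAutoMigrate(['λόγος'], ['ῥῆμα']));
```

Because the key is symmetric, the same check gives the same verdict in both directions, which is consistent with seeing the same three invalidated verses after switching the GL back.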

elsylambert commented 1 year ago

Looks good and works as expected (as per the test instructions above) in translationCore 3.4.1 (https://github.com/unfoldingWord/translationCore/commit/5ed21f7da0dccdf7118bcf66129942ff68e92762). In the last step, when we change the GL back to en door43-catalog and launch WA again, I see invalidations on the same three verses.

PhotoNomad0 commented 1 year ago

Note:

The other changes were: