Order of diacritic signs: NUKTA before or after VOWEL ?

tfbf / Bible-Punjabi-Pavitr-Bible-1945

Order of diacritic signs: NUKTA before or after VOWEL ? #89

Open DavidHaslam opened 7 years ago

DavidHaslam commented 7 years ago

In most cases, where a Gurmukhi glyph contains a NUKTA sign, the vowel sign (if present) comes after the NUKTA.

There are two instances where the vowel comes before the NUKTA.

Unicode Normalization does not change the order of these diacritics.
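A minimal sketch of why Normalization leaves both orders alone, using Python's standard `unicodedata` module. The letter JA and the vowel sign AA are chosen only for illustration, and the helper name `is_stable` is hypothetical. Canonical reordering only swaps adjacent marks that both have a non-zero combining class; the Gurmukhi vowel signs have class 0, so the NUKTA cannot be moved across them.

```python
import unicodedata

JA = "\u0A1C"        # GURMUKHI LETTER JA
NUKTA = "\u0A3C"     # GURMUKHI SIGN NUKTA
VOWEL_AA = "\u0A3E"  # GURMUKHI VOWEL SIGN AA

nukta_then_vowel = JA + NUKTA + VOWEL_AA  # the usual order
vowel_then_nukta = JA + VOWEL_AA + NUKTA  # the order of the two exceptions


def is_stable(text):
    """Return True if no Unicode normalization form changes the text."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return all(unicodedata.normalize(form, text) == text for form in forms)


# Canonical combining classes of the two signs.
print(unicodedata.combining(NUKTA), unicodedata.combining(VOWEL_AA))
# Both orders survive every normalization form unchanged.
print(is_stable(nukta_then_vowel), is_stable(vowel_then_nukta))
```

Because both sequences are stable, any harmonisation of the order has to be done by an explicit edit of the text rather than by normalizing it.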

Even so, would it be sensible to change these two exceptions so that the NUKTA comes immediately after the letter? That would ensure they are not missed by a related search merely because of their peculiar order. NB. Such a change does alter the appearance of the glyphs: the NUKTA dot moves leftwards.

Though there are no instances of the former, there are already instances of the latter elsewhere in the text:

Further observations:

The ADDAK, BINDI or TIPPI sign normally comes after the vowel sign if both are present.
The NUKTA normally comes before the ADDAK, BINDI, TIPPI or VIRAMA if both signs are in the glyph.
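The observations above can be turned into a rough checker. This is only a sketch: the three-level ranking (NUKTA, then vowel sign or VIRAMA, then ADDAK, BINDI or TIPPI) is an assumption read off the two observations, not a rule taken from the Unicode Standard, and the names `rank` and `find_unusual_order` are hypothetical.

```python
import unicodedata

NUKTA = "\u0A3C"
VIRAMA = "\u0A4D"
# Gurmukhi dependent vowel signs: AA, I, II, U, UU, EE, AI, OO, AU.
VOWEL_SIGNS = "\u0A3E\u0A3F\u0A40\u0A41\u0A42\u0A47\u0A48\u0A4B\u0A4C"
# BINDI, TIPPI, ADDAK.
FINAL_SIGNS = "\u0A02\u0A70\u0A71"


def rank(ch):
    """Position of a sign in the assumed usual order, or None for other characters."""
    if ch == NUKTA:
        return 0
    if ch in VOWEL_SIGNS or ch == VIRAMA:
        return 1
    if ch in FINAL_SIGNS:
        return 2
    return None


def find_unusual_order(text):
    """Return (index, character name) for each sign that follows a higher-ranked sign."""
    hits = []
    previous = None
    for index, ch in enumerate(text):
        current = rank(ch)
        if current is not None and previous is not None and current < previous:
            hits.append((index, unicodedata.name(ch)))
        previous = current  # resets to None at any letter, space or punctuation
    return hits


# Letter A, letter JA, vowel sign AA, NUKTA, letter HA: vowel before NUKTA.
print(find_unusual_order("\u0A05\u0A1C\u0A3E\u0A3C\u0A39"))
```

Running such a check over the whole text would list every glyph whose signs depart from the usual order, which is one way to confirm that only two exceptions exist.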

DavidHaslam commented 7 years ago

The two exceptions are located as follows:

Psalm 42:9 which reads:

\v 9 ਪਰਮੇਸ਼ੁਰ ਨੂੰ ਜੋ ਮੇਰੀ ਚਟਾਨ ਹੈ ਮੈ ਆਖਾਂਗਾ , ਤੂੰ ਮੈਨੂੰ ਕਿਉਂ ਭੁੱਲ ਗਿਆ ਹੈਂ ? ਮੈ ਕਿਉਂ ਵੈਰੀ ਦੇ ਅਨੇ਼ਰ ਦੇ ਮਾਰੇ ਵਿਰਲਾਪ ਕਰਦਾ ਫਿਰਦਾ ਹਾਂ ?

Zechariah 9:5 which reads:

\v 5 ਅਸ਼ਕਲੋਨ ਵੇਖੇਗਾ ਅਤੇ ਡਰ ਜਾਵੇਗਾ , ਨਾਲੇ ਅਜਾ਼ਹ ਵੀ ਕਿਉਂ ਜੋ ਉਹ ਨੂੰ ਡਾਢੀ ਪੀੜ ਲੱਗੇਗੀ , ਨਾਲੇ ਅਕਰੋਨ ਵੀ ਕਿਉਂ ਜੋ ਉਹ ਦਾ ਭਰੋਸਾ ਸ਼ਰਮਿੰਦਾ ਹੋ ਜਾਵੇਗਾ ,ਅੱਜਾਹ ਵਿੱਚੋਂ ਰਾਜਾ ਮਿਟ ਜਾਵੇਗਾ, ਅਸ਼ਕਲੋਨ ਬੈ ਆਬਾਦ ਹੋ ਜਾਵੇਗਾ |

Both locations should be checked in the PDF file in order to review my suggested change.
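If the change is accepted, it could be applied mechanically. The following is a sketch under the assumption that only a single vowel sign separates the letter from the NUKTA, as in the two exceptions; the function name `move_nukta_before_vowel` is hypothetical, and the sample word is written as escapes following the order described in this issue.

```python
import re

NUKTA = "\u0A3C"
# Gurmukhi dependent vowel signs: AA, I, II, U, UU, EE, AI, OO, AU.
VOWEL_SIGNS = "\u0A3E\u0A3F\u0A40\u0A41\u0A42\u0A47\u0A48\u0A4B\u0A4C"

# A vowel sign immediately followed by a NUKTA.
VOWEL_THEN_NUKTA = re.compile(f"([{VOWEL_SIGNS}]){NUKTA}")


def move_nukta_before_vowel(text):
    """Swap each 'vowel sign + NUKTA' pair so the NUKTA directly follows the letter."""
    return VOWEL_THEN_NUKTA.sub(lambda match: NUKTA + match.group(1), text)


# Word from Zechariah 9:5 as described above: A, JA, vowel sign AA, NUKTA, HA.
before = "\u0A05\u0A1C\u0A3E\u0A3C\u0A39"
after = move_nukta_before_vowel(before)
print(ascii(before), "->", ascii(after))
```

The substitution is idempotent, so running it a second time over already corrected text changes nothing. The rendered result should still be compared against the PDF, since the swap moves the dot.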

DavidHaslam commented 7 years ago

To put the two exceptions in context of the other glyphs that contain a NUKTA, here is the counted data:

These are the only letters found to have a NUKTA in a glyph (without Unicode Normalization). LA is shown as grey here because there were none in the Punjabi Bible text.

cf. These are the six composite characters that Normalization converts to a letter and a separate NUKTA.

One might be tempted to conclude that the Gurmukhi block is short of 4 composite characters. On the other hand, the very low counts in the corresponding cells might suggest that each of these instances should also be reviewed.
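For reference, both the list of composite characters and the per-letter counts can be regenerated with Python's `unicodedata` module. This is a sketch, not the method used to produce the data above; the function names are hypothetical. Note that `count_nukta_bases` counts whatever character immediately precedes each NUKTA, so a vowel-first exception shows up under its vowel sign rather than its letter.

```python
import unicodedata
from collections import Counter

NUKTA = "\u0A3C"


def precomposed_nukta_letters():
    """Map each Gurmukhi character that decomposes to 'letter + NUKTA' to its base letter."""
    found = {}
    for code_point in range(0x0A00, 0x0A80):  # the Gurmukhi block
        parts = unicodedata.decomposition(chr(code_point)).split()
        if len(parts) == 2 and parts[1] == "0A3C":
            found[chr(code_point)] = chr(int(parts[0], 16))
    return found


def count_nukta_bases(text):
    """Count the character immediately before each NUKTA, after decomposing the text."""
    decomposed = unicodedata.normalize("NFD", text)
    pairs = zip(decomposed, decomposed[1:])
    return Counter(previous for previous, ch in pairs if ch == NUKTA)


for composite, base in precomposed_nukta_letters().items():
    print(f"U+{ord(composite):04X} = U+{ord(base):04X} + U+0A3C")
```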

DavidHaslam commented 7 years ago

cf. Some other Unicode scripts do have a canonical order for the diacritics.

e.g. Biblical Hebrew, but even so, there is a known issue with Normalization that is described on page 9 of the SBL Hebrew Font Manual.
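To illustrate the contrast with Gurmukhi, here is a small sketch using Python's `unicodedata` module. The letter BET with DAGESH and PATAH is an arbitrary example chosen for illustration. The Hebrew points have distinct non-zero combining classes, so Normalization sorts them into one canonical order regardless of how they were typed.

```python
import unicodedata

BET = "\u05D1"     # HEBREW LETTER BET
DAGESH = "\u05BC"  # HEBREW POINT DAGESH OR MAPIQ
PATAH = "\u05B7"   # HEBREW POINT PATAH

typed = BET + DAGESH + PATAH
normalized = unicodedata.normalize("NFD", typed)

# Combining classes decide the canonical order: lower classes sort first.
print(unicodedata.combining(PATAH), unicodedata.combining(DAGESH))
print([f"U+{ord(ch):04X}" for ch in typed])
print([f"U+{ord(ch):04X}" for ch in normalized])
```

In the normalized string the PATAH precedes the DAGESH, whereas the Gurmukhi sequences discussed in this issue pass through Normalization untouched.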

DavidHaslam commented 7 years ago