Open DavidHaslam opened 7 years ago
The two exceptions are located as follows:
Psalm 42:9 which reads:
\v 9 ਪਰਮੇਸ਼ੁਰ ਨੂੰ ਜੋ ਮੇਰੀ ਚਟਾਨ ਹੈ ਮੈ ਆਖਾਂਗਾ , ਤੂੰ ਮੈਨੂੰ ਕਿਉਂ ਭੁੱਲ ਗਿਆ ਹੈਂ ? ਮੈ ਕਿਉਂ ਵੈਰੀ ਦੇ ਅਨੇ਼ਰ ਦੇ ਮਾਰੇ ਵਿਰਲਾਪ ਕਰਦਾ ਫਿਰਦਾ ਹਾਂ ?
Zechariah 9:5 which reads:
\v 5 ਅਸ਼ਕਲੋਨ ਵੇਖੇਗਾ ਅਤੇ ਡਰ ਜਾਵੇਗਾ , ਨਾਲੇ ਅਜਾ਼ਹ ਵੀ ਕਿਉਂ ਜੋ ਉਹ ਨੂੰ ਡਾਢੀ ਪੀੜ ਲੱਗੇਗੀ , ਨਾਲੇ ਅਕਰੋਨ ਵੀ ਕਿਉਂ ਜੋ ਉਹ ਦਾ ਭਰੋਸਾ ਸ਼ਰਮਿੰਦਾ ਹੋ ਜਾਵੇਗਾ ,ਅੱਜਾਹ ਵਿੱਚੋਂ ਰਾਜਾ ਮਿਟ ਜਾਵੇਗਾ, ਅਸ਼ਕਲੋਨ ਬੈ ਆਬਾਦ ਹੋ ਜਾਵੇਗਾ |
Both locations should be checked in the PDF file in order to review my suggested change.
To put the two exceptions in context of the other glyphs that contain a NUKTA, here is the counted data:
These are the only letters found to have a NUKTA in a glyph (without Unicode Normalization). LA is shown in grey here because there were no instances in the Punjabi Bible text.
cf. These are the six composite characters that Normalization converts to a letter and a separate NUKTA.
One might be tempted to conclude that the Gurmukhi block is short of 4 composite characters. On the other hand, the very low counts in the corresponding cells might suggest that each of these instances should also be reviewed.
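For anyone wanting to reproduce the decomposition of the six composite characters, here is a minimal Python sketch using the standard `unicodedata` module. The names `COMPOSITES` and `decompose` are only illustrative, and this is not the tool used to produce the counts above.

```python
import unicodedata

NUKTA = "\u0A3C"  # GURMUKHI SIGN NUKTA

# The six precomposed Gurmukhi letters that carry a NUKTA:
# LLA, SHA, KHHA, GHHA, ZA, FA
COMPOSITES = ["\u0A33", "\u0A36", "\u0A59", "\u0A5A", "\u0A5B", "\u0A5E"]

def decompose(ch):
    """Return the NFC form of ch; for these six it is base letter + NUKTA."""
    return unicodedata.normalize("NFC", ch)

for ch in COMPOSITES:
    parts = decompose(ch)
    codes = " ".join(f"U+{ord(c):04X}" for c in parts)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {codes}")
```

Even NFC leaves these as two code points, because the precomposed forms are excluded from recomposition.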
cf. Some other Unicode scripts do have a canonical order for the diacritics, e.g. Biblical Hebrew. Even so, there is a known issue with Normalization that is described on page 9 of the SBL Hebrew Font Manual.
See also issue #44
See also issue #109
The two suspect glyphs are confirmed as invalid, in that the SWORD filter's algorithmic transliteration by ICU also barfed at these locations.
Psalms 42:9: paramēśura nū jō mērī caṭāna hai mai ākhāṅgā , tū mainū ki'uṁ bhula gi'ā haiṁ ? mai ki'uṁ vairī dē anēra dē mārē viralāpa karadā phiradā hāṁ ?
Zechariah 9:5: aśakalōna vēkhēgā atē ḍara jāvēgā , nālē ajāha vī ki'uṁ jō uha nū ḍāḍhī pīṛa lagēgī , nālē akarōna vī ki'uṁ jō uha dā bharōsā śaramidā hō jāvēgā , ajāha vicōṁ rājā miṭa jāvēgā, aśakalōna bai ābāda hō jāvēgā .
In most cases, where a Gurmukhi glyph contains a NUKTA sign, the vowel sign (if present) comes after the NUKTA.
There are two instances where the vowel sign comes before the NUKTA.
Unicode Normalization does not change the order of these diacritics.
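A quick way to check this claim is sketched below in Python, using JA with the vowel sign AA as in the Zechariah 9:5 example. The helper `forms` is hypothetical. The order survives because the vowel sign has combining class 0, so canonical reordering cannot move the NUKTA (class 7) across it.

```python
import unicodedata

JA = "\u0A1C"       # GURMUKHI LETTER JA
NUKTA = "\u0A3C"    # GURMUKHI SIGN NUKTA, combining class 7
SIGN_AA = "\u0A3E"  # GURMUKHI VOWEL SIGN AA, combining class 0

nukta_first = JA + NUKTA + SIGN_AA  # the usual order
vowel_first = JA + SIGN_AA + NUKTA  # the order found in the two exceptions

def forms(s):
    """Return s under each of the four Unicode normalization forms."""
    return {f: unicodedata.normalize(f, s) for f in ("NFC", "NFD", "NFKC", "NFKD")}

for label, s in (("nukta first", nukta_first), ("vowel first", vowel_first)):
    unchanged = all(v == s for v in forms(s).values())
    print(label, "unchanged by normalization:", unchanged)
```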
Even so, would it be sensible to change these two exceptions to have the NUKTA immediately after the letter? This would ensure that they do not fall outside a related search because of their peculiar order. NB. Such a change does alter the appearance of the glyphs: the NUKTA dot moves leftwards.
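If the change is agreed, something along the following lines could locate and reorder such sequences. This is only a sketch: the function names are hypothetical, and the pattern assumes that the Gurmukhi dependent vowel signs (U+0A3E to U+0A42, U+0A47, U+0A48, U+0A4B, U+0A4C) are the only marks that need to swap places with the NUKTA.

```python
import re

NUKTA = "\u0A3C"  # GURMUKHI SIGN NUKTA
# Gurmukhi dependent vowel signs, written as a regex character-class body
VOWEL_SIGNS = "\u0A3E-\u0A42\u0A47\u0A48\u0A4B\u0A4C"

# A vowel sign immediately followed by a NUKTA
VOWEL_THEN_NUKTA = re.compile(f"([{VOWEL_SIGNS}]){NUKTA}")

def find_vowel_before_nukta(text):
    """Return the offsets of vowel signs that are followed by a NUKTA."""
    return [m.start() for m in VOWEL_THEN_NUKTA.finditer(text)]

def move_nukta_before_vowel(text):
    """Rewrite vowel sign + NUKTA as NUKTA + vowel sign."""
    return VOWEL_THEN_NUKTA.sub(NUKTA + r"\1", text)

# The word from Zechariah 9:5: A, JA, sign AA, NUKTA, HA
word = "\u0A05\u0A1C\u0A3E\u0A3C\u0A39"
print(find_vowel_before_nukta(word))
print(move_nukta_before_vowel(word) == "\u0A05\u0A1C\u0A3C\u0A3E\u0A39")
```

Text already in the usual order is left untouched, so the replacement can safely be run over the whole file.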
Though there are no instances of the former, there are already instances of the latter elsewhere in the text:
Further observations: