charlesLoder / havarotjs

A Typescript package for getting syllabic data about Hebrew text with niqqud.
https://www.npmjs.com/package/havarotjs
MIT License

Make `isShureq` accurate #113

Closed m-yac closed 1 year ago

m-yac commented 1 year ago

Currently, isShureq only checks whether the cluster itself has a vowel, which incorrectly labels two kinds of clusters as shureqs: clusters with a vocal sheva (e.g. the third cluster of מְצַוְּךָ) and clusters preceded by a vowel (e.g. the second cluster of גֵּוּ).

This is important because if DAGESH_CHAZAQ is set to false in hebrew-transliteration, then isShureq is called to determine whether to transliterate the cluster as a SHUREQ. For example, currently:

transliterate("גֵּוּ", {DAGESH_CHAZAQ: false}) === "gēû"
transliterate("מְצַוְּךָּ", {DAGESH_CHAZAQ: false}) === "mǝṣaûǝkā"

where they should be "gēw" and "mǝṣawǝkā", respectively, to match their versions with DAGESH_CHAZAQ set to true (i.e. "gēww" and "mǝṣawwǝkā").
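
As a rough sketch of the stricter check (not the actual havarotjs implementation), a vav with a dagesh would count as a shureq only if it has no vowel and no sheva of its own and the cluster before it has no vowel either. The function name `looksLikeShureq` and its plain-string signature are hypothetical and chosen only for illustration; the real method works on cluster objects.

```typescript
// Hypothetical helper, NOT the havarotjs API: clusters are passed as plain strings.
const VAV = "\u05D5";
const DAGESH = "\u05BC";
const SHEVA = "\u05B0";
// Hataf segol (U+05B1) through qubuts (U+05BB), plus qamats qatan (U+05C7).
const VOWEL = /[\u05B1-\u05BB\u05C7]/;

function looksLikeShureq(cluster: string, previous: string | null): boolean {
  // A shureq is written as a vav carrying a dagesh-like dot.
  const isVavWithDagesh = cluster.startsWith(VAV) && cluster.includes(DAGESH);
  if (!isVavWithDagesh) return false;
  // A vav with its own vowel is a consonant (the check that already existed).
  if (VOWEL.test(cluster)) return false;
  // A vav with a sheva is a doubled consonant, as in the third cluster of מְצַוְּךָ.
  if (cluster.includes(SHEVA)) return false;
  // A vav preceded by a vowel is a consonant, as in the second cluster of גֵּוּ.
  if (previous !== null && VOWEL.test(previous)) return false;
  return true;
}

// Second cluster of גֵּוּ: the previous cluster carries a tsere.
console.log(looksLikeShureq("\u05D5\u05BC", "\u05D2\u05BC\u05B5")); // false
// Third cluster of מְצַוְּךָ: the vav carries a sheva.
console.log(looksLikeShureq("\u05D5\u05B0\u05BC", "\u05E6\u05B7")); // false
// Second cluster of קוּם: the previous cluster has no vowel.
console.log(looksLikeShureq("\u05D5\u05BC", "\u05E7")); // true
```

Passing the previous cluster explicitly keeps the sketch self-contained; word-initial וּ is handled by passing `null`.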

I also added some tests, with a few of the example words taken from these Wikipedia pages: [1] [2]