charlesLoder / havarotjs

A Typescript package for getting syllabic data about Hebrew text with niqqud.
https://www.npmjs.com/package/havarotjs
MIT License
12 stars 6 forks

Option for Modern Hebrew Syllabification #4

Open charlesLoder opened 4 years ago

charlesLoder commented 4 years ago

Currently, havarot syllabifies words according to Traditional (i.e. Sephardic) or Tiberian rules. The ability to syllabify words according to general Modern Hebrew pronunciation would be beneficial, especially for augmenting transliteration schemas that follow Modern Hebrew.

Differences

Syllable Properties

Syllable.medial

Issue #2 proposes introducing more linguistic properties to syllables. Modern Hebrew differs in its syllable properties.

A medial property would need to be included:

Syllable.medial: string | null

Modern Hebrew allows for syllable types of CCV and CCVC.

E.g. גְּדֹולִים is realized as [gdo.ˈlim]
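As a rough sketch of how a medial slot could describe the CCV and CCVC types above: the `SyllableParts` interface and `syllableType` helper below are hypothetical and are not part of havarotjs. The sketch assumes `medial` holds the consonant that sits between the onset and the nucleus.

```typescript
// Hypothetical shape for illustration only; not havarotjs's actual Syllable class.
interface SyllableParts {
  onset: string | null;
  // Assumed meaning: the second consonant of a CC cluster, e.g. the "d" in [gdo].
  medial: string | null;
  nucleus: string;
  coda: string | null;
}

// Builds a C/V skeleton such as "CCV" or "CVC" from the filled slots.
function syllableType(s: SyllableParts): string {
  return (s.onset ? "C" : "") + (s.medial ? "C" : "") + "V" + (s.coda ? "C" : "");
}

// The two syllables of [gdo.ˈlim] from the example above.
const gdo: SyllableParts = { onset: "g", medial: "d", nucleus: "o", coda: null };
const lim: SyllableParts = { onset: "l", medial: null, nucleus: "i", coda: "m" };

console.log(syllableType(gdo)); // "CCV"
console.log(syllableType(lim)); // "CVC"
```

Keeping `medial` nullable means existing CV and CVC syllables would simply carry `null` there.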

Syllable.onset

For syllables beginning with א, ע, or ה, the onset can be realized as null, though orthographically these letters still function as an onset.
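One way to keep the orthographic onset and its realization apart is sketched below. The `realizedOnset` helper is hypothetical, and it simplifies "can be realized as null" to "is always null", which a real implementation would probably make configurable. It also assumes the onset is passed as a bare consonant without niqqud.

```typescript
// Letters whose onset can be realized as null in Modern Hebrew (per the note above).
const SILENT_ONSETS: ReadonlySet<string> = new Set(["א", "ע", "ה"]);

// Hypothetical helper: returns null when the orthographic onset has no realization.
// Simplification: treats these letters as always silent.
function realizedOnset(orthographicOnset: string): string | null {
  return SILENT_ONSETS.has(orthographicOnset) ? null : orthographicOnset;
}

console.log(realizedOnset("א")); // null
console.log(realizedOnset("ג")); // "ג"
```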

Realization of Shewa

In Biblical Hebrew reading traditions, the shewa is often vocalic, but in Modern Hebrew it is often realized as a zero vowel [Ø] (Coffin and Bolozky, A Reference Grammar of Modern Hebrew, 22), creating syllables of the CCV or CCVC types (see above).

A word-initial (maybe syllable-initial) shewa is most commonly realized as vocalic when (1) its onset is a י, ל, מ, נ, or ר, or (2) the second letter is א, ה, or ע.

Example of (1):

Example of (2):

A shewa preceded by a shewa is typically vocalic as well, just as in Tiberian, but not necessarily so.
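The two word-initial conditions could be sketched as a small predicate. The `isInitialShewaVocalic` function is hypothetical and not part of havarotjs. It takes bare consonants (no niqqud), reads "second letter" as the consonant following the shewa-bearing one, and ignores the shewa-after-shewa case because that one is only a tendency.

```typescript
// Condition (1): onsets that tend to keep a word-initial shewa vocalic.
const VOCALIC_ONSETS: ReadonlySet<string> = new Set(["י", "ל", "מ", "נ", "ר"]);
// Condition (2): second letters that tend to make the preceding shewa vocalic.
const VOCALIC_SECOND_LETTERS: ReadonlySet<string> = new Set(["א", "ה", "ע"]);

// Hypothetical helper: true when a word-initial shewa is likely vocalic
// under the two conditions described above; false means a likely zero vowel.
function isInitialShewaVocalic(onset: string, secondLetter: string): boolean {
  return VOCALIC_ONSETS.has(onset) || VOCALIC_SECOND_LETTERS.has(secondLetter);
}

console.log(isInitialShewaVocalic("ג", "ד")); // false, as in [gdo.ˈlim]
console.log(isInitialShewaVocalic("ל", "ב")); // true, condition (1)
console.log(isInitialShewaVocalic("ב", "ע")); // true, condition (2)
```

Since these are described as the most common cases rather than strict rules, a real option would likely need an override for exceptions.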

charlesLoder commented 1 month ago

The structure() method returns [string, string, string]. To implement this, we may want to change the output to an object. That would technically be a breaking change, but this is still v0, so I guess it's fine.
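A minimal sketch of what an object-shaped result might look like, with a converter from the current tuple. The names `StructureObject` and `toStructureObject` are hypothetical, and the tuple order (onset, nucleus, coda) is an assumption here rather than something stated in this thread.

```typescript
// Current return type of structure(), per the comment above.
// Assumed order: [onset, nucleus, coda].
type StructureTuple = [string, string, string];

// Hypothetical object form that leaves room for a medial slot.
interface StructureObject {
  onset: string;
  medial: string;
  nucleus: string;
  coda: string;
}

// Converts the tuple form to the object form; medial defaults to empty.
function toStructureObject(
  [onset, nucleus, coda]: StructureTuple,
  medial: string = ""
): StructureObject {
  return { onset, medial, nucleus, coda };
}

const first = toStructureObject(["g", "o", ""], "d");
console.log(first); // { onset: "g", medial: "d", nucleus: "o", coda: "" }
```

An object avoids having to insert a fourth positional element into the tuple, which would shift the indices callers rely on.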