charlesLoder / hebrew-transliteration

A tool for transliterating Hebrew
https://www.npmjs.com/package/hebrew-transliteration
MIT License

Suggestion: DAGESH_CHAZAQ character addition #41

Closed: asherlporetz closed this issue 1 year ago

asherlporetz commented 1 year ago

Is it possible to overload DAGESH_CHAZAQ to accept a character, such as a combining circumflex? It would be applied to any dagesh forte (but not lene), and to mappiq he as well. Thank you.

asherlporetz commented 1 year ago

Also for STRESS_MARKER, maybe add a field to specify not to add the mark if it's at the expected/default location which is the last syllable. "always": false, for the default. "always": true, for the current behavior.

charlesLoder commented 1 year ago

Thanks for the ideas! I recently had a new addition to my family, so work on this project has slowed.

> Is it possible to overload DAGESH_CHAZAQ to accept a character, such as a combining circumflex? It would be applied to any dagesh forte (but not lene), and to mappiq he as well. Thank you.

That could be done.

As an example:

// "\u0302" is U+0302 COMBINING CIRCUMFLEX ACCENT
transliterate("שַׁבָּת", { DAGESH_CHAZAQ: "\u0302" });

// šab̂āt

If that's not what you're suggesting, please post a sample.
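
The proposed option could be modeled roughly as follows. This is only a sketch: `applyDageshChazaq` is a hypothetical helper, not part of hebrew-transliteration, and it assumes that a boolean `true` doubles the consonant while a string value is appended to it.

```javascript
// Hypothetical helper, not part of hebrew-transliteration: models how a
// DAGESH_CHAZAQ option could be interpreted for one transliterated consonant.
function applyDageshChazaq(consonant, dageshChazaq) {
  if (typeof dageshChazaq === "string") {
    // Proposed behavior: append the given character (e.g. a combining mark).
    return consonant + dageshChazaq;
  }
  // Assumed boolean behavior: true doubles the consonant, false leaves it alone.
  return dageshChazaq ? consonant + consonant : consonant;
}

const COMBINING_CIRCUMFLEX = "\u0302";

console.log(applyDageshChazaq("b", true)); // bb
console.log(applyDageshChazaq("b", false)); // b
console.log(applyDageshChazaq("b", COMBINING_CIRCUMFLEX)); // b followed by U+0302
```

Keeping the string case separate from the boolean case means existing schemas that set `DAGESH_CHAZAQ: true` would be unaffected.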


> Also for STRESS_MARKER, maybe add a field to specify not to add the mark if it's at the expected/default location which is the last syllable. "always": false, for the default. "always": true, for the current behavior.

Is this what you're suggesting:

transliterate("שַׁבָּת אֶרֶץ", {
  STRESS_MARKER: {
    location: "after-vowel",
    mark: "\u0301",
    always: false // proposed option
  }
});

// šabbāt ʾéreṣ
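
The `always: false` logic could look something like the sketch below. It is not the library's implementation: `markStress` and `addMarkAfterFirstVowel` are hypothetical names, the words are supplied already split into syllables with a known stress index, and "after-vowel" is assumed to mean "after the first vowel letter and any marks already on it".

```javascript
// Hypothetical sketch, not the library's implementation. Each word is given
// as pre-split syllables plus the index of its stressed syllable.
function markStress(syllables, stressedIndex, { mark, always }) {
  const isFinal = stressedIndex === syllables.length - 1;
  if (!always && isFinal) {
    // Stress is in the default (final) position: leave the word unmarked.
    return syllables.join("").normalize("NFC");
  }
  return syllables
    .map((syl, i) => (i === stressedIndex ? addMarkAfterFirstVowel(syl, mark) : syl))
    .join("")
    .normalize("NFC");
}

// Decomposes the syllable, then inserts the mark after the first vowel letter
// and any combining marks already attached to it.
function addMarkAfterFirstVowel(syllable, mark) {
  return syllable
    .normalize("NFD")
    .replace(/[aeiou][\u0300-\u036f]*/, (vowel) => vowel + mark);
}

const options = { mark: "\u0301", always: false };
console.log(markStress(["šab", "bāt"], 1, options)); // final stress: unmarked
console.log(markStress(["ʾe", "reṣ"], 0, options)); // penultimate stress: marked
```

With `always: true`, the first call would mark the final syllable as well, which matches the current behavior described above.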

asherlporetz commented 1 year ago

@charlesLoder exactly what I was trying to convey. Thank you.

BillMeyerRSA commented 1 year ago

That would be helpful. It might be problematic at times with combining diacritics when a letter is transliterated with two characters, e.g. sh for SHIN or tz for TSADI.

charlesLoder commented 1 year ago

> That would be helpful. It might be problematic at times with combining diacritics when a letter is transliterated with two characters, e.g. sh for SHIN or tz for TSADI.

That's a good point, but unfortunately unavoidable.

For the SBL Simple schema, controlling how a shin/tsadi with a dagesh chazaq is rendered requires an additional feature; in the entries below, each cluster is mapped to a single sh/ts.

  ADDITIONAL_FEATURES: [
    {
      FEATURE: "cluster",
      HEBREW: "\u{05E9}\u{05C1}\u{05BC}",
      TRANSLITERATION: "sh"
    },
    {
      FEATURE: "cluster",
      HEBREW: "\u{05E6}\u{05BC}",
      TRANSLITERATION: "ts"
    }
  ],

So if a user wanted to use an acute accent as a dagesh marker and use a digraph for shin/tsadi, it would have to look something like:

  DAGESH_CHAZAQ: "\u0301",
  ADDITIONAL_FEATURES: [
    {
      FEATURE: "cluster",
      HEBREW: "\u{05E9}\u{05C1}\u{05BC}",
      TRANSLITERATION: "śh" // or whatever
    },
    {
      FEATURE: "cluster",
      HEBREW: "\u{05E6}\u{05BC}",
      TRANSLITERATION: "tś" // or whatever
    }
  ],

That is some duplication of work, but a schema assumes a one-to-one correspondence between Hebrew and transliteration.
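
The digraph problem, and why explicit cluster entries are needed, can be illustrated with plain strings. This is a sketch only: `clusterOverrides` and `renderGeminated` are hypothetical names, not the library's API, standing in for the idea that an explicit cluster entry takes precedence over the generic marker.

```javascript
const ACUTE = "\u0301";

// A combining mark attaches to the character immediately before it, so
// appending it to a digraph marks only the second letter.
const naive = "sh" + ACUTE; // s, h, U+0301: the accent sits on the h

// Hypothetical lookup (not the library's API): explicit outputs for
// geminated digraphs, consulted before the generic marker is applied.
const clusterOverrides = new Map([
  ["sh", "s" + ACUTE + "h"], // accent on the s, as in "śh"
  ["ts", "ts" + ACUTE] // accent on the s, as in "tś"
]);

function renderGeminated(translit, marker) {
  const out = clusterOverrides.get(translit) ?? translit + marker;
  // NFC composes s + U+0301 into the single code point U+015B where possible.
  return out.normalize("NFC");
}

console.log(renderGeminated("sh", ACUTE)); // override: accent on the s
console.log(renderGeminated("ts", ACUTE)); // override: accent on the s
console.log(renderGeminated("b", ACUTE)); // generic: b followed by U+0301
```

Single-letter transliterations fall through to the generic marker, so only the digraphs need their own entries.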