charlesLoder / havarotjs

A Typescript package for getting syllabic data about Hebrew text with niqqud.
https://www.npmjs.com/package/havarotjs
MIT License
12 stars 6 forks source link

Warn instead of throw if a syllable is preceding a Cluster with a Mater. #153

Closed julianwagle closed 8 months ago

julianwagle commented 9 months ago

Hi Charles! Thanks for all the incredible work. This is just a suggestion based on personal usage. תְִּירָא֑וּם ("tî·rā·’ūm" found in Deut.3.22) is currently throwing an error when attempting to parse it. Simply logging a warning instead of throwing an error seemed to resolve all issues.

Here are three links to use for reference:

https://www.sefaria.org/Deuteronomy.3.22?lang=bi&with=all&lang2=en

https://biblehub.com/interlinear/deuteronomy/3-22.htm

https://www.stepbible.org/?q=version=MapM|reference=Deu.3.22

charlesLoder commented 9 months ago

@julianwagle

Hey and thanks for making a PR! I'm always excited to know someone is using the tool.

This word has come up before, and it is certainly unique. This will be a bit of a brain dump for future reference.

The text

Needless to say, the text in Deut 3:22 is unusual; rarely is a hiriq and sheva found under the same letter.

Here'a twitter thread I have discussing it a bit.

Different texts have different pointings. Those that align mostly with the Leningrad Codex (see image of the word) have a sheva — תְִּירָא֑וּם.

Other texts like Miqra `al pi ha-Mesorah , which Sefaria uses as their base text, have a metheg instead of a sheva — תִּֽירָא֑וּם.

In short, this is an error in the Leningrad Codex, and instead of a sheva, it should be a metheg (or ga'ya) marking the secondary accent. It's a non-standard pattern

the code

The PR doesn't pass the tests because they assume it will throw an error, and cause some other checks to fail.

That being said, this PR does raise two thoughts:

  1. If strict is false, it should pass, but it doesn't. That needs to be fixed.
  2. Maybe an option can be passed like errorLevel (maybe a better name) with values of error, warn, and off for handling errors.

The second would be more involved, but for the first, these changes seem to work:

if (nxt instanceof Syllable) {
  const word = arr.map((i) => i.text).join("");
  if (strict) {
    throw new Error(`Syllable ${nxt.text} should not precede a Cluster with a Mater in ${word}`);
  } else {
    syl.unshift(...nxt.clusters);
  }
} else {
  syl.unshift(nxt);
}

If you're willing, you can incorporate that code above and write some tests.

Thanks for your contribution!