charlesLoder / havarotjs

A Typescript package for getting syllabic data about Hebrew text with niqqud.
https://www.npmjs.com/package/havarotjs
MIT License

Jerusalem: phantom yod #165

Closed charlesLoder closed 4 months ago

charlesLoder commented 4 months ago

See comment from @bdenckla

Regarding yerushalayim and yerushalaymah, you have the opportunity to do something I never had the guts to do, which is to introduce the notion of a "phantom yod" to hold those orphan vowel marks (ḥiriq or sheva). You can get rid of about 600 cases of two vowels on a single letter that way. At the cost of introducing this "phantom yod" abstraction of course. But it might be a good trade-off.

Here's a cool feature that would pretty much just "fall out" of this representation for free: the option to show this ketiv/qere explicitly instead of implicitly. See, for example, the treatment of yerushalayim in the recent JPS commentary on Psalms 120-150, e.g.:

Ben, could you comment a little more on how you imagine that would work?

Some of my stray thoughts:

bdenckla commented 4 months ago

There may be no need for my "phantom yod" idea if there is a general feature that allows a pointed qere (.text) to sometimes be present, when it needs to differ from the pointed ketiv (.original). (Or, equivalently, .text is always present, but only sometimes differs from .original.)

If you had such a feature, I imagine it would apply to not only the 600 or so cases of Jerusalem-related words, but also perhaps to other tricky cases we've discussed elsewhere:

On the other hand, the generality of this .text/.original representation can be viewed as a weakness not a strength. The weakness is that it doesn't explicitly represent the difference between the ketiv and the qere. Of course, that difference can be automatically derived. But if a client of the API wants to highlight (literally or metaphorically) the difference between ketiv and qere, it would be convenient for the client to not have to derive the diff itself.
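
A minimal sketch of how a client could derive that diff itself, assuming it already has the pointed ketiv and pointed qere as two plain strings. The function name `diffKetivQere` and the shape of its result are hypothetical and not part of havarotjs; it trims the shared prefix and suffix and reports what is left on each side.

```typescript
// Hypothetical helper: not part of the havarotjs API.
interface KetivQereDiff {
  prefix: string;    // characters shared at the start
  ketivOnly: string; // characters present only in the ketiv
  qereOnly: string;  // characters present only in the qere
  suffix: string;    // characters shared at the end
}

function diffKetivQere(ketiv: string, qere: string): KetivQereDiff {
  const k = Array.from(ketiv);
  const q = Array.from(qere);

  // Length of the common prefix.
  let start = 0;
  while (start < k.length && start < q.length && k[start] === q[start]) {
    start++;
  }

  // Length of the common suffix, never overlapping the prefix.
  let end = 0;
  while (
    end < k.length - start &&
    end < q.length - start &&
    k[k.length - 1 - end] === q[q.length - 1 - end]
  ) {
    end++;
  }

  return {
    prefix: k.slice(0, start).join(""),
    ketivOnly: k.slice(start, k.length - end).join(""),
    qereOnly: q.slice(start, q.length - end).join(""),
    suffix: k.slice(k.length - end).join("")
  };
}

// lamed + patah + hiriq + final mem (ketiv) vs. the same with a yod inserted (qere)
const diff = diffKetivQere("\u05DC\u05B7\u05B4\u05DD", "\u05DC\u05B7\u05D9\u05B4\u05DD");
console.log(diff.qereOnly === "\u05D9"); // true: the only difference is the yod
```

A diff like this is enough to find the phantom yod, but every client would have to compute it again, which is the inconvenience described above.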

For instance, it might be cool to "call out" the phantom yod in transliteration by making whatever letter represents it (probably "y") gray-colored or something. Sort of the opposite of literal highlighting (backgrounding rather than foregrounding) but you get what I mean. Or it might be cool to make the Hebrew yod gray, although that's pretty hard to do since it is difficult to control the color of a letter independently from its diacritics.

bdenckla commented 4 months ago

Another thing you may want to consider is whether you want to provide some functionality to help with the (dreaded) superimposed representation of dually-cantillated words. In a way these words are distant cousins of the "implicit ketiv/qere" words I discussed above, if you "buy" the following analogy:

The words with superimposed cantillation include a few that are sort of the opposite of that weird haladonai Deut. 32:6 case I mentioned above: what looks like a single chanted word in the input becomes more than one chanted word in the individual outputs. E.g. what looks like a single chanted word, לֹֽ֣א־יִהְיֶ֥͏ֽה־לְךָ֛֩, becomes the following:

charlesLoder commented 4 months ago

Ok returning to this after going down a rabbit hole with the other issue.

There may be no need for my "phantom yod" idea if there is a general feature that allows a pointed qere (.text) to sometimes be present, when it needs to differ from the pointed ketiv (.original). (Or, equivalently, .text is always present, but only sometimes differs from .original.)

The difference between .original and .text is not in terms of ketiv/qere, but rather in how this package handles characters for syllabification.

Example:

import { Text } from "havarotjs";

const text = new Text("חָפְנִי֙");
console.log(text.text === text.original);
// false, because `.text` contains a qamets qatan character whereas `.original` does not

For the Divine Name, I don't syllabify it so the .text and .original are the same, but I do have a .isDivineName prop for it.

I do want to home in on how to handle the implicit ketiv/qeres for the Divine Name, Jerusalem, and hi'.

Perhaps a property that allows a user to pass in a ketiv and set a qere could be helpful.

Example:

const text = new Text("הִ֖וא בֵּֽית־אֵ֑ל ה֖וּא וְכׇל־הָעָ֥ם אֲשֶׁר־עִמּֽוֹ׃", {
    ketivQeres: [
        {
            input: "הִוא",
            output: "הִ֖וא",
            ignoreTaamim: true // not sure about this setting
        }
    ]    
});

text.words[0].original // הִ֖וא
text.words[0].text // הִ֖וא
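
One way such an option could behave is sketched below. It is an assumption about the proposal, not how havarotjs implements it: `KetivQere`, `stripTaamim`, and `applyKetivQeres` are hypothetical names, and the qere string in the usage lines (hi' spelled with a yod) is only illustrative.

```typescript
// Hypothetical shape of one ketiv/qere rule, mirroring the option proposed above.
interface KetivQere {
  input: string;          // the ketiv as it appears in the text
  output: string;         // the qere to use in its place
  ignoreTaamim?: boolean; // compare without cantillation marks
}

// Hebrew accents (taamim) occupy U+0591 through U+05AF.
const TAAMIM = /[\u0591-\u05AF]/g;

function stripTaamim(word: string): string {
  return word.replace(TAAMIM, "");
}

// Returns the untouched word as `original` and the qere (if a rule matches) as `text`.
function applyKetivQeres(
  word: string,
  rules: KetivQere[]
): { original: string; text: string } {
  for (const rule of rules) {
    const candidate = rule.ignoreTaamim ? stripTaamim(word) : word;
    if (candidate === rule.input) {
      return { original: word, text: rule.output };
    }
  }
  return { original: word, text: word };
}

// he + hiriq + tipcha + vav + alef, matched against a rule written without taamim
const ketiv = "\u05D4\u05B4\u0596\u05D5\u05D0";
const rules: KetivQere[] = [
  { input: "\u05D4\u05B4\u05D5\u05D0", output: "\u05D4\u05B4\u05D9\u05D0", ignoreTaamim: true }
];
const result = applyKetivQeres(ketiv, rules);
console.log(result.original === ketiv, result.text === rules[0].output);
```

In this sketch a matched word loses its taamim in `.text` because the rule's `output` is used verbatim; whether the accents should be carried over is exactly the open question behind `ignoreTaamim`.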

And maybe Jerusalem (and its inflected variants) and hi' could be default ones.

I want the package to be flexible enough to handle everything, but I don't want to have to account for everything.

bdenckla commented 4 months ago

And maybe Jerusalem (and its inflected variants) and hi' could be default ones.

Yes, it would be nice to have some defaults for the common cases at least. Particularly for these common cases, the k/q is best described in some compact, generalized form (like a regular expression) because there are an unwieldy number of cases to describe explicitly (e.g. over 600 cases of Jerusalem-related words).
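
As a rough illustration of what such a compact form might look like for the Jerusalem words, here is a sketch that finds a lamed carrying two vowels (pataḥ or qamets, plus ḥiriq or sheva) and moves the second vowel onto an inserted yod. The function name `insertPhantomYod` is hypothetical, and the rule is deliberately simplified: it assumes only Jerusalem-related words have this two-vowel lamed and does not try to cover every accent placement.

```typescript
// A lamed (U+05DC) followed by its combining marks: accents, points, and the
// combining grapheme joiner (U+034F) sometimes used to preserve mark order.
const LAMED_CLUSTER =
  /\u05DC([\u0591-\u05BD\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7\u034F]*)/g;

// Hypothetical helper: not part of the havarotjs API.
function insertPhantomYod(word: string): string {
  return word.replace(LAMED_CLUSTER, (match: string, marks: string) => {
    const hasFirstVowel = /[\u05B7\u05B8]/.test(marks); // patah or qamets
    const secondVowel = marks.match(/[\u05B0\u05B4]/);  // sheva or hiriq
    if (!hasFirstVowel || !secondVowel) {
      return match; // an ordinary lamed: leave it alone
    }
    // Keep every other mark on the lamed; move the second vowel to a yod (U+05D9).
    const kept = marks.replace(secondVowel[0], "");
    return `\u05DC${kept}\u05D9${secondVowel[0]}`;
  });
}

// yerushalayim with patah and hiriq both on the lamed
const ketivForm =
  "\u05D9\u05B0\u05E8\u05D5\u05BC\u05E9\u05C1\u05B8\u05DC\u05B7\u05B4\u05DD";
console.log(insertPhantomYod(ketivForm).includes("\u05DC\u05B7\u05D9\u05B4\u05DD")); // true
```

Because the rule looks only at the lamed cluster, it covers the inflected variants without listing each one, which is the appeal of a regular-expression-style default.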

charlesLoder commented 4 months ago

Good thought! I can basically recreate something like the ADDITIONAL_FEATURES from the transliteration package.

charlesLoder commented 4 months ago

See new issue, closing this