Closed charlesLoder closed 4 months ago
Responding here rather than on Twitter/X. To investigate your questions about "accent multiplicity," I would suggest doing the investigation in the following order, up the different levels:
(Atoms differ only very slightly from maqaf compounds with respect to these matters. One small difference is that some two-part accents can spread across the atoms of a compound, but most are restricted to a single atom.)
Summarizing your responses and the responses of others so far, with regard to letters, here are some cases of multiple accents:
Adding some of my responses (not claiming that this is exhaustive):
If you consider gaʿya (meteg) an accent (debatable):
Thanks! I'll have to ruminate on this.
As for the api, I did a pretty big refactor of the Char object to make these higher level apis a little simpler.
For the Word, Syllable, and Cluster objects, I may refactor the api like this:
Current
.vowel returns the vowel character
.vowelName returns the partial Unicode name of the vowel character
.hasVowelName returns a boolean indicating whether the vowel name is in the object
New
.vowels returns an array of vowel characters (not really something that would happen, but doing it for consistency below)
.vowelNames returns an array of partial Unicode names of vowel characters
.taam returns the first taam character
.taamName returns the partial Unicode name of the first taam character
.hasTaamName returns a boolean indicating whether the taam name is in the object
.taamim (add alias of .taams to keep English plural consistency) returns an array of taam characters
.taamNames returns an array of the partial Unicode names of the taam characters
My thought is that the singular apis would be what most people anticipate (one vowel, one taam), and they would be backwards compatible. The plural apis would allow for more precision.
Would that api make sense to you?
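One way to read the proposal above is that each singular accessor is simply the first element of its plural counterpart. The sketch below illustrates that reading only; `ClusterSketch`, its constructor arguments, and the small `NAMES` table are hypothetical and are not the library's actual implementation.

```javascript
// Hypothetical stand-in for a cluster; not the library's real class.
// Only a handful of code points are named here, for illustration.
const NAMES = {
  "\u0591": "ETNAHTA",
  "\u059C": "GERESH",
  "\u05B7": "PATAH",
  "\u05B8": "QAMATS",
};

class ClusterSketch {
  constructor(vowels, taamim) {
    this.vowels = vowels; // array of vowel characters
    this.taamim = taamim; // array of taam characters
  }
  // Plural name accessors
  get vowelNames() { return this.vowels.map((c) => NAMES[c]); }
  get taamNames() { return this.taamim.map((c) => NAMES[c]); }
  // English-plural alias for taamim
  get taams() { return this.taamim; }
  // Singular accessors: first element, or null when there is none
  get vowel() { return this.vowels[0] ?? null; }
  get vowelName() { return this.vowelNames[0] ?? null; }
  get taam() { return this.taamim[0] ?? null; }
  get taamName() { return this.taamNames[0] ?? null; }
  hasVowelName(name) { return this.vowelNames.includes(name); }
  hasTaamName(name) { return this.taamNames.includes(name); }
}

// A letter with two vowels and two taamim
const tav = new ClusterSketch(["\u05B8", "\u05B7"], ["\u0591", "\u059C"]);
console.log(tav.vowelName, tav.vowelNames);
console.log(tav.taamName, tav.hasTaamName("GERESH"));
```

Under this reading, existing callers of the singular accessors keep getting one value, while callers that care about multiplicity use the arrays.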
I still want to think about how to handle the meteg/gaya. I could give the meteg its own api (.hasMeteg, etc.). Ditto for the masora circle. They're not accents, but they operate in a liminal space.
I didn't add Masora circle to the taamim:
But I also don't have the meteg character anywhere. So gotta add that!
For units larger than the cluster (syllable and word), when you are returning multiple results (e.g. multiple taamim), can these results be easily related back to the cluster they belong to, or are they just a list? I would imagine wanting a sparse array of some sort. E.g. for a three-cluster syllable with taamim a and b on clusters 1 and 3 respectively, I would imagine wanting the answer to the question "what are the taamim on this syllable" to be something like [a, null, b].
Is Word what I call an atom?
Is Word what I call an atom?
I think so. Something like "וְכׇל־הָעָם֩" is 2 Words
For units larger than the cluster (syllable and word), when you are returning multiple results (e.g. multiple taamim), can these results be easily related back to the cluster they belong to, or are they just a list? I would imagine wanting a sparse array of some sort. E.g. for a three-cluster syllable with taamim a and b on clusters 1 and 3 respectively, I would imagine wanting the answer to the question "what are the taamim on this syllable" to be something like [a, null, b].
Not exactly, but you could drill down into them.
Example:
import { Text } from "havarotjs"; // assumes the havarotjs package is installed
const text = new Text("י֥וֹם֩"); // Deut 5:12
const word = text.words[0]; // .words is an array of `Words`
const syllable = word.syllables[0];
syllable.taamim
// ["MERKHA", "TELISHA_QETANA"]
syllable.clusters.map(c => c.taamim);
// [["MERKHA"], [null], ["TELISHA_QETANA"]]
The results would be strings.
Maybe a verbose property or method would be good:
syllable.taamimVerbose
// [ { taam:"MERKHA", cluster: <POINTER> }, { taam:"TELISHA_QETANA", cluster: <POINTER> } ]
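A verbose result of this kind could be derived from the clusters themselves. This is a minimal sketch on plain objects that stand in for clusters; the object shape and the function name `taamimVerbose` are assumptions for illustration, not an existing api.

```javascript
// Plain objects stand in for library clusters (assumed shape).
const clusters = [
  { text: "cluster 1", taamim: ["MERKHA"] },
  { text: "cluster 2", taamim: [] },
  { text: "cluster 3", taamim: ["TELISHA_QETANA"] },
];

// Pair every taam with a reference to the cluster object that carries it.
function taamimVerbose(clusterList) {
  return clusterList.flatMap((cluster) =>
    cluster.taamim.map((taam) => ({ taam, cluster }))
  );
}

const verbose = taamimVerbose(clusters);
console.log(verbose.map((v) => `${v.taam} on ${v.cluster.text}`));
```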
I see. Indeed, I now see how there's no need for a sparse output at the syllable (or presumably word) level since it can so easily be generated as you suggest:
syllable.clusters.map(c => c.taamim);
I'm not sure I see the need for your suggested taamimVerbose, but who knows. It's hard to imagine in advance what applications might need or want.
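For reference, the sparse [a, null, b] form can be produced with a one-line map. The sketch assumes each cluster exposes a `taamim` array that may be empty or hold a null placeholder; `sparseTaamim` is a hypothetical helper name.

```javascript
// Assumed shape: each cluster exposes a `taamim` array (possibly empty).
const syllable = {
  clusters: [
    { taamim: ["a"] },
    { taamim: [] },
    { taamim: ["b"] },
  ],
};

// One slot per cluster: the first taam, or null when the cluster has none.
function sparseTaamim(syl) {
  return syl.clusters.map((c) => c.taamim.find((t) => t != null) ?? null);
}

console.log(sparseTaamim(syllable)); // [ 'a', null, 'b' ]
```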
.vowels returns an array of vowel characters (not really something that would happen, but doing it for consistency below)
BTW in editions with superimposed cantillation of the Decalogues, there are two words in each of the two Decalogues (for a total of four words) for which there are not only two accents but also two vowels.
Also, the implicit ketiv/qere for yerushalayim and yerushalaymah is usually encoded with two vowels on the lamed. The lamed has its expected "a" vowel (qamats or pataḥ) as well as adopting the orphan ḥiriq or sheva.
BTW in editions with superimposed cantillation of the Decalogues, there are two words in each of the two Decalogues (for a total of four words) for which there are not only two accents but also two vowels.
I'm learning something new every day!
Regarding yerushalayim and yerushalaymah, you have the opportunity to do something I never had the guts to do, which is to introduce the notion of a "phantom yod" to hold those orphan vowel marks (ḥiriq or sheva). You can get rid of about 600 cases of two vowels on a single letter that way. At the cost of introducing this "phantom yod" abstraction of course. But it might be a good trade-off.
Here's a cool feature that would pretty much just "fall out" of this representation for free: the option to show this ketiv/qere explicitly instead of implicitly. See, for example, the treatment of yerushalayim in the recent JPS commentary on Psalms 120-150, e.g.:
The MAM dataset sort of encodes this "phantom yod" idea via its מ:ירושלם template.
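To make the "phantom yod" idea concrete, here is a rough sketch on plain objects. Everything in it is an assumption for illustration: the cluster shape, the `phantom` flag, and the function name. A real implementation would presumably apply the split only to the yerushalayim/yerushalaymah words rather than to every lamed that happens to match.

```javascript
const LAMED = "\u05DC";
const YOD = "\u05D9";
const A_VOWELS = new Set(["\u05B7", "\u05B8"]); // patah, qamats
const ORPHAN_VOWELS = new Set(["\u05B4", "\u05B0"]); // hiriq, sheva

// Split a lamed carrying an "a" vowel plus an orphan hiriq/sheva into
// the lamed and a hypothetical phantom yod that holds the orphan vowel.
function addPhantomYod(clusters) {
  return clusters.flatMap((c) => {
    const a = c.vowels.find((v) => A_VOWELS.has(v));
    const orphan = c.vowels.find((v) => ORPHAN_VOWELS.has(v));
    if (c.consonant !== LAMED || c.vowels.length !== 2 || !a || !orphan) {
      return [c]; // leave every other cluster untouched
    }
    return [
      { ...c, vowels: [a] },
      { consonant: YOD, vowels: [orphan], phantom: true },
    ];
  });
}

// A lamed with both hiriq and patah on the one letter
const lamed = { consonant: LAMED, vowels: ["\u05B4", "\u05B7"] };
const result = addPhantomYod([lamed]);
console.log(result.length, result[1].phantom); // 2 true
```

The lookup does not depend on the order of the two vowel marks, since that order can vary with how the text was encoded or normalized.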
I made a list of cases of two taamim on a letter. It's more inclusive to the Letteris edition but might be helpful. https://benemanuel.geulah.org.il/two-is-not-one-%D7%91-%D7%98%D7%A2%D7%9E%D7%99%D7%9D-%D7%91%D7%9E%D7%99%D7%9C%D7%94-%D7%90%D7%97%D7%93
Thanks @benemanuel for bringing the Ezekiel 20:31 one to my attention. I have it in my list (not published) but overlooked it. MAM has some documentation about it:
Thanks all! This is helpful data, especially for testing
BTW in editions with superimposed cantillation of the Decalogues, there are two words in each of the two Decalogues (for a total of four words) for which there are not only two accents but also two vowels.
@bdenckla Are there digital editions with this? I couldn't notice any word with two vowels
MAM Exo 20:2 & Deut 5:6 עַל־פָּנָֽ͏ַ֗י MAM Exo 20:3 & Deut 5:7 מִתָּ֑͏ַ֜חַת
UXLC has 3 of those 4, but its verse numbering is off by one:
Exo 20:3 & Deut 5:7 פָּנָֽ͏ַ֗י Exo 20:4 (no corresponding Deut.!) מִתָּ֑͏ַ֜חַת
Thanks! This is some helpful data.
My ideas for this have started to spiral out a bit, but I like it. Here's an example on Clusters:
const clusters = new Text("מִתָּ֑͏ַ֜חַת").clusters;
console.log(
  clusters.map((c) => {
    return {
      text: c.text,
      consonant: c.consonant,
      consonants: c.consonants,
      consonantName: c.consonantName,
      consonantNames: c.consonantNames,
      taam: c.taam,
      taamim: c.taamim,
      taamName: c.taamName,
      taamimNames: c.taamimNames,
      vowel: c.vowel,
      vowels: c.vowels,
      vowelName: c.vowelName,
      vowelNames: c.vowelNames
    };
  })
);
Results:
[
{
text: 'מִ',
consonant: 'מ',
consonants: [ 'מ' ],
consonantName: 'MEM',
consonantNames: [ 'MEM' ],
taam: null,
taamim: [ null ],
taamName: null,
taamimNames: [ null ],
vowel: 'ִ',
vowels: [ 'ִ' ],
vowelName: 'HIRIQ',
vowelNames: [ 'HIRIQ' ]
},
{
text: 'תַָּ֑֜͏',
consonant: 'ת',
consonants: [ 'ת' ],
consonantName: 'TAV',
consonantNames: [ 'TAV' ],
taam: '֑',
taamim: [ '֑', '֜' ],
taamName: 'ETNAHTA',
taamimNames: [ 'ETNAHTA', 'GERESH' ], // note two taamim
vowel: 'ָ',
vowels: [ 'ָ', 'ַ' ],
vowelName: 'QAMATS',
vowelNames: [ 'QAMATS', 'PATAH' ] // note two vowels
},
{
text: 'חַ',
consonant: 'ח',
consonants: [ 'ח' ],
consonantName: 'HET',
consonantNames: [ 'HET' ],
taam: null,
taamim: [ null ],
taamName: null,
taamimNames: [ null ],
vowel: 'ַ',
vowels: [ 'ַ' ],
vowelName: 'PATAH',
vowelNames: [ 'PATAH' ]
},
{
text: 'ת',
consonant: 'ת',
consonants: [ 'ת' ],
consonantName: 'TAV',
consonantNames: [ 'TAV' ],
taam: null,
taamim: [ null ],
taamName: null,
taamimNames: [ null ],
vowel: null,
vowels: [ null ],
vowelName: null,
vowelNames: [ null ]
}
]
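Given output shaped like the results above, picking out the clusters with more than one vowel or taam is a short filter. This sketch works on plain objects that copy a few fields from the printed results (including the [ null ] placeholders); it does not call the library.

```javascript
// Assumed shape, mirroring the printed results (empty slots appear as [ null ]).
const printed = [
  { consonantName: "MEM", vowelNames: ["HIRIQ"], taamimNames: [null] },
  { consonantName: "TAV", vowelNames: ["QAMATS", "PATAH"], taamimNames: ["ETNAHTA", "GERESH"] },
  { consonantName: "HET", vowelNames: ["PATAH"], taamimNames: [null] },
  { consonantName: "TAV", vowelNames: [null], taamimNames: [null] },
];

// Count the real (non-null) entries in an array.
const count = (arr) => arr.filter((x) => x !== null).length;

// Keep only clusters that carry more than one vowel or more than one taam.
const multiples = printed.filter(
  (c) => count(c.vowelNames) > 1 || count(c.taamimNames) > 1
);
console.log(multiples.map((c) => c.consonantName)); // [ 'TAV' ]
```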
This is cool to be able to handle these extraordinary words, but I think it would also be fine to just return an error saying that such words are not supported. Or return the result for the first of the two vowels along with a warning.
Remember that these words' very existence is an artifact of a particular (unfriendly) choice of representation used in the great manuscripts. These words do not appear in this unfriendly form in publications intended to be read aloud or chanted. Thus, arguably, this unfriendly form need not be supported by phonetic transcription software. But maybe you're aiming for a general representation here independent of the application of phonetic transcription.
Any thoughts on the representation of the (far more common and more important) dual vowels in yerushalayim and yerushalaymah?
If all you're trying to do is represent the Unicode in a structured form, then I guess yerushalayim and the "QUPO" words can be handled the same. (I call the dual-vowel Decalogue words "QUPO" words because they both consist of qamats, an under-accent, pataḥ, and an over-accent.)
But more deeply, i.e. semantically, the reason for the dual vowel in yerushalayim is very different than the reason for the dual vowel in a QUPO word.
Ben, please note that the writers of these original handwritten manuscripts did not think the same. The hypnosis that two taamim have no place on one word and cannot be read together is just that, a hypnosis. There also exist theories that use ALL taamim ALWAYS.
Avi
The api in this issue is really just about being able to query the text and describe it well. In order to reduce user friction, I prefer not to error on things, or at least provide an escape hatch. See this issue as an example — even though the sheva in texts derived from L is clearly meant to be a gaya (like in MAM), I prefer to handle it in some way, even if it's not "correct."
This module is meant to be lower level so no matter how unfriendly the text is, it should work in some way if possible. In the transliteration package, a user can decide how they want to handle output, but in this package, I generally just want to make apis available to be used elsewhere.
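The "don't error, but leave an escape hatch" approach could look roughly like the following: return a best-effort value and report anything unusual alongside it. The helper name and return shape are hypothetical and only illustrate the design choice.

```javascript
// Hypothetical helper: return the first vowel and report, rather than throw,
// when a cluster carries more than one.
function firstVowelWithWarnings(cluster) {
  const vowels = cluster.vowels.filter((v) => v !== null);
  const warnings = [];
  if (vowels.length > 1) {
    warnings.push(`cluster has ${vowels.length} vowels; using the first`);
  }
  return { vowel: vowels[0] ?? null, warnings };
}

const qupo = { vowels: ["\u05B8", "\u05B7"] }; // qamats then patah
const { vowel, warnings } = firstVowelWithWarnings(qupo);
console.log(vowel === "\u05B8", warnings);
```

A higher-level package, such as a transliteration layer, could then decide whether to ignore the warnings, log them, or treat them as errors.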
Any thoughts on the representation of the (far more common and more important) dual vowels in yerushalayim and yerushalaymah?
I'll make a separate issue for that. I want to mull it over a bit to understand it.
Ben, please note that the writers of these original handwritten manuscripts did not think the same. The hypnosis that two taamim have no place on one word and cannot be read together is just that, a hypnosis. There also exist theories that use ALL taamim ALWAYS.
I think you mean something other than "hypnosis". Maybe you mean something like "fantasy" or "unfounded belief"?
Anyway, we've discussed a lot of different topics in the comments of this issue but the most recent topic was two VOWELS on the same LETTER whereas you are making (IMO somewhat wild) claims about two ACCENTS on the same WORD. These are of course very different topics.
There are many reasons for two accents on the same word, i.e. many different underlying phenomena result in the same superficial (typographic) artifact of two accents on the same word.
I can say with some confidence (and can cite many, many authorities) that there is no tradition in which the two cantillations of the Decalogues represent anything other than an exclusive CHOICE: chant one or chant the other, but not both. Can you cite any authority that suggests that, to the contrary, the two cantillations of the Decalogues represent a possible tradition of simultaneous performance? (By "simultaneous performance" I mean somehow chanting both "at the same time," whatever that would mean.) Are you suggesting the existence of such a tradition?
Not traditional, but check out Suzanne Haïk-Vantoura's The Music of the Bible Revealed.
The api around taamim should be similar to vowels