Open funderburkjim opened 6 years ago
We are applying the declensions to headwords found in the MW dictionary.
Generally compounds are indicated by special formatting in the dictionary:
The digitization of MW generally separates the parts of compounds with the '-' character; this is part of the structure of the <key2> field. In our example, key2 is upari-BAva.
This key2 structure is used in the declension algorithm, in the manner described for upariBAva.
A test version of the declension was developed which ignored the pada-structure of key2; i.e., it applied the declension algorithm by first removing all the '-' from key2.
These no-pada declensions were then compared to the with-pada declensions.
Of the 47000+ m_a examples, 818 differences were found; these are cases where the no-pada declensions result in a cerebralization of the 'n' in 3s and 6p, while the with-pada declensions show no cerebralization.
These cases like upari-BAva are in diff_nopada_m_a.txt.
While I think that the declension of compounds should take into account the final pada in application of nR-sandhi, there remain doubts regarding the use of the key2 markup to identify the final pada.
There are possible problems of two types:
key2 misses a pada structure that should be applied in declension. I'm thinking about cases where two parts of a compound are joined by a vowel sandhi, the first part has an 'r', and all the letters after that 'r' are allowed intervening letters for nR sandhi. Actual examples are hard to find, but I think 'grahAhvaya' (called after the demons) fits the bill. It is a compound of graha and Ahvaya; but there is no '-' in the markup, so in our algorithm key2 is treated as a single pada, and the 3s is grahAhvayeRa (i.e. nR sandhi is applied).
But I suspect that, for the purposes of declension, we should treat this as grah-Ahvaya, which would lead to grahAhvayena (no nR sandhi).
What do others think of this example?
If others think that it should be grahAhvayena (rather than the grahAhvayeRa of the current algorithm), perhaps we can devise some way to identify such cases, and insert an appropriate '-' to get the correct form.
It is my understanding that in declensions of compounds, the second alternative is generally correct.
@SergeA is it?
devise some way to identify such cases
Should it be hard? We sure see graha there and if humans can, why not AI?
@drdhaval2785 what about the grahAhvayena case?
@funderburkjim Now you ventured deep into Paninian forest.
Let me put Paninian rules and their implication for this nR sandhi for you with examples from diff_nopada_m_a.txt wherever possible. Rules are 8.4.1 to 8.4.39
1. 8|4|1 | रषाभ्यां नो णः समानपदे | 235
2. 8|4|2 | अट्कुप्वाङ्नुम्व्यवायेऽपि | 197
The above two rules have been understood properly in the discussion above. One clarification: why are akzara and muKa treated as two padas? Because Panini holds that declined words join to form a compound, and in the process of compounding their suffixes are dropped. Those dropped suffixes serve to mark akzara and muKa as separate padas, and hence there is no nR change. There is also a vArtika 'ऋवर्णाच् च इति वक्तव्यम्' on 8.4.1. Therefore the triggering letters are expanded from '[rz]' to '[rzfF]'.
3. 8|4|3 | पूर्वपदात् संज्ञायामगः
This specifies that in compounds too there is nR change if there is [rzfF] in the first part, no 'g' intervening, [n] in the second part, and the meaning is a proper name.
e.g. SUrpaRaKA is made from SUrpa-naKA. We don't have to bother, because MW has already split it as SUrpa-RaKA. See 220353 SUrpaRaKA SUrpa-RaKA f in lexnorm-all2.txt.
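Rules 8.4.1 and 8.4.2 as described above, with the trigger set widened to [rzfF] by the vArtika, can be sketched roughly as follows. This is only an illustration in SLP1, not the repository's algorithm: the name `nr_sandhi` and the set names are hypothetical, and the condition that 'n' must be followed by a vowel or by n, m, y, v is taken from the usual textbook formulation of the sandhi rather than from this thread.

```python
VOWELS = set("aAiIuUfFxXeEoO")
TRIGGERS = set("rzfF")  # r and z, plus vocalic f and F per the vArtika

# Letters that may stand between trigger and 'n' (8.4.2): vowels and h y v r,
# the gutturals, the labials, and anusvara M.
ALLOWED = VOWELS | set("hyvr") | set("kKgGN") | set("pPbBm") | {"M"}

# Assumption: 'n' changes only when followed by one of these letters.
FOLLOWERS = VOWELS | set("nmyv")


def nr_sandhi(pada):
    """Apply a simplified nR sandhi inside a single pada (SLP1 string)."""
    out = []
    armed = False  # True while a trigger is in force
    for i, ch in enumerate(pada):
        nxt = pada[i + 1] if i + 1 < len(pada) else ""
        if ch == "n" and armed and nxt in FOLLOWERS:
            ch = "R"
        if ch in TRIGGERS:
            armed = True
        elif ch not in ALLOWED:
            armed = False  # any other letter blocks the change
        out.append(ch)
    return "".join(out)


print(nr_sandhi("upariBAvena"))  # whole word treated as one pada
print(nr_sandhi("BAvena"))       # final pada alone
```

Calling the function on the whole word corresponds to the no-pada treatment, while calling it on the final pada only corresponds to the with-pada treatment.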
The remaining 36 sutras can be mentioned here if you feel it is worth venturing into them. I personally feel it would be too much labour for not much gain.
There is one more item which has a significant bearing. https://sanskritdocuments.org/learning_tools/ashtadhyayi/vyakhya/8/8.4.12.htm This rule mandates that if the second part of the compound has only a single vowel, then in case of derivation / conjugation there is nR change even if the [rfF] is in the first part of the compound and the [n] is in the second part (ena, AnAm etc.).
Searching for
[-][^aAiIuUfFxXeEoOn]*[aAiIuUfFxXeEoO][^aAiIuUfFxXeEoOn]*$
in diff_nopada_m_a yielded 69 such cases.
e.g. agra-gena -> agrageRa, madra-pa -> madrapeRa etc.
Line 9: a-mara-pa
Line 12: a-ri-ha
Line 15: aMhri-pa
Line 24: agra-ga
Line 27: aNGri-pa
Line 28: ati-BAra-ga
Line 42: antarikza-ga
Line 43: anya-strI-ga
Line 48: aBra-ga
Line 52: asfk-pa
Line 53: asra-pa
Line 74: izu-pa
Line 85: ura-ga
Line 86: uraM-ga
Line 91: Uzma-pa
Line 94: kakza-pa
Line 99: kari-pa
Line 122: kzatra-pa
Line 129: kzIra-pa
Line 131: kzudra-Ba
Line 133: kzetra-pa
Line 138: kzmA-pa
Line 139: Kara-pa
Line 151: guru-Ba
Line 153: gfha-pa
Line 166: Garma-ga
Line 177: candra-Ba
Line 198: Cattra-pa
Line 216: tirya-ga
Line 219: tura-M-ga
Line 222: tura-ga
Line 244: dASArha-ka
Line 267: devAri-pa
Line 272: dru-Ga
Line 274: dvAra-pa
Line 299: nakzatra-pa
Line 306: nara-pa
Line 320: niKurya-pa
Line 328: nir-ga
Line 346: nf-ga
Line 347: nf-pa
Line 379: pari-Ga
Line 449: pra-kzepa-ka
Line 477: prati-hAra-pa
Line 482: prayo-ga
Line 486: prava-M-ga
Line 488: prava-ga
Line 536: BAra-ga
Line 551: madra-pa
Line 564: mahizI-pa
Line 572: mAra-pa
Line 573: mArga-pa
Line 581: mudrA-Nka
Line 589: yoga-pAraM-ga
Line 612: rUpa-pa
Line 616: roha-ga
Line 628: varza-kftya-taraM-ga
Line 629: varza-pa
Line 631: vastra-pa
Line 658: vfza-ga
Line 699: SIGra-ga
Line 720: Slezma-ha
Line 735: sam-udra-ga
Line 740: sarva-ga
Line 750: sarvatra-ga
Line 773: surA-pa
Line 786: senA-gra-ga
Line 794: strI-muKa-pa
Line 817: haridrA-Ba
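The filter behind the list above can be tried out with a short sketch. The pattern is the one quoted in this comment; the helper name `has_single_vowel_final_pada` is hypothetical, and the sketch is applied to bare key2 strings because the line layout of diff_nopada_m_a.txt is not shown here.

```python
import re

# Pattern quoted in the comment: a '-' followed by exactly one vowel,
# with only non-vowel, non-'n' letters around it, up to the end of key2.
SINGLE_VOWEL_FINAL = re.compile(
    r"[-][^aAiIuUfFxXeEoOn]*[aAiIuUfFxXeEoO][^aAiIuUfFxXeEoOn]*$"
)


def has_single_vowel_final_pada(key2):
    """True if the text after a '-' in key2 contains exactly one vowel."""
    return SINGLE_VOWEL_FINAL.search(key2) is not None


for key2 in ["agra-ga", "madra-pa", "tura-M-ga", "upari-BAva"]:
    print(key2, has_single_vowel_final_pada(key2))
```

Note that '-' itself belongs to the negated character class, so for a key2 such as tura-M-ga the match can start at an earlier hyphen; the result is the same, since the span still contains only one vowel.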
Dhaval back home again. Feels so good.
agra-ga
Very interesting. As I understand it, these cases where the last pada has a single vowel are currently declined incorrectly according to the 8.4.12 rule.
As you correctly understood, the current system of declension uses the pada structure based on '-' in key2. This pada structure is located in a copy of lexnorm-all2. Since we are using this copy for purposes of inflection, we are not bound to maintain the '-' pada structure implied by MW's levels. I.e., we can change lexnorm-all2 for the examples you listed above. E.g.,
old 1216 agraga agra-ga m
new 1216 agraga agraga m
Then, using agraga, nR sandhi will come into play:
Declension of m_a agraga
Case 1: agragaH agragO agragAH
Case 2: agragam agragO agragAn
Case 3: *agrageRa* agragAByAm agragEH
Case 4: agragAya agragAByAm agrageByaH
Case 5: agragAt agragAByAm agrageByaH
Case 6: agragasya agragayoH *agragARAm*
Case 7: agrage agragayoH agragezu
Case 8: agraga agragO agragAH
If you confirm this, I'll go ahead and implement it for the m_a list above.
Then, similar lists can be generated for nominals with other endings.
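The effect of dropping the '-' can be illustrated with a minimal sketch that rebuilds the m_a table above. It is not the repository's code: `decline_m_a`, `nr_sandhi` and `M_A_ENDINGS` are hypothetical names, the endings are read off the table in this comment, and the nR sandhi is a simplified regex version that assumes 'n' changes only before a vowel or n, m, y, v.

```python
import re

# Endings (singular, dual, plural) for cases 1..8, read off the table above.
M_A_ENDINGS = [
    ("aH", "O", "AH"),
    ("am", "O", "An"),
    ("ena", "AByAm", "EH"),
    ("Aya", "AByAm", "eByaH"),
    ("At", "AByAm", "eByaH"),
    ("asya", "ayoH", "AnAm"),
    ("e", "ayoH", "ezu"),
    ("a", "O", "AH"),
]

# Simplified nR sandhi: a trigger [rzfF], then only allowed intervening
# letters, then 'n' followed by a vowel or n, m, y, v (an assumption).
NR = re.compile(
    r"([rzfF][aAiIuUfFxXeEoOhyvrkKgGNpPbBmM]*)n(?=[aAiIuUfFxXeEoOnmyv])"
)


def nr_sandhi(pada):
    return NR.sub(r"\1R", pada)


def decline_m_a(key2):
    """Decline the final pada of key2, then prefix the earlier padas."""
    padas = key2.split("-")
    prefix, last = "".join(padas[:-1]), padas[-1]
    if not last.endswith("a"):
        raise ValueError("expected a stem ending in 'a': " + key2)
    stem = last[:-1]
    return [
        tuple(prefix + nr_sandhi(stem + ending) for ending in row)
        for row in M_A_ENDINGS
    ]


for case, row in enumerate(decline_m_a("agraga"), start=1):
    print("Case %d: %s" % (case, " ".join(row)))
```

With key2 given as agraga the 3s and 6p forms show the cerebral R, whereas `decline_m_a("agra-ga")` restricts the sandhi to the final pada ga and leaves the dental n in place.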
Your explanation and execution are OK.
I am yet a bit confused by some of the examples.
Line 628: varza-kftya-taraM-ga
I am sure that the declension of taraMga is taraMgeRa.
So there seems to be some problem with M- combinations preceding the last hyphen. The reason evades me, though.
Look at tura-M-ga too.
There is one additional detail that plays a role in the m_a declension algorithm but that wasn't discussed in #5.
This involves the interpretation of the phrase "When, in the same word," in Antoine's formulation of the nR sandhi. Consider the example upariBAva (the state of being higher or above). When declining upariBAva, if we say that the 'r' is part of the base, then the 3s is upariBAveRa. But upariBAva is a compound from upari and BAva; if we say that the declension algorithm should be applied to the final pada of the compound (BAva), and then completed by prefixing the non-ending padas, we would combine the prefix upari with the 3s BAvena of BAva, resulting in upariBAvena.
It is my understanding that in declensions of compounds, the second alternative is generally correct.
The current algorithm proceeds under this assumption.