m_a declension -- pada

funderburkjim commented 6 years ago

There is one additional detail that plays a role in the m_a declension algorithm but that wasn't discussed in #5.

This involves the interpretation of the phrase "When, in the same word," in Antoine's formulation of the nR sandhi.

Consider the example upariBAva (the state of being higher or above).

When declining upariBAva, if we say that the 'r' is part of the base, then the 3s is upariBAveRa.

But upariBAva is a compound of upari and BAva. If we say that the declension algorithm should be applied to the final pada of the compound (BAva) and then completed by prefixing the non-final padas, we would combine upari with the 3s BAvena of BAva, resulting in upariBAvena.

It is my understanding that in declensions of compounds, the second alternative is generally correct.

The current algorithm proceeds under this assumption.

funderburkjim commented 6 years ago

MW structure and final pada

We are applying the declensions to headwords found in the MW dictionary.
Generally compounds are indicated by special formatting in the dictionary:

funderburkjim commented 6 years ago

key2 and final pada

The digitization of MW generally separates the parts of compounds with the '-' character; this is part of the structure of the <key2> field. In our example, key2 is upari-BAva.

This key2 structure is used in the declension algorithm, in the manner described for upariBAva.

1. Start the declension algorithm from key2.
2. If there are no '-' characters in key2, proceed as described in #5.
3. If there are '-' characters in key2, represent key2 = head-X, where X is the last pada.
4. Apply the declension algorithm (as described in #5) to X (i.e. generate the base by dropping the 'a' from the end of X, concatenate the endings, and apply nR-sandhi).
5. Finish by reattaching the head to each of the declined forms.
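
The steps above can be sketched in Python roughly as follows. This is only an illustration, not the MWinflect code: the function names are hypothetical, the endings are read off the agraga table later in this thread, and the nR-sandhi letter classes are an assumption (a common statement of the rule, since #5 is not reproduced here): 'n' followed by a vowel or n, m, y, v becomes 'R' after r, z, f, F when only vowels, gutturals, labials, y, v, h or anusvara intervene.

```python
VOWELS = set("aAiIuUfFxXeEoO")
TRIGGERS = set("rzfF")
# Assumed set of letters that may stand between the trigger and the 'n'.
TRANSPARENT = VOWELS | set("kKgGN") | set("pPbBm") | set("yvhM")
# Assumed set of letters that must follow the 'n' for the change to apply.
FOLLOWERS = VOWELS | set("nmyv")

# m_a endings in SLP1 (cases 1-8; singular, dual, plural).
M_A_ENDINGS = [
    ("aH", "O", "AH"),
    ("am", "O", "An"),
    ("ena", "AByAm", "EH"),
    ("Aya", "AByAm", "eByaH"),
    ("At", "AByAm", "eByaH"),
    ("asya", "ayoH", "AnAm"),
    ("e", "ayoH", "ezu"),
    ("a", "O", "AH"),
]


def nr_sandhi(word):
    """Change n to R within one pada (simplified, hypothetical helper)."""
    out = []
    active = False  # True while a trigger is in force
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch == "n" and active and nxt in FOLLOWERS:
            out.append("R")
            active = False
            continue
        out.append(ch)
        if ch in TRIGGERS:
            active = True
        elif ch not in TRANSPARENT:
            active = False  # a blocking letter cancels the trigger
    return "".join(out)


def decline_m_a(key2):
    """Decline the last pada of key2, then reattach the head."""
    head, _, last = key2.rpartition("-")
    head = head.replace("-", "")
    base = last[:-1]  # drop the final 'a'
    return [[head + nr_sandhi(base + e) for e in row] for row in M_A_ENDINGS]


print(decline_m_a("upari-BAva")[2][0])  # 3s with pada structure
print(decline_m_a("upariBAva")[2][0])   # 3s ignoring pada structure
```

Because sandhi is applied to the last pada only, an 'r' in the head can never reach the 'n' of the ending, which is exactly the behaviour described for upari-BAva.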

funderburkjim commented 6 years ago

a list of examples

A test version of the declension was developed which ignored the pada-structure of key2; i.e., it applied the declension algorithm by first removing all the '-' from key2.

These no-pada declensions were then compared to the with-pada declensions.

Of the 47000+ m_a examples, 818 differences were found; these are cases where the no-pada declensions result in a cerebralization of the 'n' in 3s and 6p, while the with-pada declensions show no cerebralization.

These cases, like upari-BAva, are listed in diff_nopada_m_a.txt.
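
A comparison of this kind could be sketched as below. It is not the script that produced diff_nopada_m_a.txt; the names are hypothetical, only the 3s and 6p forms are generated, and nR-sandhi is approximated by a single regular expression whose letter classes are an assumption.

```python
import re

# Assumed nR-sandhi: trigger, then only "transparent" letters, then an 'n'
# that is followed by a vowel or n, m, y, v.
NR = re.compile(
    r"([rzfF][aAiIuUfFxXeEoOkKgGNpPbBmyvhM]*)n(?=[aAiIuUfFxXeEoOnmyv])"
)


def nr_sandhi(word):
    return NR.sub(r"\1R", word)


def forms_3s_6p(key2, use_pada):
    """Return the (3s, 6p) forms of an m_a headword."""
    if use_pada:
        head, _, last = key2.rpartition("-")
        head = head.replace("-", "")
    else:
        head, last = "", key2.replace("-", "")
    base = last[:-1]  # drop the final 'a'
    return tuple(head + nr_sandhi(base + e) for e in ("ena", "AnAm"))


def diff_nopada(key2_list):
    """List headwords whose no-pada and with-pada forms disagree."""
    diffs = []
    for key2 in key2_list:
        with_pada = forms_3s_6p(key2, True)
        no_pada = forms_3s_6p(key2, False)
        if with_pada != no_pada:
            diffs.append((key2, with_pada, no_pada))
    return diffs


for key2, with_pada, no_pada in diff_nopada(["upari-BAva", "deva", "rAma"]):
    print(key2, with_pada, no_pada)
```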

funderburkjim commented 6 years ago

Doubts regarding key2 for pada detection

While I think that the declension of compounds should take into account the final pada in application of nR-sandhi, there remain doubts regarding the use of the key2 markup to identify the final pada.

There are possible problems of two types:

1. key2 implies a pada structure that should be ignored in declension. For instance, what about cases like 'pra-X' (pra-gama)? For declension purposes, should this be treated as one pada (pragama)?
2. key2 misses a pada structure that should be applied in declension. I'm thinking about cases where two parts of a compound are joined by a vowel sandhi, the first part has an 'r', and all the letters after that 'r' are allowed intervening letters for nR sandhi. Actual examples are hard to find, but I think 'grahAhvaya' (called after the demons) fits the bill. It is a compound of graha and Ahvaya, but there is no '-' in the markup, so in our algorithm key2 is treated as a single pada, and the 3s is grahAhvayeRa (i.e. nR sandhi is applied). But I suspect that, for the purposes of declension, we should treat this as grah-Ahvaya, which would lead to grahAhvayena (no nR sandhi).

What do others think of this example?

If others think that it should be grahAhvayena (rather than grahAhvayeRa of current algorithm), perhaps we can devise some way to identify such cases, and insert an appropriate '-' to get the correct form.

gasyoun commented 6 years ago

> It is my understanding that in declensions of compounds, the second alternative is generally correct.

@SergeA is it?

> devise some way to identify such cases

Should it be hard? We sure see graha there and if humans can, why not AI?

@drdhaval2785 what about the grahAhvayena case?

drdhaval2785 commented 6 years ago

@funderburkjim Now you ventured deep into Paninian forest.

Let me put Paninian rules and their implication for this nR sandhi for you with examples from diff_nopada_m_a.txt wherever possible. Rules are 8.4.1 to 8.4.39

1. 8|4|1 | रषाभ्यां नो णः समानपदे | 235
2. 8|4|2 | अट्कुप्वाङ्नुम्व्यवायेऽपि | 197

The above two rules have been understood properly in the above discussion. Only one clarification: why are akzara and muKa treated as two padas? Because Panini holds that declined words join to form a compound, and in the process of compounding the suffixes are dropped. So those dropped suffixes serve to differentiate the words akzara and muKa as separate padas, and hence there is no nR change. There is another vArtika, 'ऋवर्णाच् च इति वक्तव्यम्', on 8.4.1. Therefore the triggering letters are expanded from '[rz]' to '[rzfF]'.

drdhaval2785 commented 6 years ago

3. 8|4|3 | पूर्वपदात्‌ संज्ञायामगः

This specifies that in compounds too there is nR change if there is [rzfF] in the first part, no 'g' intervening, [n] in the second part, and the meaning is a proper name. E.g. SUrpaRaKA is made from SUrpa-naKA. We don't have to bother, because MW has already split it as SUrpa-RaKA. See 220353 SUrpaRaKA SUrpa-RaKA f in lexnorm-all2.txt.

drdhaval2785 commented 6 years ago

The remaining 36 sutras can be covered here if you feel it is worth venturing into them. I personally feel it would be too much labour for not much gain.

drdhaval2785 commented 6 years ago

There is one more item which has a significant bearing: rule 8.4.12, https://sanskritdocuments.org/learning_tools/ashtadhyayi/vyakhya/8/8.4.12.htm This rule mandates that if the second part of the compound has only a single vowel, then in case of derivation / conjugation there is nR change even if the [rfF] is in the first part of the compound and the [n] in the second part (ena, AnAm etc.).

[-][^aAiIuUfFxXeEoOn]*[aAiIuUfFxXeEoO][^aAiIuUfFxXeEoOn]*$ in diff_nopada_m_a yielded 69 such cases.

e.g. agra-gena -> agrageRa, madra-pa -> madrapeRa etc.

    Line 9: a-mara-pa
    Line 12: a-ri-ha
    Line 15: aMhri-pa
    Line 24: agra-ga
    Line 27: aNGri-pa
    Line 28: ati-BAra-ga
    Line 42: antarikza-ga
    Line 43: anya-strI-ga
    Line 48: aBra-ga
    Line 52: asfk-pa
    Line 53: asra-pa
    Line 74: izu-pa
    Line 85: ura-ga
    Line 86: uraM-ga
    Line 91: Uzma-pa
    Line 94: kakza-pa
    Line 99: kari-pa
    Line 122: kzatra-pa
    Line 129: kzIra-pa
    Line 131: kzudra-Ba
    Line 133: kzetra-pa
    Line 138: kzmA-pa
    Line 139: Kara-pa
    Line 151: guru-Ba
    Line 153: gfha-pa
    Line 166: Garma-ga
    Line 177: candra-Ba
    Line 198: Cattra-pa
    Line 216: tirya-ga
    Line 219: tura-M-ga
    Line 222: tura-ga
    Line 244: dASArha-ka
    Line 267: devAri-pa
    Line 272: dru-Ga
    Line 274: dvAra-pa
    Line 299: nakzatra-pa
    Line 306: nara-pa
    Line 320: niKurya-pa
    Line 328: nir-ga
    Line 346: nf-ga
    Line 347: nf-pa
    Line 379: pari-Ga
    Line 449: pra-kzepa-ka
    Line 477: prati-hAra-pa
    Line 482: prayo-ga
    Line 486: prava-M-ga
    Line 488: prava-ga
    Line 536: BAra-ga
    Line 551: madra-pa
    Line 564: mahizI-pa
    Line 572: mAra-pa
    Line 573: mArga-pa
    Line 581: mudrA-Nka
    Line 589: yoga-pAraM-ga
    Line 612: rUpa-pa
    Line 616: roha-ga
    Line 628: varza-kftya-taraM-ga
    Line 629: varza-pa
    Line 631: vastra-pa
    Line 658: vfza-ga
    Line 699: SIGra-ga
    Line 720: Slezma-ha
    Line 735: sam-udra-ga
    Line 740: sarva-ga
    Line 750: sarvatra-ga
    Line 773: surA-pa
    Line 786: senA-gra-ga
    Line 794: strI-muKa-pa
    Line 817: haridrA-Ba
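
As a small illustration, the regular expression quoted above can be applied to key2 values with Python's `re` module. The function name is hypothetical; the pattern is the one from this comment, unchanged.

```python
import re

# A final pada (after the last '-') with exactly one vowel and no 'n'.
SINGLE_VOWEL_FINAL = re.compile(
    r"[-][^aAiIuUfFxXeEoOn]*[aAiIuUfFxXeEoO][^aAiIuUfFxXeEoOn]*$"
)


def has_single_vowel_final_pada(key2):
    """True if the last pada of key2 contains a single vowel."""
    return SINGLE_VOWEL_FINAL.search(key2) is not None


for key2 in ["agra-ga", "madra-pa", "upari-BAva", "varza-kftya-taraM-ga"]:
    print(key2, has_single_vowel_final_pada(key2))
```

Note that the pattern only looks at the text after the last '-', so a headword without any '-' never matches.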

gasyoun commented 6 years ago

> [-][^aAiIuUfFxXeEoOn]*[aAiIuUfFxXeEoO][^aAiIuUfFxXeEoOn]*$ in diff_nopada_m_a yielded 69 such cases.

Dhaval back home again. Feels so good.

funderburkjim commented 6 years ago

agra-ga

Very interesting. As I understand it, these cases where the last pada has a single vowel are currently declined incorrectly according to rule 8.4.12.

As you correctly understood, the current declension system uses the pada structure based on '-' in key2. This pada structure is located in a copy of lexnorm-all2. Since we are using this copy for purposes of inflection, we are not bound to maintain the '-' pada structure implied by MW's levels. I.e., we can change lexnorm-all2 for the examples you listed above. E.g.,

old 1216    agraga  agra-ga m
new 1216    agraga  agraga  m

Then, using agraga, nR sandhi will come into play:

Declension of m_a agraga
Case 1:  agragaH agragO agragAH
Case 2:  agragam agragO agragAn
Case 3:  *agrageRa* agragAByAm agragEH
Case 4:  agragAya agragAByAm agrageByaH
Case 5:  agragAt agragAByAm agrageByaH
Case 6:  agragasya agragayoH *agragARAm*
Case 7:  agrage agragayoH agragezu
Case 8:  agraga agragO agragAH

If you confirm this, I'll go ahead and implement it for the m_a list above.

Then, similar lists can be generated for nominals with other endings.
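
One way the proposed change to the lexnorm-all2 copy might be scripted is sketched below. Several points are assumptions, not taken from the actual files: the record is treated as four whitespace-separated fields (number, key1, key2, category), the output is tab-joined, and for multi-hyphen headwords only the last '-' is removed, since the example above shows just the single-hyphen case. The function names are hypothetical.

```python
VOWELS = set("aAiIuUfFxXeEoO")


def merge_single_vowel_final_pada(key2):
    """Drop the last '-' when the final pada has one vowel and no 'n'."""
    head, sep, last = key2.rpartition("-")
    if sep and sum(ch in VOWELS for ch in last) == 1 and "n" not in last:
        return head + last
    return key2


def rewrite_record(line):
    """Rewrite key2 in one record (assumed layout: number key1 key2 category)."""
    fields = line.split()
    fields[2] = merge_single_vowel_final_pada(fields[2])
    return "\t".join(fields)


print(rewrite_record("1216    agraga  agra-ga m"))
print(merge_single_vowel_final_pada("ati-BAra-ga"))
```

With the last '-' removed, the declension algorithm sees the merged string as the final pada, so an 'r' in what used to be the preceding pada can trigger the nR change.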

drdhaval2785 commented 6 years ago

Your explanation and execution is OK.

I am yet a bit confused by some of the examples.

> Line 628: varza-kftya-taraM-ga

I am sure that the declension of taraMga is taraMgeRa. So there seems to be some problem with M- combinations preceding the last hyphen. The reason evades me, though. Look at tura-M-ga too.

sanskrit-lexicon / MWinflect

m_a declension -- pada #6
