Documentation of wilmwroots reason codes.

funderburkjim commented 8 years ago

The end result of the correspondence between Wilson and Monier-Williams roots is currently represented in the file wil_mw.txt in the wilmwroots/step2 directory.

Each line of the file represents a correspondence between the Wilson and MW spellings of one root. The spellings are in the SLP1 transliteration. Here is the first line as a sample:

<c>SPa-ROOT</c> <wil>aMSa</wil> <mw>aMS</mw>

The content of the <c> element is a reason code.

The purpose of this issue is to document the meaning of these reason codes. The meanings are paraphrasings of the 'real' meaning which is expressed in the program wil_mw.py.
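
For readers who want to work with the file directly, here is a minimal sketch of reading one record. It is not taken from wil_mw.py; the names `RECORD_RE` and `parse_record` are hypothetical, and the pattern assumes every record has the three-element shape of the sample line above.

```python
import re

# Assumed record shape: <c>code</c> <wil>spelling</wil> <mw>spelling</mw>
RECORD_RE = re.compile(r"<c>(.*?)</c>\s*<wil>(.*?)</wil>\s*<mw>(.*?)</mw>")

def parse_record(line):
    """Return (reason_code, wilson_spelling, mw_spelling), or None if the line does not match."""
    m = RECORD_RE.search(line)
    return m.groups() if m else None

print(parse_record("<c>SPa-ROOT</c> <wil>aMSa</wil> <mw>aMS</mw>"))
```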

funderburkjim commented 8 years ago

The wil_mw.py program prints a tabulation of the reason codes:

Tabulation of reasons for matching
SPa-ROOT-rdD 1
saN-SPb-ROOT 1
SPa-ROOT-nasal 98
SPa-ROOT-sj 3
Probable 15
ROOT 191
z-ROOT 13
R-SPa-ROOT 24
Intensive 1
Likely 25
SPa-ROOT 1212
saN-SPa-ROOT 1
SPc-ROOT-nasal 3
z-SPa-ROOT 62
SPc-ROOT 14
None 35
SPb-ROOT 24
SPa-ROOT-r 23
Causal 6
SPb-ROOT-nasal 1
R-ROOT 3

There are 21 of these codes which need explaining.
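
A tabulation of this kind can be sketched with a counter over the `<c>` elements. This is an illustration only, not the code in wil_mw.py; `tabulate` and `CODE_RE` are hypothetical names, and the three sample records are ones quoted in this issue.

```python
import re
from collections import Counter

CODE_RE = re.compile(r"<c>(.*?)</c>")

def tabulate(lines):
    """Count how many records carry each reason code."""
    counts = Counter()
    for line in lines:
        m = CODE_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

sample = [
    "<c>SPa-ROOT</c> <wil>aMSa</wil> <mw>aMS</mw>",
    "<c>ROOT</c> <wil>uDras</wil> <mw>uDras</mw>",
    "<c>ROOT</c> <wil>drA</wil> <mw>drA</mw>",
]

print("Tabulation of reasons for matching")
for code, n in tabulate(sample).items():
    print(code, n)
```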

funderburkjim commented 8 years ago

None (35) This is easiest to explain. These Wilson roots have NO correspondence, thus far, to an MW root.

The underlying records of the Wilson digitization for these cases are in the wil_mw_prob.txt file.

funderburkjim commented 8 years ago

ROOT (191) is the simplest match. In this case, the Wilson spelling of the root is identical to the MW spelling. Example: <c>ROOT</c> <wil>uDras</wil> <mw>uDras</mw>

SPa-ROOT (1212) is the most common reason, by far. In this case (see the aMSa example above), to get the MW root spelling, one must drop the final 'a' from the Wilson root spelling.

SPb-ROOT (24) The MW spelling is obtained by adding 'ya' to the Wilson spelling. Example: <c>SPb-ROOT</c> <wil>Una</wil> <mw>Unaya</mw>. At least in this instance, MW classifies the root Unaya as a denominative (from noun Una).

SPc-ROOT (14) In addition to dropping the 'a' anubandha from the Wilson spelling, the MW spelling also replaces Wilson's 'cC' with 'C'. Example: <c>SPc-ROOT</c> <wil>jarcCa</wil> <mw>jarC</mw>
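
The four basic rules above can be sketched as a small classifier. This paraphrases the descriptions in this comment and is not the logic of wil_mw.py; the function names and the order in which the rules are tried are assumptions.

```python
def drop_final_a(wil):
    """SPa: drop the final anubandha 'a'; return None if there is none."""
    return wil[:-1] if wil.endswith("a") else None

def basic_reason(wil, mw):
    """Return the first basic reason code whose rule turns wil into mw, else None."""
    stem = drop_final_a(wil)
    if wil == mw:
        return "ROOT"          # identical spellings
    if stem == mw:
        return "SPa-ROOT"      # drop final 'a'
    if wil + "ya" == mw:
        return "SPb-ROOT"      # append 'ya'
    if stem is not None and stem.replace("cC", "C") == mw:
        return "SPc-ROOT"      # drop final 'a', then cC -> C
    return None

for wil, mw in [("uDras", "uDras"), ("aMSa", "aMS"), ("Una", "Unaya"), ("jarcCa", "jarC")]:
    print(wil, mw, basic_reason(wil, mw))
```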

gasyoun commented 8 years ago

Probable 15 Likely 25

What's the difference?

funderburkjim commented 8 years ago

SPa-ROOT-r (23) Wilson spellings generally double a consonant after 'r', such as karmman. In this category, the MW spelling is obtained by (a) dropping the anubandha a, and (b) undoubling a consonant after 'r'. Example: <c>SPa-ROOT-r</c> <wil>arbba</wil> <mw>arb</mw>

SPa-ROOT-rdD (1) The only case is <c>SPa-ROOT-rdD</c> <wil>spardDa</wil> <mw>sparD</mw>. This is conceptually similar to the SPa-ROOT-r case, except that in the consonant doubling of the aspirated consonant D, the doubling letter loses aspiration.

SPa-ROOT-sj (3) In addition to dropping the anubandha a, an sj in Wilson spelling becomes jj in MW. Example: <c>SPa-ROOT-sj</c> <wil>Brasja</wil> <mw>Brajj</mw>
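
A sketch of these three variants, assuming each is applied after dropping the final 'a'. The function names are hypothetical, and the regular expression relies on SLP1 using one ASCII letter per sound; none of this is taken from wil_mw.py.

```python
import re

def drop_final_a(wil):
    """Drop the anubandha 'a' if present."""
    return wil[:-1] if wil.endswith("a") else wil

def spa_root_r(wil):
    """SPa-ROOT-r: undouble a consonant that follows 'r' (rbb -> rb)."""
    return re.sub(r"r([a-zA-Z])\1", r"r\1", drop_final_a(wil))

def spa_root_rdD(wil):
    """SPa-ROOT-rdD: the doubling letter 'd' of aspirated 'D' is removed (rdD -> rD)."""
    return drop_final_a(wil).replace("rdD", "rD")

def spa_root_sj(wil):
    """SPa-ROOT-sj: sj -> jj."""
    return drop_final_a(wil).replace("sj", "jj")

print(spa_root_r("arbba"), spa_root_rdD("spardDa"), spa_root_sj("Brasja"))
```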

funderburkjim commented 8 years ago

Several spellings in Wilson begin with the retroflex sibilant 'z', which becomes 's' in MW. In addition, some other allied changes may occur (zw -> st, zW -> sT, zR -> sn).

z-ROOT (13) Here, after the z/s change, the Wilson spelling is identical to MW spelling. Example: <c>z-ROOT</c> <wil>zWA</wil> <mw>sTA</mw>

z-SPa-ROOT (62) Here, in addition to the z/s change, the anubandha a is dropped to get MW spelling. Example: <c>z-SPa-ROOT</c> <wil>zada</wil> <mw>sad</mw>. Example 2: <c>z-SPa-ROOT</c> <wil>zvartta</wil> <mw>svart</mw>, where rtt -> rt also applies.
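
The z/s change and its allied cluster changes might be sketched as follows. This assumes the changes apply only at the start of the root and that the r-undoubling step is folded into the z-SPa-ROOT case, as the zvartta example suggests; the names are hypothetical and this is not the wil_mw.py code.

```python
import re

# Initial clusters whose second letter also changes (zw -> st, zW -> sT, zR -> sn).
ALLIED = {"zw": "st", "zW": "sT", "zR": "sn"}

def z_to_s(wil):
    """Replace an initial retroflex 'z' by 's', with the allied cluster changes."""
    if wil[:2] in ALLIED:
        return ALLIED[wil[:2]] + wil[2:]
    if wil.startswith("z"):
        return "s" + wil[1:]
    return wil

def z_spa_root(wil):
    """z/s change, drop final 'a', then undouble a consonant after 'r'."""
    stem = z_to_s(wil)
    if stem.endswith("a"):
        stem = stem[:-1]
    return re.sub(r"r([a-zA-Z])\1", r"r\1", stem)

print(z_to_s("zWA"), z_spa_root("zada"), z_spa_root("zvartta"))
```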

funderburkjim commented 8 years ago

Wilson doesn't normally have separate entries for prefixed verbs. However, in two cases with prefix sam, the Wilson spelling uses the homorganic nasal ('N') before a velar consonant, while MW uses anusvAra 'M'.

saN-SPa-ROOT (1) Example: <c>saN-SPa-ROOT</c> <wil>saNgrAma</wil> <mw>saMgrAm</mw>

saN-SPb-ROOT (1) Example: <c>saN-SPb-ROOT</c> <wil>saNketa</wil> <mw>saMketaya</mw>
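
A sketch of the two saN cases, assuming the only prefix change is the leading saN -> saM and that the SPa (drop 'a') or SPb (add 'ya') step follows. The function names are hypothetical.

```python
def san_to_sam(wil):
    """Replace a leading 'saN' by 'saM' (homorganic nasal -> anusvAra)."""
    return "saM" + wil[3:] if wil.startswith("saN") else wil

def san_spa_root(wil):
    """saN-SPa-ROOT: prefix change, then drop the final 'a'."""
    stem = san_to_sam(wil)
    return stem[:-1] if stem.endswith("a") else stem

def san_spb_root(wil):
    """saN-SPb-ROOT: prefix change, then append 'ya'."""
    return san_to_sam(wil) + "ya"

print(san_spa_root("saNgrAma"), san_spb_root("saNketa"))
```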

funderburkjim commented 8 years ago

Several Wilson root spellings begin with the retroflex nasal R, while MW uses the dental nasal n.

R-ROOT (3) Example: <c>R-ROOT</c> <wil>RI</wil> <mw>nI</mw>

R-SPa-ROOT (24) Example: <c>R-SPa-ROOT</c> <wil>RaSa</wil> <mw>naS</mw>. In these, the Wilson spelling also has the a anubandha.
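
These two codes can be sketched as a check on a (Wilson, MW) pair, assuming the R/n change applies only to the initial letter. `r_reason` is a hypothetical name, not a function in wil_mw.py.

```python
def r_reason(wil, mw):
    """Return 'R-ROOT' or 'R-SPa-ROOT' if the initial R -> n rule explains the pair, else None."""
    if not wil.startswith("R"):
        return None
    dental = "n" + wil[1:]            # retroflex R -> dental n
    if dental == mw:
        return "R-ROOT"
    if dental.endswith("a") and dental[:-1] == mw:
        return "R-SPa-ROOT"           # additionally drop the anubandha 'a'
    return None

print(r_reason("RI", "nI"), r_reason("RaSa", "naS"))
```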

funderburkjim commented 8 years ago

For several roots, MW spelling has a penultimate nasal which is not present in WIL spelling. In some (all?) cases, Wilson also shows an anubandha 'i', although in our derivation of the headword spelling this 'i' does not appear.

SPa-ROOT-nasal (98). To get the MW spelling, the ending a anubandha is dropped and a nasal (homorganic to the final consonant) is inserted. Example: <c>SPa-ROOT-nasal</c> <wil>tvaga</wil> <mw>tvaNg</mw>

SPb-ROOT-nasal (1) Example: <c>SPb-ROOT-nasal</c> <wil>tatra</wil> <mw>tantraya</mw>

SPc-ROOT-nasal (3) Example: <c>SPc-ROOT-nasal</c> <wil>lAcCa</wil> <mw>lAYC</mw>
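
One way to sketch the nasal insertion is to place the class nasal before the last stop consonant of the stem. That generalization is an assumption chosen so that all three examples above come out right (including tatra, where the stop is not the final letter); it is not necessarily how wil_mw.py does it, and roots whose last consonant is not a stop are not handled.

```python
# SLP1 stops grouped by place of articulation, with the nasal of each class.
NASAL_FOR = {}
for stops, nasal in [("kKgG", "N"), ("cCjJ", "Y"), ("wWqQ", "R"), ("tTdD", "n"), ("pPbB", "m")]:
    for stop in stops:
        NASAL_FOR[stop] = nasal

def insert_nasal(stem):
    """Insert the homorganic nasal before the last stop in stem; None if there is no stop."""
    for i in range(len(stem) - 1, -1, -1):
        if stem[i] in NASAL_FOR:
            return stem[:i] + NASAL_FOR[stem[i]] + stem[i:]
    return None

print(insert_nasal("tvag"))                       # SPa: tvaga with 'a' dropped
print(insert_nasal("tatra") + "ya")               # SPb: 'ya' appended
print(insert_nasal("lAcC".replace("cC", "C")))    # SPc: lAcCa with 'a' dropped, cC -> C
```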

funderburkjim commented 8 years ago

Wilson has several root entries, which he describes as class 10 roots, which correspond to the Causal form of other roots.

Causal (6). Example: <c>Causal</c> <wil>jYapa</wil> <mw>jYA</mw>

Similarly, there is one Wilson root which corresponds to the intensive stem of an MW root:

Intensive (1). Example: <c>Intensive</c> <wil>daridrA</wil> <mw>drA</mw>

funderburkjim commented 8 years ago

Finally, there are two lists of correspondences for which no formal correspondence scheme has been identified. These no doubt require further examination and explanation.

Likely (25). Example: <c>Likely</c> <wil>uYJa</wil> <mw>ujJ</mw>

Probable (15). Example: <c>Probable</c> <wil>kfba</wil> <mw>kfv</mw>

As of this writing, the difference between the two categories is not clear. Possibly, I thought of the correspondences of the Likely class as more certain than those of the Probable class.

funderburkjim commented 8 years ago

This analysis does not take into account the fact that some roots appear multiple times in each dictionary. At some point, a more refined correspondence using 'L-numbers' will be appropriate.

However, before dealing with this refinement, it seems appropriate to develop a system for comparing the entries of Wilson and MW which this current work alleges to be correspondent.

drdhaval2785 commented 8 years ago

Intensive (1). Example: <c>Intensive</c> <wil>daridrA</wil> <mw>drA</mw>

drA and daridrA are separate verbs. daridrA is one of the rare Sanskrit verbs with more than one vowel.

gasyoun commented 8 years ago

Wilson also shows an anubandha 'i', although in our derivation of the headword spelling this 'i' does not appear.

Do you know why?

However, before dealing with this refinement, it seems appropriate to develop a system for comparing the entries of Wilson and MW which this current work alleges to be correspondent.

As your classification is very detailed, I'm thinking of some universal rules and adding numbers before them, to show hierarchy.

Like

1.0 SPa-ROOT-r (23)
1.1 SPa-ROOT-rdD (1)

drA and daridrA are separate verbs.

That's the vyakarana point of view. From the history of Indo-European languages it's not 2 verbs, but 1 verb. It depends on the point of view. I'm on Jim's side.

funderburkjim commented 8 years ago

drA and daridrA are separate verbs.

Excellent. This is just the kind of erroneous or questionable matching that needs to be identified.

@gasyoun re "From the history of Indo-European languages it's not 2 verbs, but 1 verb.": could you elaborate?

Incidentally, there is also <c>ROOT</c> <wil>drA</wil> <mw>drA</mw>

Why is there no verb daridrA in MW? Is it treated as a derived form of some other root?

I haven't yet thought about a format for entering corrections and/or questions regarding these correspondences. As a preliminary step, I've made [wil_mw_questions.txt](https://github.com/sanskrit-lexicon/WIL/blob/master/wilmwroots/step2/wil_mw_questions.txt) and entered daridrA.

funderburkjim commented 8 years ago

re <c>SPa-ROOT-nasal</c> <wil>tvaga</wil> <mw>tvaNg</mw>:
Why is the digitization spelling of the root tvaga, when there is an 'i' anubandha?

The technical explanation is as follows.

Scan: (image)

Corresponding digitization in wil.txt:

.{#tvaga#}¦ ({#i#}) {#tvagi#} r. 1st cl. ({#tvaMgati#}) To go, to move.

The derivation of the headword from this line just uses .{#tvaga#}¦, so that's where 'tvaga' comes from.

In other words, we show 'tvaga' because that appears to be what Wilson has.
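
The headword derivation just described can be sketched like this. The regular expressions are assumptions based only on the single wil.txt line quoted above (a leading `.{#...#}¦` marker, with other Sanskrit text in `{#...#}`); `headword` and `marked_spans` are hypothetical names.

```python
import re

LINE = ".{#tvaga#}¦ ({#i#}) {#tvagi#} r. 1st cl. ({#tvaMgati#}) To go, to move."

def headword(line):
    """Return the text of the leading .{#...#}¦ marker, or None if the line has none."""
    m = re.match(r"\.\{#(.*?)#\}¦", line)
    return m.group(1) if m else None

def marked_spans(line):
    """Return every {#...#} span in the line, in order."""
    return re.findall(r"\{#(.*?)#\}", line)

print(headword(LINE))
print(marked_spans(LINE))
```

The second function shows that the 'i' anubandha and the 'tvagi' form are present in the digitized line but sit outside the headword marker.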

The real explanation would be to understand why Wilson wrote his entry as he did. No doubt he was working with some DAtupAWa, but do we know which one?

In looking at Normalized root 'tvag' in the mADavIya version, I do not see support for a 'tvaga' spelling; so it would seem more logical for Wilson to have written his headword as 'tvagi'.

gasyoun commented 8 years ago

Why is there no verb daridrA in MW? Is it treated as a derived form of some other root?

Exactly! daridrA comes from drA - even if Panini would have a sutra that would state it does not. So I guess MW accepted it as obvious.

Why is the digitization spelling of the root tvaga, when there is an 'i' anubandha?

This one goes to @drdhaval2785

Wilson wrote his entry as he did. No doubt he was working with some DAtupAWa, but do we know which one?

By that date there was Rosen and Westergaard.

funderburkjim commented 8 years ago

Here is a link to Westergaard.

The following was found in a 2010 file (WestergaardDhP.xml):

<sutra msid="01.0091"><root wsid="5.14">uKa</root><root wsid="5.15">uKi</root>
<root wsid="5.16">vaKa</root><root wsid="5.17">vaKi</root><root wsid="5.18">maKa</root>
<root wsid="5.19">maKi</root><root wsid="5.20">RaKa</root><root wsid="5.21">RaKi</root>
<root wsid="5.22">raKa</root><root wsid="5.23">raKi</root><root wsid="5.24">laKa</root>
<root wsid="5.25">laKi</root><root wsid="5.26">iKa</root><root wsid="5.27">iKi</root>
<root wsid="5.28">IKi</root><root wsid="5.29" nomdp="yes">traKi</root>
<root wsid="5.30" nomdp="yes">traKa</root><root wsid="5.31" nomdp="yes">SiKi</root>
<root wsid="5.32" nomdp="yes">riKa</root><root wsid="5.33" nomdp="yes">riKi</root>
<root wsid="5.34" nomdp="yes">liKi</root><root wsid="5.35">valgi</root>
<root wsid="5.36">ragi</root><root wsid="5.37">lagi</root><root wsid="5.38">agi</root>
<root wsid="5.39">vagi</root><root wsid="5.40">magi</root><root wsid="5.41">tagi</root>
<root wsid="5.42">tvagi</root><root wsid="5.43">Sragi</root>
<root wsid="5.44" nomdp="yes">Svagi</root><root wsid="5.45">Slagi</root>
<root wsid="5.46">igi</root><root wsid="5.47">rigi</root><root wsid="5.48">ligi
</root><sense>gatyarTAH</sense></sutra>
<sutra msid="01.0092"><root wsid="5.49">tvagi</root><sense>kampane ca</sense></sutra>

So tvagi appears in 5.49 and 5.42 (section 5 in above display).

In particular, 'tvaga' is not found.
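
A lookup like this can be sketched with the standard library XML parser. The excerpt below is abbreviated from the display above, and the `<dhatupatha>` wrapper element is hypothetical, added only so that the fragment has a single root; the real layout of WestergaardDhP.xml may differ.

```python
import xml.etree.ElementTree as ET

# Abbreviated excerpt; the <dhatupatha> wrapper is an assumption, not from the file.
EXCERPT = """<dhatupatha>
<sutra msid="01.0091"><root wsid="5.41">tagi</root><root wsid="5.42">tvagi</root>
<root wsid="5.43">Sragi</root><sense>gatyarTAH</sense></sutra>
<sutra msid="01.0092"><root wsid="5.49">tvagi</root><sense>kampane ca</sense></sutra>
</dhatupatha>"""

def find_root(xml_text, spelling):
    """Return (msid, wsid) pairs for every <root> whose text equals spelling."""
    tree = ET.fromstring(xml_text)
    return [
        (sutra.get("msid"), root.get("wsid"))
        for sutra in tree.iter("sutra")
        for root in sutra.iter("root")
        if (root.text or "").strip() == spelling
    ]

print(find_root(EXCERPT, "tvagi"))
print(find_root(EXCERPT, "tvaga"))
```

The `strip()` call matters for the full display, where one root (`ligi`) is followed by a line break inside its element.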

BTW, title page for Westergaard shows 1841, and for Wilson shows 1832.

gasyoun commented 8 years ago

BTW, title page for Westergaard shows 1841, and for Wilson shows 1832.

They had similar sources, Jim.

sanskrit-lexicon / WIL

Documentation of wilmwroots reason codes. #4