funderburkjim / kosha-dev

Develop xml and html for anekArthaka and samAnArthaka Sanskrit dictionaries

v4 - abch - Abhidhānacintāmaṇi of Hemacandrācārya #12

Closed funderburkjim closed 1 year ago

funderburkjim commented 1 year ago

First implementation of a samānārthaka kosha.

funderburkjim commented 1 year ago

Sample url: https://sanskrit-lexicon.uni-koeln.de/work/kosha-dev/v4/apps/abch/web/webtc2/index.php

image
funderburkjim commented 1 year ago

@drdhaval2785 request your review.

Note that no 'abch1.xml' was created. I am not sure whether your xml_xml1.py is to be adapted to abch.

Where are scanned images for both anhk and abch?

drdhaval2785 commented 1 year ago

Thank you very much Jim for this implementation.

drdhaval2785 commented 1 year ago

I will check whether xml_xml1.py can be adapted to this. I will also look for good scans of ANHK and ABCH.

drdhaval2785 commented 1 year ago

image

In the above example, there is no need to show the “DenukA, vaSA” synonym set. Only the synset containing the word “hastin” should be displayed.

drdhaval2785 commented 1 year ago

ABCH scan - https://archive.org/details/abhidhanasangrahapanditsivadattadurgaprasadakasinathpandurangparabnirnayasagar_613_y/page/n131/mode/1up Page 135 onwards

drdhaval2785 commented 1 year ago

ANHK - https://archive.org/details/anekarthanamamala/page/n3/mode/1up

The scan is somewhat peculiar: each image contains two pages, and the fingers of the person scanning are visible. If need be, we can make manual corrections such as cropping to make it more usable.

funderburkjim commented 1 year ago

@drdhaval2785 Sorry to be so long in getting back to this.

Any theories on why धेनुका वशापि was included?

My understanding of how to interpret this kosha is quite low.

For instance, what is the correspondence between abch.txt entries (L=...) and verses? A simpler relation would be one verse per one L and all the words in a verse are synonyms.

A second question arises at L=1531. Here there is only one word in each of 4 syns fields. What is it supposed to mean that Badra is a synonym? A synonym of what? Surely a list of synonyms must have at least two words.

So, I'm in need of some tutoring in how to read this book.

drdhaval2785 commented 1 year ago

Original verses

हस्ती मतङ्गजगजद्विपकर्यनेकपा मातङ्गवारणमहामृगसामयोनयः ।
स्तम्बेरमद्विरदसिन्धुरनागदन्तिनो दन्तावलः करटिकुञ्जरकुम्भिपीलवः ॥ १२१७ ॥
इभः करेणुर्गर्जोऽस्य स्त्री धेनुका वशापि च ।
भद्रो मन्द मृगो मिश्रश्चतस्रो गजजातयः ॥ १२१८ ॥

The problem with annotating one verse per L is as follows. The synonym sets are of arbitrary length. In some cases a set is a quarter of a verse. In other cases it runs across 4-5 verses.

In the present case, the synonyms of elephant run through the whole of verse 1217 and the first quarter of 1218:

हस्ती मतङ्गजगजद्विपकर्यनेकपा मातङ्गवारणमहामृगसामयोनयः ।
स्तम्बेरमद्विरदसिन्धुरनागदन्तिनो दन्तावलः करटिकुञ्जरकुम्भिपीलवः ॥ १२१७ ॥
इभः करेणुर्गर्जः

The second quarter of 1218 is a synset for the female elephant:

अस्य स्त्री धेनुका वशापि च ।

The third and fourth quarters of 1218 are not technically synonyms. They are types of elephants (Badra, manda, mfga, miSra), and are not synonyms of each other:

भद्रो मन्द मृगो मिश्रश्चतस्रो गजजातयः ॥ १२१८ ॥

Therefore, they are enumerated separately.

It is unnecessary to break the verse between इभः करेणुर्गर्जः and अस्य स्त्री धेनुका वशापि च । Therefore, both synsets 3087 and 3088 belong to the same dictionary entry, i.e. L number 1530.

Annotated version

<L>1530<pc>47
<eid>3087<syns>हस्तिन्-पुं,मतङ्गज-पुं,गज-पुं,द्विप-पुं,करिन्-पुं,अनेकपा-पुं,मातङ्ग-पुं,वारण-पुं,महामृग-पुं,सामयोनि-पुं,स्तमेरम-पुं,द्विरद-पुं,सिन्धु-पुं,नाग-पुं,दन्तिन्-पुं,दन्तावल-पुं,करटिन्-पुं,कुञ्जर-पुंक्ली,कुम्भिन्-पुं,पीलु-पुं,इभ-पुं,करेणु-पुंस्त्री,गर्ज-पुं
<eid>3088<syns>धेनुका-स्त्री,वशा-स्त्री
हस्ती मतङ्गजगजद्विपकर्यनेकपा मातङ्गवारणमहामृगसामयोनयः ।
स्तम्बेरमद्विरदसिन्धुरनागदन्तिनो दन्तावलः करटिकुञ्जरकुम्भिपीलवः ॥ १२१७ ॥
इभः करेणुर्गर्जोऽस्य स्त्री धेनुका वशापि च ।
<LEND>
<L>1531<pc>47
<eid>3089<syns>भद्र-पुं
<eid>3090<syns>मन्द-पुं
<eid>3091<syns>मृग-पुं
<eid>3092<syns>मिश्र-पुं
<eid>3093<syns>गजजाति-स्त्री
भद्रो मन्द मृगो मिश्रश्चतस्रो गजजातयः ॥ १२१८ ॥
<LEND>
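
For readers who want to experiment with the record layout shown above, here is a minimal parsing sketch. It assumes only what the sample shows: a `<L>…<pc>…` header line, zero or more `<eid>…<syns>…` lines whose items are comma-separated `stem-gender` pairs, free verse lines, and a closing `<LEND>`. The function name `parse_entries` and the dictionary keys are hypothetical and are not taken from make_xml.py.

```python
import re

# Entry L=1531, copied from the annotated sample in this thread.
SAMPLE = """<L>1531<pc>47
<eid>3089<syns>भद्र-पुं
<eid>3090<syns>मन्द-पुं
<eid>3091<syns>मृग-पुं
<eid>3092<syns>मिश्र-पुं
<eid>3093<syns>गजजाति-स्त्री
भद्रो मन्द मृगो मिश्रश्चतस्रो गजजातयः ॥ १२१८ ॥
<LEND>"""


def parse_entries(text):
    """Split kosha text into entries (hypothetical helper, not from the repo)."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"<L>(\d+)<pc>(\S+)$", line)
        if header:
            # Start a new entry at each <L> header line.
            current = {"L": header.group(1), "pc": header.group(2),
                       "synsets": [], "verse": []}
            continue
        if current is None or not line:
            continue  # ignore text outside an entry and blank lines
        if line == "<LEND>":
            entries.append(current)
            current = None
            continue
        syn = re.match(r"<eid>(\d+)<syns>(.*)$", line)
        if syn:
            words = []
            for item in syn.group(2).split(","):
                # The gender tag follows the last hyphen: "stem-gender".
                stem, sep, gender = item.rpartition("-")
                if not sep:  # no gender tag present
                    stem, gender = item, ""
                words.append((stem, gender))
            current["synsets"].append({"eid": syn.group(1), "words": words})
        else:
            # Anything else inside an entry is treated as verse text.
            current["verse"].append(line)
    return entries


entries = parse_entries(SAMPLE)
```

Keeping each synset as its own object, keyed by eid, is what lets one L entry carry several independent synsets alongside the shared verse text.
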
drdhaval2785 commented 1 year ago

It is quite easy to understand this kind of dictionary. They are not compiled word by word. They are compiled based on semantic relationships.

तिर्यक्काण्ड (other than humans) -> पञ्चेन्द्रियाः (those with five sense organs) -> हस्तिन् (elephant) -> everything associated with an elephant (like its wife, its types, different names according to the age of the elephant, different names based on usage, its body organs, instruments for controlling an elephant, instruments for decorating an elephant, etc.)

drdhaval2785 commented 1 year ago

When we are sure that the user has asked for 'hastin', it is useful to show only eid 3087. It would not make sense to show the second synset containing 'DenukA', i.e. eid 3088.
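
The display rule described here can be sketched as a filter over the synsets of one entry. This is only an illustration under assumptions: the entry is modelled as a plain dictionary, the first synset is abbreviated to three of its words, the lookup key is the Devanagari stem (transliteration from SLP1 input such as 'hastin' is left out), and `synsets_for_headword` is a hypothetical name, not the actual change made in make_xml.py.

```python
# Entry L=1530 from this thread, with the first synset abbreviated.
entry_1530 = {
    "L": "1530",
    "synsets": [
        {"eid": "3087", "words": ["हस्तिन्", "मतङ्गज", "गज"]},  # abbreviated
        {"eid": "3088", "words": ["धेनुका", "वशा"]},
    ],
}


def synsets_for_headword(entry, headword):
    """Return only the synsets of `entry` that contain `headword`."""
    return [s for s in entry["synsets"] if headword in s["words"]]


# A search for the elephant stem selects eid 3087 and skips eid 3088.
shown = [s["eid"] for s in synsets_for_headword(entry_1530, "हस्तिन्")]
```

The verse text of the entry can still be shown in full; only the list of synsets is narrowed to the one the user asked about.
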

drdhaval2785 commented 1 year ago

Regarding your question about 1531: these are types of elephants. They are not synonyms but distinct types, so it would not be right to club them together as one synset. They are therefore shown separately, one per eid, with no synonym provided. But since the headword is there, we should allow the user to reach this entry by searching for 'Badra', 'manda', etc.
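
One way to make such one-word synsets reachable is a headword index that maps every stem to the eids containing it, regardless of how many words the synset has. The sketch below is an assumption about how this could be done, not a description of the existing code. `build_headword_index` is a hypothetical name, and the keys are the Devanagari stems from the annotated sample.

```python
from collections import defaultdict

# (eid, stems) pairs for entry L=1531, copied from the annotated sample.
synsets_1531 = [
    ("3089", ["भद्र"]),
    ("3090", ["मन्द"]),
    ("3091", ["मृग"]),
    ("3092", ["मिश्र"]),
    ("3093", ["गजजाति"]),
]


def build_headword_index(synsets):
    """Map each stem to the list of eids whose synset contains it."""
    index = defaultdict(list)
    for eid, words in synsets:
        for word in words:
            # A synset with a single word is indexed like any other.
            index[word].append(eid)
    return dict(index)


index = build_headword_index(synsets_1531)
```

With this index a search for भद्र leads to eid 3089 even though that synset lists no second word.
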

drdhaval2785 commented 1 year ago

@funderburkjim I hope that this clarifies your doubts regarding how to parse this dictionary.

funderburkjim commented 1 year ago

Your comments are appreciated. My doubts are clarified in part, but more discussion is required. I will try to formulate the remaining doubts in the next few comments.

funderburkjim commented 1 year ago

fix DenukA problem

Change to make_xml.py. See commit above.

funderburkjim commented 1 year ago

hastin display

Showing that the DenukA synset is not displayed, as desired.

image
funderburkjim commented 1 year ago

status of manda question

Just note here that solving the DenukA problem also impacts the display of manda, mentioned above.

image

There is more to discuss re manda. Let's do that in another issue, and consider this first issue closed.

drdhaval2785 commented 1 year ago

Thanks Jim. It is much better now.