sanskrit-lexicon / COLOGNE

Development of http://www.sanskrit-lexicon.uni-koeln.de/

MW compounds below parent headword #315

Open drdhaval2785 opened 3 years ago

drdhaval2785 commented 3 years ago

Problem

After we upgraded MW compounds to full-fledged headwords, a user sent the following mail:

I was wondering why the MW99 left out the word endings for the HWs, after putting that much effort for so many years. However they are retained in the later addition of MW72.
Can you find out the background for this?

So, it seems that there are still people out there who would like to see the compounds enlisted under the headwords.

Solution

  1. There is no problem in showing the compounds just below the headword, as we did earlier. If I search for, say, ID 4024, I should see entries 4024.1, 4024.2, etc. just below the 4024 entry. Not a difficult task programmatically. Right, @funderburkjim?

  2. We can let the user choose whether or not to show the compounds below the headwords.

  3. We can ignore such hue and cry and ask them to use the 'prefix' module while searching.

I prefer solution 1. What do others think?

gasyoun commented 3 years ago

still people out there who would like to see the compounds enlisted under the headwords.

And we can surely understand them. 3 is not what we want, if Jim can code it. Showing 1 by default can make the page huge. Maybe have a list of headwords below, without the entry text?

funderburkjim commented 3 years ago

I don't understand the '4024' example.

In MW, L=4024 is headword aDavA, and it has no listed compounds, and there is no L=4024.1 in MW.

Please explain again the problem.

drdhaval2785 commented 3 years ago

I did not mean 4024 as a precise number. It was supposed to be an example.

When we search for a headword, compounds should also be listed below.

gasyoun commented 3 years ago

When we search for a headword, compounds should also be listed below.

At least as an option, right.

funderburkjim commented 3 years ago

In trying to read Dhaval's mind, here's what results.

In MW, we have markup of the author's 4 lines of headwords: H1-4.

The main two lines (H1 and H2) can have what are usually compounds.

For example, headword aṃśa is an H1 headword with several compounds:

image

I conjecture that it is desired to make displays of aṃśa somehow provide this list of compounds of aṃśa .

Is this the idea?

funderburkjim commented 3 years ago

What are the compounds

I made some displays illustrating all the direct compounds of MW H1/2 headwords. These are, by definition, the H3 headwords that follow a particular H1/2 headword.

(H3 headwords can also have compounds, which are H4 headwords; but these are not included in the following).

These displays show all the H1/2 headwords which have following H3 entries.
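
The definition above (H3 headwords that follow a particular H1/2 headword) can be sketched in Python. This is only a minimal sketch, assuming the headwords are available as (level, key) pairs in book order. The function name `direct_compounds`, the input layout, and the sample levels are hypothetical and are not the code that produced the displays.

```python
def direct_compounds(entries):
    """Group H3 headwords under the nearest preceding H1/H2 headword.

    `entries` is an iterable of (level, key) pairs in book order, with
    level one of "H1", "H2", "H3", "H4" (a hypothetical input layout).
    H4 entries are skipped, as in the displays described above.
    Returns a list of (parent, children) pairs, only for parents that
    have at least one H3 child.
    """
    groups = []
    current = None  # (parent_key, children) for the latest H1/H2
    for level, key in entries:
        if level in ("H1", "H2"):
            current = (key, [])
            groups.append(current)
        elif level == "H3" and current is not None:
            current[1].append(key)
        # H4 (and anything else) is ignored in this sketch
    return [(parent, kids) for parent, kids in groups if kids]


# Illustrative sample; the levels shown here are assumptions.
sample = [
    ("H1", "aDavA"),        # no compounds follow
    ("H1", "aṃśa"),
    ("H3", "aṃśakaraṇa"),
    ("H3", "aṃśakalpanā"),
]
for parent, kids in direct_compounds(sample):
    print(parent, kids)
```

A list of pairs is used rather than a dictionary so that homonymous headwords with the same spelling would stay separate.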

Some random statistics:

Verbs with compounds?

There are 1247 H1/2 headwords which

For example: 0004:niḥśvas VERB:niḥśvasana niḥśvasita niḥśvasya niḥśvāsa

The H3 children are kṛdantas (I think that's the right grammatical term) in most cases, rather than samāsas.

funderburkjim commented 3 years ago

display addition needed?

There are some other, partial ways to get at the compounds.

Advanced search for headwords with a given prefix.

For example, using prefix aṃśa results in 16 items:

image

Compare this to the 18 H3 children of compounds.txt:

0018:aṃśa:+karaṇa +kalpanā +prakalpanā +pradāna +bhāgin +bhāj +bhū +bhūta 
+rūpiṇī +vat +savarṇana +svara +hara +hārin aṃśā@ṃśa 
aṃśā@ṃsi aṃśā@vataraṇa aṃśīkṛ

Note: the compound aṃśarūpiṇī in this list is not in the book (see image above). Reason: aṃśarūpiṇī is from the Supplement (p. 1308) Otherwise, the printed list and the list above agree.

The difference between Advanced Search and compounds.txt

So 18 - 4 (compounds.txt) = 16 - 2 (Advanced Search) = 14. All comparisons accounted for.
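
The counts for the compounds.txt side can be rechecked with a short Python sketch. It assumes the line format `count:parent:children` as shown in the aṃśa line above, and that a leading `+` abbreviates the parent (so `+rūpiṇī` stands for aṃśarūpiṇī, as the note above implies). The function name `parse_compounds_line` is hypothetical, and the `@` marker is left untouched because its meaning is not stated here.

```python
import unicodedata


def parse_compounds_line(line):
    """Parse one line such as '0018:aṃśa:+karaṇa ... aṃśīkṛ'.

    Returns (declared_count, parent, children). Assumption: a child
    written '+x' abbreviates parent + 'x'; other children are kept
    as written, including any '@' marker.
    """
    # NFC so that 'ā' is one code point and is not mistaken for 'a'
    line = unicodedata.normalize("NFC", line)
    count_field, parent_field, children_field = line.split(":", 2)
    parent = parent_field.split()[0]  # drops a trailing tag such as 'VERB'
    children = [
        parent + item[1:] if item.startswith("+") else item
        for item in children_field.split()
    ]
    return int(count_field), parent, children


LINE = (
    "0018:aṃśa:+karaṇa +kalpanā +prakalpanā +pradāna +bhāgin +bhāj +bhū +bhūta "
    "+rūpiṇī +vat +savarṇana +svara +hara +hārin aṃśā@ṃśa "
    "aṃśā@ṃsi aṃśā@vataraṇa aṃśīkṛ"
)

count, parent, children = parse_compounds_line(LINE)
with_prefix = [c for c in children if c.startswith(parent)]
without_prefix = [c for c in children if not c.startswith(parent)]
print(count, len(children), len(with_prefix), len(without_prefix))
# prints: 18 18 14 4
```

Of the 18 children, 14 begin with the literal string aṃśa and 4 do not, which is consistent with the 18 - 4 above, although this sketch cannot confirm which four entries were meant.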

funderburkjim commented 3 years ago

Simple Search

Using Simple-search with input IAST, typing citation aṃśa gives some of the same list as Advanced Search: image

And if the user clicks the down triangle below aṃśasavarṇana, the list shows the rest in compounds.txt (along with a few more in book order): image

funderburkjim commented 3 years ago

I am ambivalent about the need to insert the list of compounds into the MW display results, given that two other displays provide similar information.

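If we were to decide to insert the list of compounds into the MW display, we would have to decide:

  1. Is the list in compounds.txt what should be shown?

  2. What UI should be used to accomplish this in a useful and visually attractive way?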

What do others think?

gasyoun commented 3 years ago

I will never get tired of saying how much I adore the way Jim documents things.

I conjecture that it is desired to make displays of aṃśa somehow provide this list of compounds of aṃśa . Is this the idea?

Yes.

Some random statistics

Mesmerizing

http://funderburkjim.github.io/MWderivations/compounds/compounds.html

Can we have a row where the total number of children is there as well, please?

a-karmā@nvita

@ stands for sandhi?

12609 records in the table

For a 200k-word dictionary that is not a lot. If H4 children are counted, how many does that add?

The H3 children are kṛdantas (I think that's the right grammatical term) in most cases, rather than samāsas.

Interesting catch.

For example, using prefix aṃśa results in 16 items

Yes, it gives the same results, but you have to know in advance that aṃśa has to be searched in this role. Additional substantial effort is required.

Note: the compound aṃśarūpiṇī in this list is not in the book (see image above). Reason: aṃśarūpiṇī is from the Supplement (p. 1308) Otherwise, the printed list and the list above agree.

The better for us.

Last two words in Advanced search (aṃśaka and aṃśala) are not in compounds.txt

Can we have a 3rd list, where they are all listed? Should we?

Using Simple-search with input IAST, typing citation aṃśa gives some of the same list as Advanced Search:

Yes, but non-aṃśa words as well. So it's harder to weed them out. There is no easy way to see the beginning and the end at once.

And if the user clicks the down triangle below aṃśasavarṇana, the list shows the rest in compounds.txt (along with a few more in book order):

But it loses the beginning, and you can't copy-paste it.

Is the list in compounds.txt what should be shown?

Makes sense.

What UI should be used to accomplish this in a useful and visually attractive way?

A plain bulleted list of links to those articles?