sanskrit-lexicon / PWK

Sanskrit-Wörterbuch in kürzerer Fassung, 7 Bände Petersburg 1879-1889

PWKVN althws editing #87

Closed funderburkjim closed 2 years ago

funderburkjim commented 2 years ago

This issue began in #86. It seems to be a bit more complicated than expected, so let's devote further discussion of althws editing in pwkvn to this new issue.

funderburkjim commented 2 years ago

In working with non-althws entries (Vol. 1-6).txt, I found there to be 133 entries to examine. (Note: This does not include the items with additional comments in that file).

When reviewing these, I unexpectedly found that I disagree with the removal of althws in over half of them.
Here's the list of L-numbers where I think the althws markup should be retained.

488,  882, 1086, 1180, 1539, 1629, 1683, 2008, 2064,
2249, 2810, 2883, 2885, 2893, 2924, 2991, 3129, 3280, 3365,
3446, 3511, 3538, 3542, 3551, 3615, 3813, 3883, 4034, 4094,
4098, 4345, 4366, 4595, 4826, 4839, 4867, 4927, 4946, 5504,
5646, 5828, 5888, 6102, 6244, 6251, 6255, 6270, 6376, 6425,
6441, 6691, 6746, 6747, 6759, 6814, 7024, 7052, 7071, 7110,
7164, 7168, 7178, 7287, 7402, 7438, 7594, 7713, 7717, 7749,
7950, 8156, 8220, 8247, 8347, 8376, 8437, 8508, 8626, 8675,
8757, 8776, 8971, 9027, 9139, 9321, 9381, 9387, 9402

The argument for retaining the althws markup in these cases is that Volume 7 markup also has these marked as headwords (often in althws lists) with reference to vol 1-6 pwkvn (Arabic numeral reference).

@Andhrabharati Please re-examine these and see if you concur.

Cases L=7168, 8508, 8776, 16168, and 17989 are slightly different -- let's put them aside until we get the big list above agreed upon.

Andhrabharati commented 2 years ago

My guiding principle is "taking the pwk main data as the reference to decide the alt. HWs" https://github.com/sanskrit-lexicon/PWK/issues/86#issuecomment-1120174602

Can you give the reason [other than "The argument for retaining the althws markup in these cases is that Volume 7 markup also has these marked as headwords (often in althws lists)"-- this point is also debatable, if & when the Part-2 text is handled] why you think these should be retained? Then I would be in a position to clarify my stand.

Andhrabharati commented 2 years ago

As I looked at random, many in the list you gave contain either the corrections which cannot be treated as alt. HWs, or "Nom." (or "Nom. abstr.") words, which (over 4000 of them) lie 'hidden' in the body portion of the main text.

gasyoun commented 2 years ago

"Nom." (or "Nom. abstr.") words, which (over 4000 of them) lie 'hidden' in the body portion of the main text.

So are they to be extracted or not? Should we not make them alternative headwords now?

Andhrabharati commented 2 years ago

My guiding principle is "taking the pwk main data as the reference to decide the alt. HWs"

As mentioned many times thus far, I look for consistency across the work/text, so unless the main text has all these "taken out" as new entries, I can't suggest doing it in the VN portion.

More important is to have the compound words inside the body "brought out" first, in both PWG and pwk.

Andhrabharati commented 2 years ago

Also the words with lexical change (gender or adv./adj. category) wrt the HW entry that are inside the body portion need to be separated out, in addition to the compound words.

Andhrabharati commented 2 years ago

But, as all this involves changing or introducing new L-numbers, this exercise looks to be beyond what the cdsl team would accept.

drdhaval2785 commented 2 years ago

Addition of L-nums is a settled activity. We do not change them much. Alternate HW or compounds can be 125.1, 125.2 etc.
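
A minimal Python sketch of the decimal scheme mentioned above, for illustration only. The helper names `sub_lnums` and `lnum_sort_key` are hypothetical, and the exact format used in the cdsl files (for example any zero-padding) is an assumption here.

```python
def sub_lnums(base, count):
    """Return hypothetical sub-entry L-numbers such as 125.1, 125.2 for a base L-number."""
    return [f"{base}.{i}" for i in range(1, count + 1)]


def lnum_sort_key(lnum):
    """Sort key that compares the parts numerically, so that 125.10 follows 125.2."""
    return tuple(int(part) for part in str(lnum).split("."))


print(sub_lnums(125, 2))
print(sorted(["125.10", "126", "125.2", "125"], key=lnum_sort_key))
```

Sorting on the numeric parts rather than on the string keeps the sub-entries directly after their parent entry, so existing L-numbers do not have to change.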

gasyoun commented 2 years ago

compound words inside the body "brought out" first, in both PWG and pwk.

A long-wanted one: a list of samasas.

Andhrabharati commented 2 years ago

Just a list? That can easily be obtained.

What I am saying is that they should all be separated from the body portion as new entry words, as in MW.

funderburkjim commented 2 years ago

argument for althws markup in L=488

I will try to explain the reason for retaining the althws markup for the list (488, 882, ...) above by looking at the first example, L=488. I think the reasoning for the other cases in the list will be similar.

Let's start by looking at the volume 7 entry whose text (on page 7-296-c) is (with no markup): {#aDvaratva/#}, {#aDvarama#} und {#aDvaramaya#} I. 1.

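This entry is clearly a list of three headwords. The text 'I. 1.' is interpreted to mean that these three words occur as headwords in volume I of the pw main text (Roman numeral 'I.') and in the pwkvn of volume 1 (Arabic numeral '1.').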

If we do a lookup for any one of these words in one of the combined pwkvn displays, then we should see {#aDvaratva/#}, {#aDvarama#} und {#aDvaramaya#} I. 1. in the 'pwkvn' panel of the display. Thus, the markup in pwkvn.txt for this entry is set as

<L>10669<pc>7-296-c<k1>aDvaratva<k2>aDvaratva/
<althws>{#aDvarama, aDvaramaya#}</althws>
<hw>{#aDvaratva/#}</hw>, <hw>{#aDvarama#}</hw> und <hw>{#aDvaramaya#}</hw> I. 1. 
<LEND>

Now we also expect to see volume1 pwkvn entries in the display of any of these three words. It so happens in this case that these 3 words are spread over two pwkvn entries in volume 1:

<L>487<pc>1-286-c<k1>aDvaratva<k2>aDvaratva/
<hw>{#aDvaratva/#}</hw> <ab>n.</ab> <ab>Nom. abstr.</ab> zu {#aDvara#} 2) {%a%}) <ls>MAITR. S. 3, 6, 10.</ls> 
<LEND>

<L>488<pc>1-286-c<k1>aDvaramaya<k2>aDvaramaya
<althws>{#aDvarama#}</althws>
<hw>{#aDvaramaya#}</hw>, lies <hw>{#aDvarama#}</hw>. 
<LEND>

Now suppose that we remove the althws markup for L=488:

<L>488<pc>1-286-c<k1>aDvaramaya<k2>aDvaramaya
<hw>{#aDvaramaya#}</hw>, lies {#aDvarama#}. 
<LEND>

Then the display for 'aDvarama' would not show the '{#aDvaramaya#}, lies {#aDvarama#}.' text. This omission would be wrong.

Thus, we need to retain the althws markup in L=488. This concludes the argument for althws markup in L=488
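
The argument above can be sketched in Python. This is only an illustration under stated assumptions: it supposes that a display looks a word up through the `<k1>` key plus any words in `<althws>`, which may not match the actual cdsl display code, and `build_index` is a hypothetical helper. The sample records are the three entries quoted above.

```python
import re
from collections import defaultdict

# One record: metadata line, body lines, then <LEND>.
RECORD = re.compile(
    r"<L>(?P<L>[^<]+)<pc>[^<]+<k1>(?P<k1>[^<]+)<k2>[^\n]*\n(?P<body>.*?)<LEND>",
    re.S,
)
ALTHWS = re.compile(r"<althws>\{#(.*?)#\}</althws>")


def build_index(text):
    """Map each lookup key (k1 plus any althws words) to the L-numbers found for it."""
    index = defaultdict(list)
    for rec in RECORD.finditer(text):
        keys = [rec.group("k1")]
        m = ALTHWS.search(rec.group("body"))
        if m:
            keys += [w.strip() for w in m.group(1).split(",")]
        for key in dict.fromkeys(keys):  # drop duplicate keys, keep order
            index[key].append(rec.group("L"))
    return dict(index)


SAMPLE = """<L>487<pc>1-286-c<k1>aDvaratva<k2>aDvaratva/
<hw>{#aDvaratva/#}</hw> <ab>n.</ab> <ab>Nom. abstr.</ab> zu {#aDvara#} 2) {%a%}) <ls>MAITR. S. 3, 6, 10.</ls>
<LEND>

<L>488<pc>1-286-c<k1>aDvaramaya<k2>aDvaramaya
<althws>{#aDvarama#}</althws>
<hw>{#aDvaramaya#}</hw>, lies <hw>{#aDvarama#}</hw>.
<LEND>

<L>10669<pc>7-296-c<k1>aDvaratva<k2>aDvaratva/
<althws>{#aDvarama, aDvaramaya#}</althws>
<hw>{#aDvaratva/#}</hw>, <hw>{#aDvarama#}</hw> und <hw>{#aDvaramaya#}</hw> I. 1.
<LEND>
"""

# Same data with the althws line of L=488 removed.
WITHOUT_488 = SAMPLE.replace("<althws>{#aDvarama#}</althws>\n", "")

with_althws = build_index(SAMPLE)
without_althws = build_index(WITHOUT_488)

print(with_althws["aDvarama"])     # ['488', '10669']
print(without_althws["aDvarama"])  # ['10669']
```

Under these assumptions, a lookup of 'aDvarama' without the althws markup in L=488 still finds the volume 7 entry but no longer finds the volume 1 correction.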

comment on pw I.

We might reasonably expect, from L=10669 text, that all 3 words would appear as headwords in PW(K). But in fact aDvarama does NOT appear. This is because the text {#aDvaramaya#}, lies {#aDvarama#} is (I think) to be read as a correction to PW, where aDvaramaya (old) should be changed to aDvarama (new). So it is understandable that the 'new' word (aDvarama) does not appear in the PW dictionary, since the correction has not been made.

gasyoun commented 2 years ago

Then the display for 'aDvarama' would not show the '{#aDvaramaya#}, lies {#aDvarama#}.' text. This omission would be wrong.

Agree.

{#aDvaramaya#}, lies {#aDvarama#}

This means that there should be no {#aDvaramaya#} at all. I would love to have a full list of such words that stand before the German 'lies' (= 'to be read as').

is (I think) to be read as a correction to PW

Yes.
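
A rough sketch of how such a list might be pulled out with a regular expression. It assumes every correction has the shape `{#old#}, lies {#new#}`, with or without `<hw>` tags, as in L=488; other shapes that may occur in the real data would be missed. The function name `lies_pairs` is made up for this example.

```python
import re

# Matches '{#old#}, lies {#new#}', allowing optional <hw>...</hw> tags around each word.
LIES = re.compile(r"\{#([^#]+)#\}(?:</hw>)?,\s+lies\s+(?:<hw>)?\{#([^#]+)#\}")


def lies_pairs(text):
    """Return (old, new) pairs for every 'X, lies Y' correction found in text."""
    return LIES.findall(text)


tagged = "<hw>{#aDvaramaya#}</hw>, lies <hw>{#aDvarama#}</hw>."
untagged = "{#aDvaramaya#}, lies {#aDvarama#}."

print(lies_pairs(tagged))    # [('aDvaramaya', 'aDvarama')]
print(lies_pairs(untagged))  # [('aDvaramaya', 'aDvarama')]
```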

funderburkjim commented 2 years ago

corrections for special cases

These corrections have been made. These are from the 'comment' items in non-althws entries (Vol. 1-6).txt and a couple of additions by me. non-althws-simple_special.txt

Andhrabharati commented 2 years ago

Can you give the reason [other than "The argument for retaining the althws markup in these cases is that Volume 7 markup also has these marked as headwords (often in althws lists)"-- this point is also debatable, if & when the Part-2 text is handled] why you think these should be retained? Then I would be in a position to clarify my stand.

As I had already mentioned above, the vol.7 index of VN entries has quite many issues to debate upon, and I cannot consider it as a reference, except for the corrections and new entries it has.

And for just about 80 words, there is no point in spending time debating again and again; there are many more areas where work needs to be done.

I would just like to conclude my stand thus (having "seen" and "worked on" some 150-200 dictionaries so far):

  1. The entry words in a general dictionary are limited to the lexical categories of m., f., n., adj., adv. and verb (or root), and an "inflected form" is not elevated to HW level except in some "special" dictionaries.
  2. If several words are grouped together at HW level, they mostly have the above lexical terms in between (if at all) or are separated by "comma, and, or", but not by general words like "see, read, ...", which mostly form the body portion or correction text.
  3. Jim might consider correcting the other 50-odd entries in my original list, and forget about the 80+ words he has listed.
  4. Whatever the stand one adopts, it has to be applied throughout a particular work, if not across all the works. So, such corrections are to be done not just in the VN portion, but in the main text as well.

funderburkjim commented 2 years ago

althws modifications

Have modified the 50 or so althws, excluding those in the '488,...' list. Also several other corrections noted by @Andhrabharati and me.

Work appears in https://github.com/sanskrit-lexicon/PWK/tree/master/pwkvn/step1/althws.

gasyoun commented 2 years ago

Also several other corrections noted by @Andhrabharati and me.

Good to have you both. The strict gatekeeper @funderburkjim and the revolutionary @Andhrabharati