funderburkjim closed 2 years ago
In working with non-althws entries (Vol. 1-6).txt, I found there to be 133 entries to examine. (Note: This does not include the items with additional comments in that file).
When reviewing these, I unexpectedly found over half where I disagree with the removal of althws.
Here's the list of L-numbers where I think the althws markup should be retained.
488, 882, 1086, 1180, 1539, 1629, 1683, 2008, 2064,
2249, 2810, 2883, 2885, 2893, 2924, 2991, 3129, 3280, 3365,
3446, 3511, 3538, 3542, 3551, 3615, 3813, 3883, 4034, 4094,
4098, 4345, 4366, 4595, 4826, 4839, 4867, 4927, 4946, 5504,
5646, 5828, 5888, 6102, 6244, 6251, 6255, 6270, 6376, 6425,
6441, 6691, 6746, 6747, 6759, 6814, 7024, 7052, 7071, 7110,
7164, 7168, 7178, 7287, 7402, 7438, 7594, 7713, 7717, 7749,
7950, 8156, 8220, 8247, 8347, 8376, 8437, 8508, 8626, 8675,
8757, 8776, 8971, 9027, 9139, 9321, 9381, 9387, 9402,
The argument for retaining the althws markup in these cases is that Volume 7 markup also has these marked as headwords (often in althws lists) with reference to vol 1-6 pwkvn (Arabic numeral reference).
@Andhrabharati Please re-examine these and see if you concur.
Cases L=7168, 8508, 8776, 16168, and 17989 are slightly different -- let's put them aside until we get the big list above agreed upon.
My guiding principle is "taking the pwk main data as the reference to decide the alt. HWs" https://github.com/sanskrit-lexicon/PWK/issues/86#issuecomment-1120174602
Can you give the reason [other than "The argument for retaining the althws markup in these cases is that Volume 7 markup also has these marked as headwords (often in althws lists)"-- this point is also debatable, if & when the Part-2 text is handled] why you think these should be retained? Then I would be in a position to clarify my stand.
As I looked at random, many in the list you gave contain either the corrections which cannot be treated as alt. HWs, or "Nom." (or "Nom. abstr.") words, which (over 4000 of them) lie 'hidden' in the body portion of the main text.
"Nom." (or "Nom. abstr.") words, which (over 4000 of them) lie 'hidden' in the body portion of the main text.
So to be extracted or no? Should not we make them alternative headwords now?
My guiding principle is "taking the pwk main data as the reference to decide the alt. HWs"
As mentioned many times thus far, I look for consistency across the work/text, so unless the main text has all these "taken out" as new entries, I can't suggest doing it in the VN portion.
More important is to have the compound words inside the body "brought out" first, in both PWG and pwk.
Also the words with lexical change (gender or adv./adj. category) wrt the HW entry that are inside the body portion need to be separated out, in addition to the compound words.
But, as all this involves changing or introducing new L-numbers, this exercise looks to be beyond the acceptance by cdsl team.
Addition of L-nums is a settled activity. We do not change them much. Alternate HW or compounds can be 125.1, 125.2 etc.
compound words inside the body "brought out" first, in both PWG and pwk.
A long wanted one - list of samasas.
just a list? it can be easily got.
what I am saying is to make them all separated from the body portion as new entry words, as in MW.
I will try to explain the reason for retaining the althws markup in the list (488, 882, ...) above by looking at the first example, L=488. I think the reason for the other cases in the list will be similar.
Let's start by looking at the volume 7 entry whose text (on page 7-296-c) is (with no markup):
{#aDvaratva/#}, {#aDvarama#} und {#aDvaramaya#} I. 1.
This entry is clearly a list of three headwords. The text 'I. 1.' is interpreted to mean that these three words occur as headwords in
I. volume one of pw(k)
1. in the VN material of volume one of pw(k).
If we do a lookup for any one of these words in one of the combined pwkvn displays, then we should see
{#aDvaratva/#}, {#aDvarama#} und {#aDvaramaya#} I. 1.
in the 'pwkvn' panel of the display. Thus, the markup in pwkvn.txt for this entry is set as
<L>10669<pc>7-296-c<k1>aDvaratva<k2>aDvaratva/
<althws>{#aDvarama, aDvaramaya#}</althws>
<hw>{#aDvaratva/#}</hw>, <hw>{#aDvarama#}</hw> und <hw>{#aDvaramaya#}</hw> I. 1.
<LEND>
Now we also expect to see volume1 pwkvn entries in the display of any of these three words. It so happens in this case that these 3 words are spread over two pwkvn entries in volume 1:
<L>487<pc>1-286-c<k1>aDvaratva<k2>aDvaratva/
<hw>{#aDvaratva/#}</hw> <ab>n.</ab> <ab>Nom. abstr.</ab> zu {#aDvara#} 2) {%a%}) <ls>MAITR. S. 3, 6, 10.</ls>
<LEND>
<L>488<pc>1-286-c<k1>aDvaramaya<k2>aDvaramaya
<althws>{#aDvarama#}</althws>
<hw>{#aDvaramaya#}</hw>, lies <hw>{#aDvarama#}</hw>.
<LEND>
Now suppose that we remove the althws markup for L=488:
<L>488<pc>1-286-c<k1>aDvaramaya<k2>aDvaramaya
<hw>{#aDvaramaya#}</hw>, lies {#aDvarama#}.
<LEND>
Then the display for 'aDvarama' would not show the '{#aDvaramaya#}, lies {#aDvarama#}.' text. This omission would be wrong.
Thus, we need to retain the althws markup in L=488. This concludes the argument for althws markup in L=488.
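The lookup argument above can be sketched in a few lines of Python. This is only an illustration, not the actual display code: it assumes that a display indexes each pwkvn entry under its k1 key plus every word listed in its althws line, and the names `parse_entries` and `build_index` are hypothetical. It also assumes the althws words carry no accent marks, as in the examples above.

```python
import re
from collections import defaultdict

# One record: metaline, body lines, then <LEND> (layout taken from the examples above).
ENTRY_RE = re.compile(
    r"<L>(?P<L>[^<]+)<pc>(?P<pc>[^<]+)<k1>(?P<k1>[^<]+)<k2>(?P<k2>[^\n]*)\n"
    r"(?P<body>.*?)\n<LEND>",
    re.S,
)
ALTHWS_RE = re.compile(r"<althws>\{#(.*?)#\}</althws>")


def parse_entries(text):
    """Yield (L, k1, althws words) for each <L>...<LEND> record."""
    for m in ENTRY_RE.finditer(text):
        alt = ALTHWS_RE.search(m.group("body"))
        alts = [w.strip() for w in alt.group(1).split(",")] if alt else []
        yield m.group("L"), m.group("k1"), alts


def build_index(text):
    """Map each lookup key (k1 or althws word) to the L-numbers shown for it."""
    index = defaultdict(list)
    for L, k1, alts in parse_entries(text):
        for key in [k1] + alts:
            index[key].append(L)
    return dict(index)


WITH_ALTHWS = """<L>487<pc>1-286-c<k1>aDvaratva<k2>aDvaratva/
<hw>{#aDvaratva/#}</hw> <ab>n.</ab> <ab>Nom. abstr.</ab> zu {#aDvara#} 2) {%a%}) <ls>MAITR. S. 3, 6, 10.</ls>
<LEND>
<L>488<pc>1-286-c<k1>aDvaramaya<k2>aDvaramaya
<althws>{#aDvarama#}</althws>
<hw>{#aDvaramaya#}</hw>, lies <hw>{#aDvarama#}</hw>.
<LEND>
<L>10669<pc>7-296-c<k1>aDvaratva<k2>aDvaratva/
<althws>{#aDvarama, aDvaramaya#}</althws>
<hw>{#aDvaratva/#}</hw>, <hw>{#aDvarama#}</hw> und <hw>{#aDvaramaya#}</hw> I. 1.
<LEND>
"""

# Same data with the althws line of L=488 removed (the body text is not used by the index).
WITHOUT_ALTHWS = WITH_ALTHWS.replace("<althws>{#aDvarama#}</althws>\n", "")

print(build_index(WITH_ALTHWS)["aDvarama"])     # L=488 and L=10669 are both listed
print(build_index(WITHOUT_ALTHWS)["aDvarama"])  # only L=10669 remains
```

Under these assumptions, dropping the althws line from L=488 is exactly what makes the volume 1 correction disappear from the display for 'aDvarama'.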
We might reasonably expect, from L=10669 text, that all 3 words would appear as headwords in PW(K).
But in fact aDvarama does NOT appear. The text {#aDvaramaya#}, lies {#aDvarama#} is (I think) to be read as a correction to PW, where aDvaramaya (old) should be changed to aDvarama (new). So it is understandable that the 'new' word (aDvarama) does not appear in the PW dictionary, since the correction has not been made.
Then the display for 'aDvarama' would not show the '{#aDvaramaya#}, lies {#aDvarama#}.' text. This omission would be wrong.
Agree.
{#aDvaramaya#}, lies {#aDvarama#}
This means that there should be no {#aDvaramaya#} at all. I would love to have a full list of such words that come before the German 'lies' (= 'to be read as').
is (I think) to be read as a correction to PW
Yes.
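A list of the 'X, lies Y' corrections asked for above could be gathered with a simple pattern search. The sketch below is an assumption-laden illustration: it only catches the exact shape "{#old#}, lies {#new#}" (with or without hw tags around either word), and the function name `find_lies_corrections` is hypothetical. Other phrasings of corrections in pwkvn.txt would be missed.

```python
import re

# Matches "{#old#}, lies {#new#}", allowing optional <hw>...</hw> tags around either word.
LIES_RE = re.compile(r"\{#([^#]*)#\}(?:</hw>)?, lies (?:<hw>)?\{#([^#]*)#\}")


def find_lies_corrections(text):
    """Return (old, new) pairs for every 'old, lies new' correction found in text."""
    return LIES_RE.findall(text)


marked = "<hw>{#aDvaramaya#}</hw>, lies <hw>{#aDvarama#}</hw>."
unmarked = "<hw>{#aDvaramaya#}</hw>, lies {#aDvarama#}."
no_correction = "<hw>{#aDvaratva/#}</hw>, <hw>{#aDvarama#}</hw> und <hw>{#aDvaramaya#}</hw> I. 1."

print(find_lies_corrections(marked))
print(find_lies_corrections(unmarked))
print(find_lies_corrections(no_correction))
```

Running the function over the whole of pwkvn.txt would give a first candidate list, which would still need manual review.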
These corrections have been made. These are from the 'comment' items in non-althws entries (Vol. 1-6).txt and a couple of additions by me. non-althws-simple_special.txt
Can you give the reason [other than "The argument for retaining the althws markup in these cases is that Volume 7 markup also has these marked as headwords (often in althws lists)"-- this point is also debatable, if & when the Part-2 text is handled] why you think these should be retained? Then I would be in a position to clarify my stand.
As I had already mentioned above, the vol.7 index of VN entries has quite many issues to debate upon, and I cannot consider it as a reference, except for the corrections and new entries it has.
And just for about 80 words, there is no point spending time and debating time and again; there are many more areas where work needs to be done.
I would just like to conclude my stand thus (having "seen" and "worked" on over 150-200 dictionaries so far)--
Have modified the 50 or so althws, excluding those in the '488,...' list. Also several other corrections noted by @Andhrabharati and me.
Work appears in https://github.com/sanskrit-lexicon/PWK/tree/master/pwkvn/step1/althws.
Also several other corrections noted by @Andhrabharati and me.
Good to have you both. The strict gatekeeper @funderburkjim and the revolutionary @Andhrabharati
This issue began in #86. It seems to be a bit more complicated than expected, so let's devote further discussion of althws editing in pwkvn to this new issue.