sanskrit-lexicon / csl-orig

Data for all dictionaries of Cologne. Now all corrections are made in this git-based workflow.

Correction backlog, mw_todo items #1643

Closed funderburkjim closed 4 months ago

funderburkjim commented 4 months ago

This issue is for reconsidering the items mentioned in mw_todo.txt, posted at #1639.

funderburkjim commented 4 months ago

incomplete Dhātup. references

Usually, the Dhātup. references (to Westergaard) in mw.txt have the form <ls>Dhātup. xi, 15</ls>. But a few are missing the subsection number (e.g. 15).

mw_todo_dhatup.txt has proposed changes, all of which will be added to print changes. @Andhrabharati There was one I could not find (root JF (back to slp1!)) in section xxvi. Can you find the subsection of Section 26 (xxvi)?

Andhrabharati commented 4 months ago

JF (जॄ) is at xxvi, 23 (झॄष् वयोहानौ) [for cl. 4] & xxxi, 24 (जॄ वयोहानौ, झॄ इत्येके, धॄ इत्यन्ये) [for cl. 9].

xxvi,23 image

xxxi,24 image

PWG (L-28101) has it thus-- {#Jar (JF), JI/ryati#} und {#JfRAti#} = {#jar#} {%altern%} <ls>DHĀTUP. 26,23. 31,24, <ab>v. l.</ab></ls>

And its counterpart jFz is at xxvi, 22 [जॄष् (वयोहानौ)], @funderburkjim !

funderburkjim commented 4 months ago

बृहट्टिक ->बृहट्टिक्क

This is a good detective story -- Ending with a print change in MW as @aumsanskrit originally suggested. I'll install this change (print change) to MW unless there are objections.

https://gist.github.com/funderburkjim/ba0316389ccc22f4718ee58ee9048e0e

Andhrabharati commented 4 months ago

Just thought of "vetting" Jim's opening statement of mw_todo_dhatup.txt

For some reason, MW did not put the subsection numbers in this reference

There are only a few similar: 5 matches for "<ls>Dhātup. [xivl]+</ls>" in buffer: mw.txt

and a quick look into mw.txt data gave these results--

<ls n="Dhātup.">xxviii.</ls>    1
<ls>Dhāt.</ls>  2  ;; not even having section number
<ls>Dhātup.</ls>    213  ;; not even having section number
<ls>Dhātup. i.</ls> 1
<ls>Dhātup. iii.</ls>   1
<ls>Dhātup. xi.</ls>    1
<ls>Dhātup. xiii.</ls>  1
<ls>Dhātup. xiv.</ls>   1
<ls>Dhātup. xv.</ls>    1
<ls>Dhātup. xvi.</ls>   2
<ls>Dhātup. xix.</ls>   1
<ls>Dhātup. xx.</ls>    1
<ls>Dhātup. xxxii.</ls> 1

Jim might like to take some action against these.

---------------------------

P.S. These counts are from my revision file (in which the repetition lines for grouped entries etc. were discarded), so the CDSL file counts could be higher in some places.
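For anyone who wants to reproduce a tally like the one above outside an editor buffer, here is a minimal Python sketch. It assumes the only forms of interest are `<ls>Dhātup.</ls>`, `<ls>Dhātup. xi.</ls>` and `<ls n="Dhātup.">xxviii.</ls>`; the function name `tally_incomplete` is hypothetical, and forms such as `<ls>Dhāt.</ls>` or `<ls n="Dhātup.">iv f.</ls>` are deliberately not covered.

```python
import re
from collections import Counter

# Dhātup. references that carry at most a roman section number and no
# subsection number. Assumption: these two tag shapes are the only ones
# that need counting.
INCOMPLETE = re.compile(
    r'<ls>Dhātup\.(?: [ivxl]+\.?)?</ls>'
    r'|<ls n="Dhātup\.">[ivxl]+\.?</ls>'
)


def tally_incomplete(lines):
    """Count each distinct incomplete Dhātup. reference found in `lines`."""
    counts = Counter()
    for line in lines:
        counts.update(INCOMPLETE.findall(line))
    return counts


# Possible usage (the path to mw.txt is an assumption):
# with open("mw.txt", encoding="utf-8") as f:
#     for ref, n in sorted(tally_incomplete(f).items()):
#         print(ref, n)
```

A complete reference such as `<ls>Dhātup. xi, 15</ls>` is not matched, because the pattern requires the closing tag right after the section number.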

funderburkjim commented 4 months ago

mw_todo_misc1.txt: several items submitted by Scott which Jim thinks have a solution.

mw_todo_insoluble_by_jim.txt: items submitted by Scott for which Jim has found no 'solution'.

funderburkjim commented 4 months ago

mw_todo_dhatup1.txt subsection numbers for the additional cases AB mentions above.

L=86537, tfR Dhātup. i. not found. There is a reference to tfR at Dhātup. xxx, 6 - Should we change to Dhātup. xxx, 6 -?

funderburkjim commented 4 months ago

32327 matches in 32301 lines for "</lex> <ab>N.</ab>"
287 matches for "</lex>, <ab>N.</ab>" in buffer: mw.txt

Small random sample found no ',' after lex tag.
Conclude:
Global change.
"</lex>, <ab>N.</ab>" -> "</lex> <ab>N.</ab>"  287 changes.

Andhrabharati commented 4 months ago

L=86537, tfR Dhātup. i. not found. There is a reference to tfR at Dhātup. xxx, 6 - Should we change to Dhātup. xxx, 6 -?

It is, in fact, '1.' and not 'i.' here; as such, the <ls>Dhātup. i.</ls> is to be changed to <ls>Dhātup.</ls>

and then '1.' to be attached to the next word 'tṛta', i.e.,

<L>86538<pc>453,1<k1>tṛta<k2>tṛta<h>a<e>3 -> 
<s>tṛta</s> <hom>a</hom> ¦ 

to be changed as

<L>86538<pc>453,1<k1>tṛta<k2>tṛta<h>1<e>3
<hom>1.</hom> <s>tṛta</s> ¦

There were a couple of places where this "1. <> i." problem was corrected (in my working copy) and I had somehow missed this one; but it has been identified now (for good)!

Cf. <L>86686 which mentions <hom>1.</hom> and <hom>2.</hom> <s>tṛta</s>; but <hom>2.</hom> <s>tṛta</s> could not be seen at/under <s>tritá</s> (L-88580). How do we "correct" this and get back <hom>2.</hom> <s>tṛta</s> somehow?

funderburkjim commented 4 months ago

37 matches for "</ab> -" in buffer: mw.txt
  most of these to be changed to "</ab>-"

The commit a65c65d shows the changes.

funderburkjim commented 4 months ago

tfta 86686

How about this?

OLD:
<L>86686<pc>453,3<k1>tfta<k2>tfta<h>b<e>1
<hom>1.</hom> and <hom>2.</hom> <s>tfta</s> <hom>b</hom>. ¦ See √ <s>tfR</s> and <s>trita/</s>.
<LEND>

NEW:
<L>86686<pc>453,3<k1>tfta<k2>tfta<h>a<e>1
<s>tfta</s> <hom>a</hom> ¦ [For <hom>1.</hom> of <s>tfta</s>, see √ <s>tfR</s>.]
<LEND>
<L>86686.1<pc>453,3<k1>tfta<k2>tfta<h>2<e>1
<hom>2.</hom> <s>tfta</s>¦ See <s>trita/</s>.
<LEND>

and 86538 as you have written

<L>86538<pc>453,1<k1>tfta<k2>tfta<h>1<e>3
<hom>1.</hom> <s>tfta</s> ¦ <lex>mfn.</lex>, eaten <ab>g.</ab> <s>tanoty-Adi</s>.<info lex="m:f:n"/>
<LEND>

Andhrabharati commented 4 months ago

Now, we have <ls n="Dhātup.">iv f.</ls> at 3 places (in CDSL) and at one grouped entry (in AB).

Should this get an Indo-Arabic number inside?

Andhrabharati commented 4 months ago

tfta 86686

How about this?

Your suggestion is a bit too much of a change, though it still retains the original "sense" as is. Probably we could leave the text "as is" at L-86686 (at least for now!). [However, the <hom>b</hom> is to be deleted in the CDSL text here; the AB version, anyway, doesn't contain it anywhere!]

funderburkjim commented 4 months ago

Re 86686

OK will leave untouched for now -- but I think my suggested recoding is much clearer!

Re traN[kKg]: I think we should remove the 'f.' and put the 'correct' reference, different for each 1. These will be print changes.

<L>87413<pc>457,1<k1>traNk<k2>traNk<e>1
 294902:<s>traNk</s> ¦ <s>°NK</s>, <s>°Ng</s> <ab>cl.</ab> 1. <ab>id.</ab>, <ls n="Dhātup.">iv f.</ls><info verb="root" cp="1"/>
OLD: <ls n="Dhātup.">iv f.</ls>
NEW: <ls n="Dhātup.">iv, 23</ls>  त्रकि

---------------------------
<L>87413.1<pc>457,1<k1>traNK<k2>traNK<e>1
 294905:<s>traNK</s> ¦ <s>°Nk</s>, <s>°Ng</s> <ab>cl.</ab> 1. <ab>id.</ab>, <ls n="Dhātup.">iv f.</ls><info verb="root" cp="1"/>
OLD: <ls n="Dhātup.">iv f.</ls>
NEW: <ls n="Dhātup.">v, 30</ls>  त्रखि

---------------------------
<L>87413.2<pc>457,1<k1>traNg<k2>traNg<e>1
 294908:<s>traNg</s> ¦ <s>°Nk</s>, <s>°NK</s> <ab>cl.</ab> 1. <ab>id.</ab>, <ls n="Dhātup.">iv f.</ls><info verb="root" cp="1"/>
OLD: <ls n="Dhātup.">iv f.</ls>
NEW: <ls n="Dhātup.">v, 42</ls>  त्रगि

Agree?

Andhrabharati commented 4 months ago

Pl. go through this, @funderburkjim -- mw_todo_insoluble (AB response).txt

Andhrabharati commented 4 months ago

OK will leave untouched for now -- but I think my suggested recoding is much clearer!

I do agree that your recoding is clearer, but there are many places where such "correlative" referencing is present in MW & this would be just another one in the lot, i.e.,

<hom>1.</hom> and <hom>2.</hom> <s>tfta</s>. ¦ See √ <s>tfR</s> and <s>trita/</s>. indicating

<hom>1.</hom> <s>tfta</s>. ¦ See √ <s>tfR</s>. and <hom>2.</hom> <s>tfta</s>. ¦ See <s>trita/</s>.

Andhrabharati commented 4 months ago

Re traN[kKg]: I think we should remove the 'f.' and put the 'correct' reference, different for each 1. These will be print changes.

Agree?

Let me take some time for this.

Andhrabharati commented 4 months ago

mw_todo_misc1.txt: several items submitted by Scott which Jim thinks have a solution.

Here is my response to some of the entries that I felt necessary to talk on-- mw_todo_misc1 (AB response).txt

Andhrabharati commented 4 months ago

Case 627: AB: Let me go through the Pravara texts and "try" to resolve the matter by the end-of-the-day.

Tried to locate khārdamāyana & kārdamāyana in the Pravara texts and found both of them interlinked (as variants) [in Gotra Pravara Manjari]--

[three images; the last is captioned kārdamāyana]

As such, no print correction is required here.

Andhrabharati commented 4 months ago

Re traN[kKg]: I think we should remove the 'f.' and put the 'correct' reference, different for each 1. These will be print changes. Agree?

Let me take some time for this.

Yes, these three roots at these resp. places; but could we mark them in the single entry itself?

<L>87413<pc>457,1<k1>traNk<k2>traNk<e>1
<s>traNk</s>, <s>°NK</s>, <s>°Ng</s> ¦ <ab>cl.</ab> 1. <ab>id.</ab>, <ls n="Dhātup.">iv, 23</ls>; <ls n="Dhātup.">v, 30</ls>; <ls n="Dhātup.">v, 42</ls>.<info verb="root" cp="1"/>

I do not wish to disturb the text file structure too much from the print matter; however we can split these grouped entries into individual sub-entries in the xml file, as done in GRA [which course of action has already been talked about].

funderburkjim commented 4 months ago

Several (maybe 150) non-controversial changes in latest commit.

funderburkjim commented 4 months ago

About 40 sanskrit word spelling corrections. Details in the commits above.

funderburkjim commented 4 months ago

minor edits, 1

funderburkjim commented 4 months ago

@aumsanskrit mentioned a preference for 'm. n.' over 'mn.'
Went ahead and made this change (<lex>mn.</lex> -> <lex>m.</lex> <lex>n.</lex>). About 600 changes.

Similarly with <lex>mf.</lex> -> <lex>m.</lex> <lex>f.</lex> About 150 changes.

This brings mw.txt slightly closer to print.
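A global change of this kind could be scripted roughly as follows. This is only a sketch under stated assumptions, not necessarily how the change was actually made: the function name `split_lex` is hypothetical, and it assumes the combined tags always appear exactly as `<lex>mn.</lex>` and `<lex>mf.</lex>`.

```python
# Combined gender tags and their split equivalents, as described in the
# comment above. Note that "<lex>mfn.</lex>" is a different string and is
# left untouched.
REPLACEMENTS = {
    "<lex>mn.</lex>": "<lex>m.</lex> <lex>n.</lex>",
    "<lex>mf.</lex>": "<lex>m.</lex> <lex>f.</lex>",
}


def split_lex(line):
    """Return (new_line, number_of_replacements) for one line of mw.txt."""
    total = 0
    for old, new in REPLACEMENTS.items():
        total += line.count(old)
        line = line.replace(old, new)
    return line, total


# Possible usage (file names are assumptions):
# changed = 0
# with open("mw.txt", encoding="utf-8") as src, \
#         open("mw_new.txt", "w", encoding="utf-8") as dst:
#     for line in src:
#         new_line, n = split_lex(line)
#         changed += n
#         dst.write(new_line)
# print(changed, "replacements")
```

Counting the replacements while rewriting gives a number that can be checked against the editor's match count before the new file is committed.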

funderburkjim commented 4 months ago

two mw variations

These are observations only, no action on mw.txt.

132 matches for "<ls>KāśīKh.</ls>" in buffer: mw.txt. In print, some of these are spelled Kāśīkh (e.g. under hw = satpaTIna); some are spelled KāśīKh (e.g. under hw = QuRQi). mw.txt uses only KāśīKh.

6 matches for "Piper Betle" in buffer: mw.txt (e.g. hw = dAhadA).
9 matches for "Piper Betel" in buffer: mw.txt (e.g. hw = kuhali).
The current scientific name is "Piper Betle", I think. For some purposes, it might be good to change "Betel" to "Betle".

funderburkjim commented 4 months ago

mw_todo_misc1.AB.response.txt

  • PWG's source Aufrecht has the alt. form "saukaraka"
  • Aufrecht to be in error (an extremely rare case), talking about one manuscript's content in the Oxford Library collection, that is the basis for PWG

@Andhrabharati What is the Aufrecht source that you mention?

Andhrabharati commented 4 months ago

Verz. d. Oxf. H. [Remember, you were to do the "linking" of this at PWG and others!]

Andhrabharati commented 4 months ago

BTW, your latest correction at L-81507 metaline

image

is unwarranted, only adding another entity to the k1-k2 differing list!!

aumsanskrit commented 4 months ago

Regarding the Headword "khārdamāyana", please recall that MW dictionary makes the following reference, "cf. kārd°. [ID=61878]".

Andhrabharati has confirmed that the word, "kārdamāyana" exists as a variation, but my understanding is that a variation is NOT the same as "cf. kārd°". Therefore, I am proposing a "print change" as follows:

"cf. kārd°" should be changed to "v.l. kārd°", because MW dictionary does not include the Headword "kārdamāyana".

Andhrabharati commented 4 months ago

Very good point raised, Scott!

There are more such places that have cf. items that are not present in the same dictionary.

Probably we should make it a point to identify (sometime) and "correct" them as appropriate!

Jim, if you are willing, I have a strategy to deal with all such stuff, incl. cf. <s>...</s>; See (or see) <s>...</s>; = <s>...</s>; and of course, the ubiquitous <s>...</s> q.v.
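As a rough illustration of how such dangling cross-references might be located (this is not the strategy referred to above, which is not spelled out in this thread), here is a small Python sketch. It rests on several assumptions: that metalines have the shape `<L>...<pc>...<k1>...<k2>...`, that cross-references are coded as `<ab>cf.</ab> <s>TARGET</s>`, and that stripping accent marks and hyphens from the target is enough to compare it with k1. The name `dangling_cf` is hypothetical. Abbreviated targets containing `°` (such as kārd°) would need to be expanded against the headword first, which the sketch does not attempt.

```python
import re

# Assumed metaline shape: <L>86538<pc>453,1<k1>tfta<k2>tfta<h>1<e>3
META = re.compile(r'^<L>([^<]*)<pc>[^<]*<k1>([^<]*)<k2>')
# Assumed coding of a cross-reference in the entry body.
CF = re.compile(r'<ab>cf\.</ab> <s>([^<]*)</s>')


def normalize(target):
    """Drop accent marks (/ \\ ^), hyphens and the em dash (U+2014)."""
    return re.sub(r'[/\\^\-\u2014]', '', target)


def dangling_cf(lines):
    """Return (L-number, target) pairs whose cf. target is not a headword."""
    lines = list(lines)
    headwords = {m.group(2) for m in map(META.match, lines) if m}
    current = None
    missing = []
    for line in lines:
        m = META.match(line)
        if m:
            current = m.group(1)
            continue
        for target in CF.findall(line):
            if '°' in target:
                continue  # abbreviated form; needs expansion, skipped here
            if normalize(target) not in headwords:
                missing.append((current, target))
    return missing
```

The same loop could be pointed at "See", "=" and "q.v." references by swapping in a different pattern for `CF`, again assuming their coding in the text is regular.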

Andhrabharati commented 4 months ago

because MW dictionary does not include

Scott, I would like to bring to your notice my response against Case 142 in https://github.com/sanskrit-lexicon/csl-orig/issues/1643#issuecomment-2167102546; wherein (with the proposed change) the <i>testudo</i> remains under cf., but would not be "traceable" elsewhere within MW.

So the correction won't always be changing cf. to v.l., but could be correcting the typo (or print) error, or changing the markup [which renders the text as per print, but may not be present elsewhere within the dictionary].

aumsanskrit commented 4 months ago

Thank you, Andhrabharati. I understand your point. One question remains regarding "cf. testudo". How do we mark this word "testudo" so that it is understood to be a classical Latin word and not a Sanskrit word. Of course, I was initially thinking this was a reference to a Sanskrit word "testudo" which of course does not exist. There should be a way to clarify such references when they are NOT to a Sanskrit word.

Andhrabharati commented 4 months ago

One option is to mark it as a "zoo(logical)" entity, which would be shown in a different style than the rest of the surrounding text.

But, Jim needs to accept doing so!

Andhrabharati commented 4 months ago

Case 642: ṭoṭa

I went through various lexicons for this again.

PWG & pwk just mention that ṭoṭa & ṭoṭī belong to the gaurādi gaṇa, and no meaning has been given.

However VCP & SKD have the words with the meanings alpa & hīna, while still citing them as belonging to the gaurādi gaṇa. They also give ḍoḍa and ḍoḍī, as the names of some plants. [Same is the case, in PWG & its followers.] [Note: ḍoḍa is not explicitly present as a HW, but is equated with kṣupaḍoḍamuṣṭi.] (This is especially for Scott.)

It is quite probable that they could've been variant names for ṭoṭa & ṭoṭī, though no explicit proof for the same could be got (in the little time that I had spent now).

So we could probably mark the body portion as v. l. for ḍoḍ° and close the case.

Are you happy now, Scott?

aumsanskrit commented 4 months ago

Thank you, Andhrabharati. Yes, I am happy now.

By the way, you have written as follows — "Note: ḍoḍa is not explicitly present as a HW, but is equated with kṣupaḍoḍamuṣṭi."

Please see the image below showing the Headword डोड in MW dictionary.

[image upload did not complete: DoDa.jpg]

aumsanskrit commented 4 months ago

Trying to upload डोड again (but of course you can just look it up in MW dictionary):

[image: DoDa]

Andhrabharati commented 4 months ago

What I meant was, it is not in MW as a plant type!

aumsanskrit commented 4 months ago

I understand.

By the way, while we are waiting for Jim to make some comments, I have one question that I am certain you can answer:

Is there any difference between the two following diacritical markings for notating the anusvāra?

1) ṁ

2) ṃ

Or are they absolutely identical and interchangeable in all situations?

Andhrabharati commented 4 months ago

Good question indeed!

The simple answer is "these two are NOT always interchangeable".

ISO ṁ and IAST ṃ are certainly interchangeable, as both denote the devanagari anusvāra. (It is, however, a standard norm to use a single transliteration scheme, be it ISO or IAST, consistently throughout a work; mixing the two within a work without explicit demarcation would lead to unnecessary confusion.) But there are many cases where the devanagari ardhānusvāra is transliterated as ṁ in Roman letters in earlier works, i.e. before IAST came into vogue in 1895 [after the Geneva Congress in 1894].

So, it all depends on which text (print) we are referring to!
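At the character level the two are distinct Unicode code points, which matters when searching or comparing digitized text. The Python sketch below shows the difference and one way to fold ISO ṁ into IAST ṃ; the function names are hypothetical, and the folding is only appropriate when the source is known to use ṁ for the anusvāra (not for the ardhānusvāra, as in the earlier works mentioned above).

```python
import unicodedata

ISO_ANUSVARA = "\u1e41"   # m with dot above, as used in ISO 15919
IAST_ANUSVARA = "\u1e43"  # m with dot below, as used in IAST


def describe(ch):
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def iso_to_iast_anusvara(text):
    """Replace dot-above m with dot-below m, in lower and upper case.

    NFC normalization first turns 'm' + combining dot above (U+0307)
    into the precomposed letter, so both encodings are caught.
    """
    text = unicodedata.normalize("NFC", text)
    return text.replace("\u1e41", "\u1e43").replace("\u1e40", "\u1e42")


print(describe(ISO_ANUSVARA))
print(describe(IAST_ANUSVARA))
```

A plain string comparison or regex search treats the two letters as different, so a text that mixes both schemes will produce missed matches unless it is normalized one way or the other.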

aumsanskrit commented 4 months ago

Thank you, Andhrabharati. I had been wondering about this for a few years, but too busy to investigate the answer.

funderburkjim commented 4 months ago

QuRQi—rAjoa -> QuRQi—rAja my typo corrected.

<s>testudo</s> -> <bio>testudo</bio> Hurray for turtles!

<ab>v.l.</ab> for <s>dow°</s> -> <ab>v.l.</ab> for <s>qoq°</s> (under 81295, 81295.1 ) print chg.

funderburkjim commented 4 months ago

My next task is to resolve the k1-k2 mismatches (ref: https://github.com/sanskrit-lexicon/csl-orig/issues/1638#issuecomment-2155866315).

I think this will finish these various backlog issues. However, there have been many discussions. @aumsanskrit and @Andhrabharati - please mention here any other open questions that have solutions which I have not yet implemented.
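For reference, a check for k1-k2 mismatches might look something like the Python sketch below. What counts as a mismatch is defined in the linked issue, not here, so the normalization is an assumption: k2 is taken to equal k1 once accent marks (`/`, `\`, `^`), hyphens and the em dash (U+2014) are removed. The name `k1_k2_mismatches` is hypothetical.

```python
import re

# Assumed metaline shape: <L>86686<pc>453,3<k1>tfta<k2>tfta<h>b<e>1
META = re.compile(r'^<L>([^<]+)<pc>[^<]+<k1>([^<]+)<k2>([^<]+)')


def strip_marks(k2):
    """Remove accents, hyphens and the em dash (assumed to be the only
    legitimate differences between k2 and k1)."""
    return re.sub(r'[/\\^\-\u2014]', '', k2)


def k1_k2_mismatches(lines):
    """Return (L-number, k1, k2) for metalines where k2 does not reduce to k1."""
    found = []
    for line in lines:
        m = META.match(line)
        if not m:
            continue  # body lines and <LEND> are ignored
        lnum, k1, k2 = m.groups()
        if strip_marks(k2) != k1:
            found.append((lnum, k1, k2))
    return found
```

Running this before and after a batch of metaline corrections would show whether an edit has added to the mismatch list or reduced it.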

Andhrabharati commented 4 months ago

No pending items in this issue, @funderburkjim !

However, I would just like to remind you of this task.

aumsanskrit commented 4 months ago

I did mention as follows, regarding the Headword “Khārdamāyana [ID=61878]”.

Andhrabharati has confirmed that the word, "kārdamāyana" exists as a variation, but my understanding is that a variation is NOT the same as "cf. kārd°". Therefore, I am proposing a "print change" as follows:

"cf. kārd°" should be changed to "v.l. kārd°", because MW dictionary does not include the Headword "kārdamāyana".

funderburkjim commented 4 months ago

made the change re Khārdamāyana.

Andhrabharati commented 4 months ago

@funderburkjim

After finishing the linking of Verz. d. Oxf. H. at PWG (and pwk), you can close this issue.