sanskrit-lexicon / MWS

Monier Monier-Williams, Sir; A Sanskrit-English dictionary. Oxford, 1899

RV references without the RV -- AND general <ls> cleanup #134

Closed funderburkjim closed 2 years ago

funderburkjim commented 2 years ago

Under https://github.com/sanskrit-lexicon/PWG/issues/23, a comment noticed an unmarked RV reference in MW:

and <ab>Prec.</ab> <s>cyozIQvam</s>, <ls>Pāṇ. viii, 3, 78</ls>; <ls>Kāś.</ls>) to move to and fro, shake about, 
<ls>RV. i, 167, 8</ls>;
<div n="to"/>to stir, move from one's place, go away, retire from (<ab>abl.</ab>), 
turn off; 
<ls>vi, 62, 7</ls>;   <<< ALSO RV. NOT MARKED
<div n="vp"/> <ls>x</ls>;   <<< NO IDEA what this 'x' is referring to!
<ls>BhP. ix, 14, 20</ls>;

Once noticed, we can change the markup: <ls n="RV.">vi, 62, 7</ls>.

But these are hard to find. I may have some software used in PW or PWG that would be applicable; I will look for it.
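A minimal sketch of how such unmarked references might be located and marked, not the software mentioned above. It assumes that every reference is a plain `<ls>...</ls>` element, that a "bare" reference starts with a lowercase roman numeral followed by a comma, and that the nearest preceding `<ls>` with an abbreviation is the intended source. The function name `mark_bare_refs` and the three regexes are hypothetical.

```python
import re

# One <ls>...</ls> element (assumes no attributes and no nesting).
LS_RE = re.compile(r'<ls>([^<]*)</ls>')
# A bare reference: lowercase roman numeral then a comma, e.g. "vi, 62, 7".
# A lone "x" is deliberately NOT matched, since its meaning is unclear.
BARE_RE = re.compile(r'^[ivxlc]+,')
# A leading abbreviation ending in a period, e.g. "RV." or "BhP.".
ABBREV_RE = re.compile(r'^([^\s,;]+\.)')

def mark_bare_refs(text):
    """Add n="ABBREV" to bare <ls> elements, using the abbreviation
    of the nearest preceding <ls> (an assumption, see the note below)."""
    prev = None

    def repl(m):
        nonlocal prev
        body = m.group(1)
        if BARE_RE.match(body):
            if prev is None:
                return m.group(0)  # nothing to inherit from; leave as-is
            return '<ls n="%s">%s</ls>' % (prev, body)
        a = ABBREV_RE.match(body)
        if a and a.group(1) != 'ib.':
            prev = a.group(1)  # remember the latest real abbreviation
        return m.group(0)

    return LS_RE.sub(repl, text)

sample = ('<ls>RV. i, 167, 8</ls>; turn off; <ls>vi, 62, 7</ls>; '
          '<ls>x</ls>; <ls>BhP. ix, 14, 20</ls>;')
print(mark_bare_refs(sample))
```

The sketch should be run one entry at a time so that an abbreviation is never inherited across entry boundaries, and its output still needs manual review: the nearest preceding abbreviation is not always the right one.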

gasyoun commented 2 years ago

I may have some software used in PW or PWG that would be applicable; Will look for it.

Hard, but not impossible for the regex magician Jim, thanks!

Andhrabharati commented 2 years ago

He has already listed them in the lsextract file as two types:

00239 NUMBER number (ls starts with number)

I find 240 of such in 223 lines!

12809 UNKNOWN unknown (ls is unknown)

I find 10356 of <ls>ib.; 766 of <ls>i[^b]; 65 of <ls>; 75 of <ls>,; 23 of <ls>;; and 1119 of <ls>[a-hj-z]. A difference of 400 (wrt Jim's count) is yet to be traced.
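Counts like these could be reproduced with a short tally script. The following is only a sketch: the pattern table is a hypothetical transcription of the search strings quoted above (with the period in `ib.` escaped so it matches literally), and the input is assumed to be an iterable of text lines from the dictionary file.

```python
import re
from collections import Counter

# Hypothetical transcription of the search strings quoted in this comment.
PATTERNS = {
    'ib.': re.compile(r'<ls>ib\.'),
    'i[^b]': re.compile(r'<ls>i[^b]'),
    ',': re.compile(r'<ls>,'),
    ';': re.compile(r'<ls>;'),
    '[a-hj-z]': re.compile(r'<ls>[a-hj-z]'),
}

def count_ls_patterns(lines):
    """Return the total number of matches of each pattern over all lines."""
    counts = Counter()
    for line in lines:
        for name, pat in PATTERNS.items():
            counts[name] += len(pat.findall(line))
    return counts

sample = ['<ls>ib.</ls>; <ls>iv, 3</ls>', '<ls>x</ls> <ls>,</ls>']
print(count_ls_patterns(sample))
```

Comparing the per-pattern totals with the lsextract figure is one way the remaining difference might be traced.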

  1. Along with the 240 of IA numbered items above, the 27 of <ls>c, 41 of <ls>l, ~500 of <ls>x, ~500 of <ls>v and 766 of <ls>i could be clubbed with the prev. ls-entity.
  2. All the 10356 <ls>ib. entries could be equated to the prev. ls-entity.

funderburkjim commented 2 years ago

This phase of the cleanup of MW ls references is complete. About 2800 lines changed. This is the csl-orig commit.

For details, see the issue134 directory.

Changes in two parts:

There are still many improvements to be made. For instance

876 matches in 874 lines for "<ls[^<]* and"
Example: <ls>Mn. ix, 49 and 51.</ls>
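A "matches in lines" figure of this kind can be obtained with a few lines of Python. This is a sketch only; the function name `count_matches` is hypothetical, and the input is assumed to be an iterable of lines from the dictionary file.

```python
import re

# The regex quoted above: an <ls> whose text contains " and".
AND_RE = re.compile(r'<ls[^<]* and')

def count_matches(lines, pattern=AND_RE):
    """Return (total matches, number of lines with at least one match)."""
    n_matches = 0
    n_lines = 0
    for line in lines:
        hits = pattern.findall(line)
        if hits:
            n_matches += len(hits)
            n_lines += 1
    return n_matches, n_lines

sample = [
    '<ls>Mn. ix, 49 and 51.</ls>',
    'x <ls>RV. i, 1 and 2</ls> y <ls>AV. v, 2 and 3</ls>',
    '<ls>BhP.</ls> and more',  # "and" is outside the <ls>, so no match
]
print(count_matches(sample))
```

Because `[^<]*` cannot cross a tag boundary, an "and" that follows the closing `</ls>` is not counted.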

Further work on such cleanup will be documented in another issue.

Andhrabharati commented 2 years ago

@funderburkjim

See abnormal_noref_RV.txt for cases that remain unresolved.

Under <L>62946<pc>344,3<k1>gadgadita <ls>PāṇŚ.</ls> (<ls>RV.</ls>), <ls>35.</ls> is the <ls>PāṇŚ. 35.</ls> in its <ls>RV.</ls> recension; it may be noted that the PāṇinīyaŚikṣā exists in many recensions!


The same applies to <L>65326<pc>356,1<k1>gItin, <L>70575<pc>381,3<k1>cakrAhva and <L>91127<pc>473,1<k1>dazwa cases as well.

gasyoun commented 2 years ago

No attempt yet to deal with all the ib.. There are pitfalls to programmatic identification of the appropriate 'preceding' ls abbreviation.

@Andhrabharati have you ever tried to?

Andhrabharati commented 2 years ago

My works are not "usable" (being in a diff. format altogether)!

funderburkjim commented 2 years ago

Possible markup change:

OLD
<ls>PāṇŚ.</ls> (<ls>RV.</ls>), <ls>35.</ls>
NEW
<ls>PāṇŚ. (RV.) 35.</ls>

And new literary source
PāṇŚ. (RV.)    == PāṇinīyaŚikṣā, Rg Veda recension [Cologne addition]

Possibly useful for a link: https://linguindic.com/data/browse/texts/16/

Only about 7 instances.
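With so few instances the OLD to NEW change could be made by hand, but a regex substitution along these lines is one way to apply it consistently. This is a sketch under assumptions: the exact spacing and punctuation of the OLD form are taken from the example above, the trailing number is assumed to be plain digits, and the name `merge_pans_rv` is hypothetical.

```python
import re

# OLD form: three separate <ls> elements, with the number in the last one.
OLD_RE = re.compile(
    r'<ls>PāṇŚ\.</ls> \(<ls>RV\.</ls>\), <ls>(\d+)\.</ls>'
)

def merge_pans_rv(text):
    """Rewrite the three-element OLD form as the single-element NEW form."""
    return OLD_RE.sub(r'<ls>PāṇŚ. (RV.) \1.</ls>', text)

old = '<ls>PāṇŚ.</ls> (<ls>RV.</ls>), <ls>35.</ls>'
print(merge_pans_rv(old))
```

If the file mixes precomposed and combining forms of the diacritics, the text may need Unicode normalization before the pattern will match every instance.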