Closed: funderburkjim closed this issue 2 years ago
Hard, but not impossible for the regex magician Jim, thanks!
He has already listed them in the lsextract file as two types:
00239 NUMBER number (ls starts with number)
I find 240 of such in 223 lines!
12809 UNKNOWN unknown (ls is unknown)
I find:
10356 of <ls>ib.
766 of <ls>i[^b]
65 of <ls>
75 of <ls>,
23 of <ls>;
1119 of <ls>[a-hj-z]
A difference of about 400 (Jim's count of 12809 against the 12404 tallied here) is yet to be traced.
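A minimal sketch of how such a tally could be reproduced, assuming the dictionary text is available as a list of lines. The function name `count_ls_openers` and the sample lines are made up for illustration; the patterns are the ones quoted above, except the bare `<ls>` case, which is left out because the character following it is not clear from the thread.

```python
import re
from collections import Counter

# Patterns as quoted in the comment above. The categories are mutually
# exclusive, so a single <ls> opener is never counted twice.
PATTERNS = {
    "<ls>ib.": r"<ls>ib\.",
    "<ls>i[^b]": r"<ls>i[^b]",
    "<ls>,": r"<ls>,",
    "<ls>;": r"<ls>;",
    "<ls>[a-hj-z]": r"<ls>[a-hj-z]",
}


def count_ls_openers(lines):
    """Count how many <ls> elements start with each kind of text."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            counts[label] += len(re.findall(pattern, line))
    return counts


# Hypothetical sample lines, not taken from mw.txt.
sample = [
    "<ls>RV.</ls> text <ls>ib.</ls> more <ls>iv, 2</ls>",
    "<ls>x, 14</ls>; <ls>, 3</ls>",
]
counts = count_ls_openers(sample)

# Totals reported in the thread, compared with the lsextract UNKNOWN count.
reported = [10356, 766, 65, 75, 23, 1119]
gap = 12809 - sum(reported)
print(counts, gap)
```

The last two lines only redo the arithmetic from the comment: the six reported counts are subtracted from the 12809 UNKNOWN entries to get the untraced remainder.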
<ls>c, 41 of <ls>l, ~500 of <ls>x, ~500 of <ls>v and 766 of <ls>i could be clubbed with the previous ls-entity. The <ls>ib. entries could be equated to the previous ls-entity.
This phase of cleanup of MW ls references is completed. About 2800 lines changed. This is the csl-orig commit.
For details, see the issue134 directory.
Changes in two parts:
<ls n="RV.">vi, 62, 7</ls>; type -- this is the 'cyu' example mentioned in the opening comment of this issue.
<ls>X where X is not a literary source reference abbreviation. 2500+ lines changed. The sequence of changes is listed in the readme.txt and change_2.txt files.
No attempt yet to deal with all the <ls>ib.</ls>. There are pitfalls to programmatic identification of the appropriate 'preceding' ls abbreviation.
<ls n="Unknown">X</ls>.
There are still many improvements to be made. For instance
876 matches in 874 lines for "<ls[^<]* and"
Example: <ls>Mn. ix, 49 and 51.</ls>
Further work on such cleanup will be documented in another issue.
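For anyone wanting to reproduce the "matches in lines" figures, here is a small sketch using the same regex. The function name `count_and_refs` and the sample lines (apart from the Mn. example quoted above) are invented for illustration.

```python
import re

# The regex quoted in the comment above: an <ls ...> element whose text
# contains " and" before the next "<".
AND_IN_LS = re.compile(r"<ls[^<]* and")


def count_and_refs(lines):
    """Return (total matches, number of lines with at least one match)."""
    total_matches = 0
    lines_hit = 0
    for line in lines:
        n = len(AND_IN_LS.findall(line))
        total_matches += n
        if n:
            lines_hit += 1
    return total_matches, lines_hit


sample = [
    "<ls>Mn. ix, 49 and 51.</ls>",
    "<ls>RV. i, 1 and 2</ls> foo <ls>AV. ii, 3 and 4</ls>",
    "<ls>MBh.</ls> and others",  # 'and' is outside the element: no match
]
result = count_and_refs(sample)
print(result)
```

Because `[^<]*` cannot cross the closing `</ls>`, an " and" that follows an element is not counted, which is why the match count can exceed the line count only when one line holds several such references.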
@funderburkjim
see abnormal_noref_RV.txt for cases that remain unresolved.
Under <L>62946<pc>344,3<k1>gadgadita, <ls>PāṇŚ.</ls> (<ls>RV.</ls>), <ls>35.</ls> is the <ls>PāṇŚ. 35.</ls> in its <ls>RV.</ls> recension; it may be noted that the PāṇinīyaŚikṣā exists in many recensions!
The same applies to the <L>65326<pc>356,1<k1>gItin, <L>70575<pc>381,3<k1>cakrAhva and <L>91127<pc>473,1<k1>dazwa cases as well.
No attempt yet to deal with all the <ls>ib.</ls>. There are pitfalls to programmatic identification of the appropriate 'preceding' ls abbreviation.
@Andhrabharati have you ever tried to?
My works are not "usable" (being in a different format altogether)!
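To make the ib. question concrete, here is a naive sketch of "equate ib. to the previous ls-entity", limited to a single line. This is not code from the issue134 directory. The function name `resolve_ib`, the rule for what counts as an abbreviation (first token ending in a period and not starting with a digit) and the sample line are all assumptions.

```python
import re

# An <ls> element with an optional n="..." attribute and no nested markup.
LS = re.compile(r'<ls(?: n="([^"]*)")?>([^<]*)</ls>')
# Assumed rule: the abbreviation is the first whitespace-free token that
# ends in a period and does not start with a digit (so "35." is skipped).
ABBREV = re.compile(r"^[^\d\s]\S*\.")


def resolve_ib(line):
    """Pair each 'ib.' reference with the preceding abbreviation, if any."""
    previous = None
    resolved = []
    for n_attr, text in LS.findall(line):
        if text.startswith("ib."):
            resolved.append((text, previous))
            continue
        if n_attr:
            previous = n_attr
        else:
            m = ABBREV.match(text)
            if m:
                previous = m.group(0)
    return resolved


sample = ('<ls>Mn. ix, 49</ls> foo <ls>ib. 51</ls> bar '
          '<ls n="RV.">vi, 62, 7</ls> <ls>ib.</ls>')
pairs = resolve_ib(sample)
print(pairs)
```

An ib. with no earlier reference on the same line comes back paired with `None`. That is one of the pitfalls mentioned above: the real antecedent may sit in an earlier line or entry, and this sketch makes no attempt to find it.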
Possible markup change:
OLD
<ls>PāṇŚ.</ls> (<ls>RV.</ls>), <ls>35.</ls>
NEW
<ls>PāṇŚ. (RV.) 35.</ls>
And new literary source
PāṇŚ. (RV.) == PāṇinīyaŚikṣā, Rg Veda recension [Cologne addition]
Possibly useful for a link: https://linguindic.com/data/browse/texts/16/
Only about 7 instances.
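With so few instances a manual edit may be simpler, but the OLD to NEW change could be sketched as a single substitution. The function name `merge_pans_rv` is hypothetical, and the pattern assumes the text matches the OLD form exactly, including the comma and the same Unicode normalization of PāṇŚ.

```python
import re

# Matches the OLD form: three separate <ls> elements around "(RV.),".
OLD = re.compile(r"<ls>PāṇŚ\.</ls> \(<ls>RV\.</ls>\), <ls>([^<]*)</ls>")


def merge_pans_rv(line):
    """Rewrite the three-element OLD markup as the single NEW element."""
    return OLD.sub(r"<ls>PāṇŚ. (RV.) \1</ls>", line)


old_line = "<ls>PāṇŚ.</ls> (<ls>RV.</ls>), <ls>35.</ls>"
new_line = merge_pans_rv(old_line)
print(new_line)
```

Lines that do not contain the OLD form are returned unchanged, so the function can be run over the whole file.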
Under https://github.com/sanskrit-lexicon/PWG/issues/23, a comment noticed an unmarked RV reference in MW:
Once noticed, we can change the markup: <ls n="RV.">vi, 62, 7</ls>. But these are hard to find. I may have some software used in PW or PWG that would be applicable; will look for it.
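One possible heuristic for finding such cases, offered only as a sketch and not as the PW/PWG software mentioned above. It assumes an unmarked reference looks like a bare "roman numeral, number, number" triple outside any <ls> element. The function name `find_unmarked_candidates` and the sample line are invented.

```python
import re

# An existing <ls ...>...</ls> element with no nested markup.
LS_ELEMENT = re.compile(r"<ls[^>]*>[^<]*</ls>")
# Assumed shape of an unmarked reference: lowercase roman numeral,
# then two numbers, e.g. "vi, 62, 7".
BARE_TRIPLE = re.compile(r"\b[ivxlc]+, \d+, \d+\b")


def find_unmarked_candidates(line):
    """Return (offset, text) for triples that are not inside <ls> markup."""
    # Blank out marked references, keeping offsets aligned with the input.
    masked = LS_ELEMENT.sub(lambda m: " " * len(m.group(0)), line)
    return [(m.start(), m.group(0)) for m in BARE_TRIPLE.finditer(masked)]


# Hypothetical line: one bare triple, one already-marked reference.
sample = 'foo vi, 62, 7; <ls n="RV.">x, 14, 3</ls>'
candidates = find_unmarked_candidates(sample)
print(candidates)
```

The results are only candidates for manual review: the pattern cannot tell which work a triple refers to, and an ordinary word made of the letters i, v, x, l, c followed by two numbers would also match.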