Closed Andhrabharati closed 2 years ago
Interesting, never noticed before.
@Andhrabharati If you have a list of these, please provide, or else provide the search regex(es) you use. In a first look, I'm only finding 10 or so similar to your example above.
10 matches for "RV. [xvi]+, [0-9]+ [0-9]+, [0-9]" in buffer: mw.txt
I looked for "space between digits" ([0-9] [0-9]), not just for RV link cases.
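A minimal Python sketch of that search, narrowed here to the contents of <ls> elements (the original search was over the whole file). The function name find_digit_gaps and the sample lines are made up for illustration; they are not taken from mw.txt.

```python
import re

# Contents of one <ls ...>...</ls> element (attributes allowed on the tag).
LS_RE = re.compile(r"<ls\b[^>]*>(.*?)</ls>")
# Two digits separated by a single space, the pattern quoted above.
DIGIT_GAP_RE = re.compile(r"[0-9] [0-9]")


def find_digit_gaps(lines):
    """Yield (line_number, ls_text) for each <ls> element containing 'digit space digit'."""
    for lineno, line in enumerate(lines, start=1):
        for m in LS_RE.finditer(line):
            if DIGIT_GAP_RE.search(m.group(1)):
                yield lineno, m.group(1)


# Illustrative input only; real use would iterate over the lines of mw.txt.
sample = [
    "<ls>RV. x, 8 3, 5</ls>",
    "<ls>RV. x, 83, 5</ls>",
]
hits = list(find_digit_gaps(sample))
```

Each hit still needs a manual check against the printed text, since a space between digits is not always an error.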
Here is the extracted search result- space between digits.txt
Incidentally, there are some <pc> lines in this as well!
There are also quite a few cases where a number sits outside the <ls>...</ls> tag but needs to be within it.
I used the regex </ls>[;\.,:] [0-9] to get them.
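The same regex can be run from Python; this is only a sketch, and the helper name lines_with_trailing_number and the sample lines are assumptions for illustration.

```python
import re

# A closing </ls>, one punctuation mark, a space, then a digit:
# the number probably belongs inside the preceding <ls> element.
TRAILING_NUM_RE = re.compile(r"</ls>[;.,:] [0-9]")


def lines_with_trailing_number(lines):
    """Return (line_number, line) pairs where a number follows a closed <ls> tag."""
    return [
        (lineno, line)
        for lineno, line in enumerate(lines, start=1)
        if TRAILING_NUM_RE.search(line)
    ]


# Illustrative input only.
sample = [
    "<ls>RV. i, 139</ls>, 1",
    "<ls>RV. i, 139, 1</ls>; <ls>Anukr.</ls>",
]
found = lines_with_trailing_number(sample)
```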
@funderburkjim I just checked that PWG also has about 90 cases of "space between digits".
@Andhrabharati Thanks for alerting me of these problems. Will attend to them.
Empty source; this is the first time I have seen such an effect.
@funderburkjim can't figure out what's wrong here
He seems to have looked ONLY for the number pattern "Roman, IA, IA".
If the punctuation mark or space is different in that "block", he is not taking it as the 'Rigveda link'.
The current markup and link logic works for verses only; e.g. <ls>RV. x, 10, 5</ls> for verse 5 of hymn x, 10.
The yama example, by contrast, should be interpreted as a reference to two hymns (RV. x, 10 and RV. x, 14), with no verse specified.
the supposed author of <ls>RV. x, 10; 14</ls>, of a hymn to <s1 slp1="vizRu">Viṣṇu</s1> and of a law-book;
Perhaps the display program (basicadjust.php) can be extended to generate links for examples like <ls>RV. x, 10</ls>.
The other aspect of this yama example is that two references are implied, and that MW uses the semicolon (here between 10 and 14) to separate the two references.
A search for semicolons within RV references in mw.txt yields:
623 matches for "<ls>RV\. [xiv]+,[^<]*;" in buffer: mw.txt
All of these need to be examined and recoded where possible so that multiple links will be available. For instance, changes such as the following are desirable:
OLD
<ls>RV. i, 139, 1; iv, 44, 5.</ls>
NEW
<ls>RV. i, 139, 1</ls>; <ls n="RV">iv, 44, 5.</ls>
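A rough Python sketch of this OLD to NEW recoding, assuming the simplest case only: every part after a semicolon is itself a full reference beginning with a mandala numeral. The function name split_rv_ls is hypothetical, and this is not the script used for the actual corrections. Cases like <ls>RV. x, 10; 14</ls> are left untouched for manual review.

```python
import re

# An <ls> element without attributes whose text starts with "RV. ".
RV_LS_RE = re.compile(r"<ls>RV\. ([^<]*)</ls>")
# A part that begins with a lower-case Roman mandala numeral, e.g. "iv, ".
ROMAN_START_RE = re.compile(r"[ivx]+, ")


def split_rv_ls(text):
    """Split 'RV. a; b' references into one <ls> element per reference."""

    def repl(m):
        parts = m.group(1).split("; ")
        # Recode only when every later part is a full 'mandala, ...' reference.
        if len(parts) < 2 or not all(ROMAN_START_RE.match(p) for p in parts[1:]):
            return m.group(0)
        first = "<ls>RV. %s</ls>" % parts[0]
        rest = ['<ls n="RV">%s</ls>' % p for p in parts[1:]]
        return "; ".join([first] + rest)

    return RV_LS_RE.sub(repl, text)


old = "<ls>RV. i, 139, 1; iv, 44, 5.</ls>"
new = split_rv_ls(old)
```

Leaving the ambiguous cases unchanged is deliberate: whether "14" means a hymn or a verse depends on the entry, so those are better recoded by hand.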
Work will be carried out with an aim to improve the markup and display in these dimensions.
I give you my thanks.
623 is a lot and, at the same time, not a lot. I see them often!
The RV ls markup improvements mentioned above have been completed in MW. The work is done in mwauthorities/ls/20220628-rv.
Most of the changes are as predicted in the above comment. But several were typos with errors in spacing or punctuation. And a small number of errors involved homonym markup. For instance:
; <L>100473<pc>512,3<k1>Darmakft
; <ls>RV. viii.87, 1.2.</ls>
; <ls>RV. viii.87, 1.2.</ls> <<< That 2. should be the homonym number of 'next' entry
338230 old <hom>3.</hom> <s>Da/rma</s> ¦ in <ab>comp.</ab> for <s>°man</s> <ab>q.v.</ab> 2. <<< DROP the 2.
338230 new <hom>3.</hom> <s>Da/rma</s> ¦ in <ab>comp.</ab> for <s>°man</s> <ab>q.v.</ab>
; CHANGE the <h> value in next entry
338232 old <L>100473<pc>512,3<k1>Darmakft<k2>Da/rma—kft<h>b<e>3
338232 new <L>100473<pc>512,3<k1>Darmakft<k2>Da/rma—kft<h>2<e>3
; and similarly remark as hom 2.
338233 old <s>Da/rma—kft</s> <hom>b</hom> ¦ <lex>m.</lex> maintainer of order
(<s1 slp1="indra">Indra</s1>), <ls>RV. viii.87, 1.2.</ls><info lex="m"/> <<<< ALSO Drop this 2.
;
338233 new <hom>2.</hom> <s>Da/rma—kft</s> ¦ <lex>m.</lex> maintainer of order (<s1 slp1="indra">Indra</s1>), <ls>RV. viii, 87, 1.</ls><info lex="m"/>
; and do similar change for Darmavat: b change to 2
338235 old <L>100474<pc>512,3<k1>Darmavat<k2>Da/rma—vat<h>b<e>3
338235 new <L>100474<pc>512,3<k1>Darmavat<k2>Da/rma—vat<h>2<e>3
; change hom markup
338236 old <s>Da/rma—vat</s> <hom>b</hom> ¦ (<s>Da/rma</s>) <lex>mfn.</lex> accompanied by <s1 slp1="Darman">Dharman</s1> or the law (<s1 slp1="aSvin">Aśvin</s1>s), <ls>viii, 35, 13.</ls><info lex="m:f:n"/>
338236 new <hom>2.</hom> <s>Da/rma—vat</s> (<s>Da/rma</s>)
<lex>mfn.</lex> accompanied by <s1 slp1="Darman">Dharman</s1> or the law (<s1 slp1="aSvin">Aśvin</s1>s), <ls n="RV.">viii, 35, 13.</ls><info lex="m:f:n"/>
;; other homonyms of Darmakft and Darmavat
336540 old <L>99961<pc>510,3<k1>Darmakft<k2>Da/rma—kft<h>a<e>3
336540 new <L>99961<pc>510,3<k1>Darmakft<k2>Da/rma—kft<h>1<e>3
;
336541 old <s>Da/rma—kft</s> <hom>a</hom> ¦ <lex>mfn.</lex>
(2. See under 3. <s>Darma</s>) doing one's duty, virtuous, <ls>MBh.</ls><info lex="m:f:n"/>
336541 new <hom>1.</hom> <s>Da/rma—kft</s> ¦ <lex>mfn.</lex>
(<hom>2.</hom> See under <hom>3.</hom> <s>Darma</s>) doing one's duty, virtuous,
<ls>MBh.</ls><info lex="m:f:n"/>
;
337425 old <L>100234<pc>511,3<k1>Darmavat<k2>Da/rma—vat<h>a<e>3
337425 new <L>100234<pc>511,3<k1>Darmavat<k2>Da/rma—vat<h>1<e>3
;
337426 old <s>Da/rma—vat</s> <hom>a</hom> ¦ <lex>mfn.</lex> (2. See under 3.
<s>Darma</s>) virtuous, pious, just, <ls>L.</ls><info lex="m:f:n"/>
337426 new <hom>1.</hom> <s>Da/rma—vat</s> ¦ <lex>mfn.</lex> (<hom>2.</hom>
See under <hom>3.</hom> <s>Darma</s>) virtuous, pious, just, <ls>L.</ls><info lex="m:f:n"/>
There are many mentions of hymns, with no verse specified, such as
<ls>RV. viii, 13</ls>
in
356994 new <s>nA/rada</s> ¦ <lex>m.</lex> or <s>nArada/</s> <ab>N.</ab> of a
<s1 slp1="fzi">Ṛṣi</s1> (a <s1 slp1="kARva">Kāṇva</s1> or
<s1 slp1="kASyapa">Kāśyapa</s1>,
author of <ls>RV. viii, 13</ls>; <ls n="RV.">ix, 104</ls>; <ls n="RV. ix,">105</ls>;
<ls>Anukr.</ls>;
The basicadjust.php component of the displays is now adjusted so that this 2-parameter reference generates a link to the first verse of the hymn.
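The fallback can be pictured with a short Python sketch. This is not the PHP code in basicadjust.php; the function name rv_target and the return shape are assumptions, and no link URL is constructed.

```python
import re

# Lower-case Roman numerals for the ten mandalas of the Rigveda.
ROMAN = {"i": 1, "ii": 2, "iii": 3, "iv": 4, "v": 5,
         "vi": 6, "vii": 7, "viii": 8, "ix": 9, "x": 10}

# "RV. <mandala>, <hymn>" with an optional ", <verse>" and optional final period.
REF_RE = re.compile(r"^RV\. ([ivx]+), ([0-9]+)(?:, ([0-9]+))?\.?$")


def rv_target(ref):
    """Return (mandala, hymn, verse); the verse defaults to 1 when it is absent."""
    m = REF_RE.match(ref)
    if not m or m.group(1) not in ROMAN:
        return None  # not a reference this sketch understands
    verse = int(m.group(3)) if m.group(3) else 1
    return ROMAN[m.group(1)], int(m.group(2)), verse


hymn_only = rv_target("RV. viii, 13")
with_verse = rv_target("RV. x, 10, 5")
```

Returning None for anything that does not match keeps malformed references, such as those with a stray space between digits, from producing a misleading link.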
Here is an example of a likely markup error, just noticed by accident.
Under pragATa
The markup is
<ls>RV. viii, 1, 2</ls>;
<ls n="RV. viii, 1,">10</ls>;
<ls n="RV. viii, 1,">48</ls>;
<ls n="RV. viii, 1,">51</ls>-
<ls n="RV. viii, 1,">54</ls>
The markup looks consistent with the printed text, but it can't be right, since there is no verse 48 (or 51 or 54) in hymn 'viii, 1'. Maybe the markup should be hymns 1, 2, 10, 48, 51, and 54 of mandala viii?
<ls>RV. viii, 1</ls>, <ls n="RV. viii,">2</ls>;
<ls n="RV. viii,">10</ls>;
<ls n="RV. viii,">48</ls>;
<ls n="RV. viii,">51</ls>-
<ls n="RV. viii,">54</ls>
No doubt there are other similar problematic markups to identify and alter.
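One way to look for such cases is to compare each verse number against the number of verses in its hymn. The sketch below assumes a table of verse counts is available from some index; the count used for hymn viii, 1 is a placeholder assumption to be checked against a real source, and the function name out_of_range is hypothetical.

```python
def out_of_range(refs, verse_counts):
    """Return the (mandala, hymn, verse) refs whose verse exceeds the hymn's verse count."""
    flagged = []
    for mandala, hymn, verse in refs:
        count = verse_counts.get((mandala, hymn))
        # Hymns missing from the table are skipped rather than flagged.
        if count is not None and verse > count:
            flagged.append((mandala, hymn, verse))
    return flagged


# Placeholder count for hymn viii, 1 (an assumption; replace with real index data).
verse_counts = {(8, 1): 34}
# The pragATa references as currently marked up, read as verses of viii, 1.
refs = [(8, 1, 2), (8, 1, 10), (8, 1, 48), (8, 1, 51), (8, 1, 54)]
suspects = out_of_range(refs, verse_counts)
```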
Other ls abbreviations in MW with link targets should be reviewed in a manner similar to the above review of RV link. Such as AV., P.,
I think the spacing issues should now have been handled. The specific case RV. i, 37, i 2 was corrected to RV. i, 37, 12 (under 'cyu', mentioned here).
Hurray! A badly needed one around all the dictionaries and targets available.
Quite a few of such RV links to the Marcis's version (which is presently being used) would not be helpful, as those links do not give any clue about the meaning/intent in the MW.
All such should be linked to some other source, as I had proposed elsewhere recently.
Did not get why.
What do you mean?
@funderburkjim
See the entry "akratu" as an example.
The link goes to RV.x.8,3, which is in no way connected to the word akratu. The link should go to RV.x,83,5 instead; this requires removing the space between 8 and 3 in mw.txt.
I noticed ~100 such cases that need correction in the digitisation.