Closed Andhrabharati closed 2 years ago
number followed by letter:
0f
to be changed as of
0law-book
to be changed as law-book
3oth
to be changed as 30th
4<s1
to be changed as 4 <s1
4tb
to be changed as 4th
5<s1
to be changed as 5 <s1
5jewels
to be changed as 5 jewels
6o
to be changed as 60
11sth
to be changed as 11th
[2 places]ccl. 2.
to be changed as col. 2.
[2 places]p.802col.2
to be changed as p.802 col.2
p.1118col.1.
to be changed as p.1118 col.1.
p.1200col.3.
to be changed as p.1200 col.3.
need for uniformity in punctuation marking:
’. (63 instances) and .’ (207 instances)
’, (13628 instances) and ,’ (33 instances)
’; (782 instances) and ;’ (1 instance)
[1-3]\. <ab>sg (892 instances) and [1-3] <ab>sg (12 instances)
[1-3]\. <ab>du (139 instances) and [1-3] <ab>du (1 instance)
[1-3]\. <ab>pl (839 instances) and [1-3] <ab>pl (23 instances)
p\. [0-9]+ (4632 instances) and p\.[0-9]+ (468 instances)
<ab>p\.</ab> [0-9]+ (102 instances) and <ab>p\.</ab>[0-9]+ (1 instance)
col\. [0-9]+ (1565 instances) and col\.[0-9]+ (2 instances)
<ab>col\.</ab> [0-9]+ (61 instances) and <ab>col\.</ab>[0-9]+ (604 instances)
missing space before '=':
(62609): some=
(95618): <s>-liliśire</s>)=
(100550): <ls>AV.</ls>= ‘diarrhoea’.
(195423): Kollam 336=
(511483): (also)= <s>bhū</s>
(686740): <ab>mfn.</ab>= <s>°vat</s>
(769991): <s>sana</s>=
Mis-matched pairing of [...] :
(38354): , <ls>Pāṇ.</ls>]
[<ls>Pāṇ.</ls>]
(85683): [<ls>L.</ls>
[<ls>L.</ls>]
(85686): [<ls>L.</ls>
[<ls>L.</ls>]
(85689): [<ls>L.</ls>
[<ls>L.</ls>]
(169602): <s>dhuni</s>, j
<s>dhuni</s>]
Mis-matched pairing of (...) :
(15609): <s>-devatā</s>)
<s>-devatā</s>
(15612): <s>-devatā</s>)
<s>-devatā</s>
(29427): ([<ls>MaitrS.</ls>; <ls>VS.</ls>]
[<ls>MaitrS.</ls>; <ls>VS.</ls>]
(29430): ([<ls>MaitrS.</ls>; <ls>VS.</ls>]
[<ls>MaitrS.</ls>; <ls>VS.</ls>]
(40536): <s>-pūrvaka</s>)
<s>-pūrvaka</s>
(40545): <s>-pūrvaka</s>)
<s>-pūrvaka</s>
(41022): [<ls>RV. x, 152, 2</ls>; <ls>AV.</ls> &c.]) ([<ls>ŚBr.</ls>]
[<ls>RV. x, 152, 2</ls>; <ls>AV.</ls> &c.] [<ls>ŚBr.</ls>]
(41025): [<ls>RV. x, 152, 2</ls>; <ls>AV.</ls> &c.]) ([<ls>ŚBr.</ls>]
[<ls>RV. x, 152, 2</ls>; <ls>AV.</ls> &c.] [<ls>ŚBr.</ls>]
(95025): <ab>B.</ab>)
(<ab>B.</ab>)
(100964): (<s>ās</s> ¦
¦
; this is a word ending, mostly being 'removed' throughout
(114152): <ls>Ragh.</ls>)
<ls>Ragh.</ls>
; print correction
(131776): (<ab>B.</ab>
(<ab>B.</ab>)
(275105): (? for <s>dhaṅka-m</s>
(? for <s>dhaṅka-m°</s>)
(290174): a) kind of drama
a kind of drama
(312310): (<ls n="MBh.">ii, 983</ls>-<ls n="MBh.">1203</ls>
(<ls n="MBh.">ii, 983</ls>-<ls n="MBh.">1203</ls>)
(378395): or <s>°lā<srs/>bda</s>
(or <s>°lā<srs/>bda</s>
(378395): (which begins on the 20th October, <ab>A.D.</ab> 879.
(which begins on the 20th October, <ab>A.D.</ab> 879.)
(378398): or <s>°lā<srs/>bda</s>
(or <s>°lā<srs/>bda</s>
(378398): (which begins on the 20th October, <ab>A.D.</ab> 879.
(which begins on the 20th October, <ab>A.D.</ab> 879.)
(378401): or <s>°lā<srs/>bda</s>
(or <s>°lā<srs/>bda</s>
(378401): (which begins on the 20th October, <ab>A.D.</ab> 879.
(which begins on the 20th October, <ab>A.D.</ab> 879.)
(436419): <ab>wk.</ab>)
<ab>wk.</ab>
(467381): <ab n="praise">pr°</ab>’
<ab n="praise">pr°</ab>’)
(544670): (<ls>HPariś.</ls>
(<ls>HPariś.</ls>)
(544673): <lex>n.</lex>)
<lex>n.</lex>
(690877): (<ab>Sch.</ab>
(<ab>Sch.</ab>)
(714228): <ab n="Germany">G°</ab>)
<ab n="Germany">G°</ab>
(821440): <ls>RV. x, 133</ls>
<ls>RV. x, 133</ls>)
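Mismatches of this kind can be located mechanically. The following is only a minimal sketch, assuming that each entry body sits on a single line and that only ( ) and [ ] need checking; the function name `unbalanced` is hypothetical and not part of any Cologne script. Enumerations such as "a)" would show up as false positives, so the output is a list of candidates for manual review.

```python
# Map each closing bracket to the opening bracket it must pair with.
PAIRS = {")": "(", "]": "["}

def unbalanced(line):
    """Return sorted (position, char) pairs for brackets that have no partner."""
    stack = []      # opening brackets still waiting for their close
    problems = []   # closing brackets that did not match the latest open
    for pos, ch in enumerate(line):
        if ch in "([":
            stack.append((pos, ch))
        elif ch in PAIRS:
            if stack and stack[-1][1] == PAIRS[ch]:
                stack.pop()
            else:
                problems.append((pos, ch))
    # Whatever is left on the stack was opened but never closed.
    return sorted(problems + stack)

if __name__ == "__main__":
    for sample in ["([<ls>MaitrS.</ls>; <ls>VS.</ls>]", "[<ls>L.</ls>]"]:
        print(sample, "->", unbalanced(sample))
```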
(204651):
¦ inserted, interpolated, <ls>R. ii, <ab>ch.</ab> 96</ls> <ab>Sch.</ab>; <ls>Naiṣ. xxii, 48</ls> <ab>Sch.</ab><info lex="inh"/>
has become
¦ inserted, interpolated, ; -------------------------------------------------------
all the missed matter after the comma is to be filled up!
(516867):
<lex>f.</lex>
(A.)
has become
<lex>f.</lex>
(<ls>A.</ls> ;[Apte dictionary])
;[Apte dictionary]
to be removed which was my comment
(144769):
<ab>Gr.</ab> 969)
has become
<ls>Gr. 969</ls> )
space to be deleted before the closing brace.
Root symbol (√) and <s> tag:
There are 4625 √ <s> instances and 360 <s>√ instances.
Shouldn't all be with same sequence-- √ either preceding (outside) or following (inside) the <s> tag?
As there are 2257 cases of type √ <hom>1.</hom> <s>, we can conclude that it should always precede.
But then, there are 2 cases of </hom> √ to consider.
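Counts like these can be reproduced with a few regular expressions. This is a rough sketch only: the pattern labels and the function `count_root_patterns` are hypothetical, and it assumes the text is read line by line as a list of strings.

```python
import re
from collections import Counter

# The four orderings discussed above; labels are illustrative only.
PATTERNS = {
    "root before <s>": re.compile(r"√ <s>"),
    "root inside <s>": re.compile(r"<s>√"),
    "root before <hom>": re.compile(r"√ <hom>[0-9]+\.</hom> <s>"),
    "root after </hom>": re.compile(r"</hom> √"),
}

def count_root_patterns(lines):
    """Count how often each ordering of the root symbol and tags occurs."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            counts[name] += len(pattern.findall(line))
    return counts

if __name__ == "__main__":
    sample = ["√ <s>kf</s> and <s>√kf</s>", "√ <hom>1.</hom> <s>kf</s>"]
    for name, n in count_root_patterns(sample).items():
        print(name, n)
```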
Capital letter following a small letter in a tagged entry, where it shouldn't be so:
(168816): <lex>f (A)n.</lex>
<lex>f (<s>ā</s>)n.</lex>
(265210): <i>jallālu 'ddIn</i>
<i>jallālu 'ddīn</i>
(269458): <etym>gIvēnu</etym>
<etym>gīvēnu</etym>
(300555): <s1>YamaYamī</s1>
<s1>Yama-Yamī</s1>
(303738): <s1>ŚrI</s1>
<s1>Śrī</s1>
(328281): <s1>LakṣmI</s1>
<s1>Lakṣmī</s1>
(328284): <s1>LakṣmI</s1>
<s1>Lakṣmī</s1>
(476884): <etym>fSu</etym>
<etym>fshu</etym>
(585671): <etym>rathaestA</etym>
<etym>rathaestā</etym>
(585677): <etym>rathaestA</etym>
<etym>rathaestā</etym>
(628613): <etym>virSús</etym>
<etym>virshùs</etym>
[Note. I had deleted all the slp1 strings in my file, for convenience sake.]
Number before a <s> tag, either indicating a missing <hom> tag or a typo:
[0-9] <s>
(60681): 1 <s>ali</s>
<hom>1.</hom> <s>ali</s>
(227757): 2 <s>gir</s>
<hom>2.</hom> <s>gir</s>
(227757): 2 <s>gīrṇá</s>
<hom>2.</hom> <s>gīrṇá</s>
(249605): 1. 2. 3 <s>cit</s>
<hom>1.</hom> <hom>2.</hom> <hom>3.</hom> <s>cit</s>
(352745): 1. and 2 <s>navya</s>
<hom>1.</hom> and <hom>2.</hom> <s>navya</s>
(353175): 2 <s>-áka</s>
<hom>2.</hom> <s>-áka</s>
(441714): 1 <s>mi</s>
<hom>1.</hom> <s>mi</s>
(469506): 1 <s>prā<srs/>ṅ-nyāya</s>
<hom>1.</hom> <s>prā<srs/>ṅ-nyāya</s>
-------------------
[0-9], <s>
(65237): 4, <s>liyat</s>
<s>-līyat</s>
(205493): 2, <s>kṣúdh</s>
<hom>2.</hom> <s>kṣúdh</s>
(357525): 1, <s>náva</s>
<hom>1.</hom> <s>náva</s>
(415453): 2, <s>pat</s>
2, <ls>Pat.</ls>
(419111): 2, <s>as</s>
<hom>2.</hom> <s>as</s>
(425563): 1. and 2, <s>puri</s>
<hom>1.</hom> and <hom>2.</hom> <s>puri</s>
(464659): 1, <s>si</s>
<hom>1.</hom> <s>si</s>
(475781): 5, <s>i</s>
<hom>5.</hom> <s>i</s>
(507691): 1, <s>bhī</s>
<hom>1.</hom> <s>bhī</s>
(555959): 1, <s>mura</s>
<hom>1.</hom> <s>mura</s>
(665267): 1, <s>ru</s>
<hom>1.</hom> <s>ru</s>
(708977): 2, <s>śad</s>
<hom>2.</hom> <s>śad</s>
(752772): 1. 2, <s>vṛ</s>
<hom>1.</hom> <hom>2.</hom> <s>vṛ</s>
(753242): 2, <s>saṃ-vedya</s>
<hom>2.</hom> <s>saṃ-vedya</s>
(791067): 7, <s>sa</s>
<hom>7.</hom> <s>sa</s>
(801680): 7, <s>sa</s>
<hom>7.</hom> <s>sa</s>
(803606): 1, <s>sam-udra</s>
<hom>1.</hom> <s>sam-udra</s>
-------------------
[0-9]\. <s>
(7485): 1. <s>ajá</s>
<hom>1.</hom> <s>ajá</s>
(7485): 1. <s>ajana</s>
<hom>1.</hom> <s>ajana</s>
(12602): 1. <s>kṛ</s>
<hom>1.</hom> <s>kṛ</s>
(13338): 3. <s>á-diti</s>
<hom>3.</hom> <s>á-diti</s>
... LIST CONTINUES (~600 lines to check manually)
-------------------
(35207): 1: <s>kṛ</s>
<hom>1.</hom> <s>kṛ</s>
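Since most of the cases above follow one shape (a single digit, an optional separator, then a space and <s>), a proposal list for manual checking could be generated roughly as below. This is a sketch under assumptions: `propose_hom` is a hypothetical helper, it wraps only the digit directly before <s> (so "1. 2. 3 <s>cit</s>" is only partly handled), and cases such as 65237 and 415453 above show that not every hit is a missing <hom>, so the output should be reviewed and not applied blindly.

```python
import re

# One digit (not part of a longer number), an optional '.', ',' or ':',
# then a space and an opening <s> tag.
NUM_BEFORE_S = re.compile(r"(?<![0-9])([0-9])[.,:]? <s>")

def propose_hom(line):
    """Return the line with the digit before <s> wrapped in a <hom> tag."""
    return NUM_BEFORE_S.sub(r"<hom>\1.</hom> <s>", line)

def candidates(lines):
    """Yield (line_number, old, proposed) for every line that would change."""
    for number, old in enumerate(lines, start=1):
        new = propose_hom(old)
        if new != old:
            yield number, old, new

if __name__ == "__main__":
    for item in candidates(["1 <s>ali</s>", "2, <s>kṣúdh</s>", "no digits here"]):
        print(item)
```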
About 13000+ lines were altered.
The work was done in the issue137 directory.
I aimed to include all the items mentioned by Andhrabharati. In addition, several additional changes were made with the objective of bringing certain details of the digitization into better conformity with the printed text. No doubt there are other similar changes that will be made as such differences are noticed; let these be discussed in future issues.
This is one instance where I think there is a good reason for the digitization to vary from the printed text. The printed text invariably puts punctuation (comma, period, semicolon) BEFORE (inside) the closing quote of quoted text. But I have changed to uniformly put the punctuation AFTER (outside) the closing quote: For instance:
OLD (agrees with print):
<s>aMSu—Dara</s> ¦ <lex>m.</lex> ‘bearer of rays,’ the sun, <ls>L.</ls>
NEW
<s>aMSu—Dara</s> ¦ <lex>m.</lex> ‘bearer of rays’, the sun, <ls>L.</ls>
The reason for the change is that the ending comma, etc. is not part of the quote, but rather separates the quote from other semantic chunks. Note: in the case of a period, there are a very small number of cases where an ending period IS part of the quoted text, and has thus been left inside, for example:
[The title <s>AcArya</s> affixed to names of learned men is rather like our ‘Dr.’; <ab>e.g.</ab> <s>rAGavA<srs/>cArya</s>, &c.]
[<ab>fr.</ab> √ <hom>1.</hom> <s>kf</s>, ‘= <s>kurvARa</s>, <s>kartf</s>, &c.’, <ls>Sāy.</ls>]
This is consistent with the slp1 commit of csl-orig/v02/mw.txt.
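The quote-punctuation change described above could be sketched as follows. This is not the script used in the issue137 directory; `move_punctuation_out` and its `keep_period_after` list are hypothetical, the list merely imitating the two exceptions quoted above ('Dr.' and '&c.'), and the real set of exceptions was chosen by hand.

```python
import re

# A comma, semicolon or period directly before the closing quote mark.
INSIDE = re.compile(r"([,;.])’")

def move_punctuation_out(text, keep_period_after=("Dr", "&c")):
    """Move , ; . from before the closing quote to after it.

    A period is left in place when the text before it ends with one of the
    strings in keep_period_after (an assumed, illustrative exception list).
    """
    def fix(match):
        mark = match.group(1)
        before = text[:match.start()]
        if mark == "." and before.endswith(keep_period_after):
            return match.group(0)   # period belongs to the quoted text
        return "’" + mark
    return INSIDE.sub(fix, text)

if __name__ == "__main__":
    print(move_punctuation_out("‘bearer of rays,’ the sun"))
    print(move_punctuation_out("rather like our ‘Dr.’; e.g."))
```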
Before closing this issue, I'll wait a couple of days to deal with errors or omissions in the way I handled the mw cleanup items of this issue.
My intention then is to take a break from mw changes, and return attention to the ongoing ls-cleanup of PW and PWG.
(- - u u - -) to be changed as (- - ˘ ˘ - -)
@Andhrabharati sure?
The printed text invariably puts punctuation (comma, period, semicolon) BEFORE (inside) the closing quote of quoted text. But I have changed to uniformly put the punctuation AFTER (outside) the closing quote
@funderburkjim can we document it in a .txt readme, so as not to forget where there is such a CHANGE by intention?
Discovered two errors, and corrected. See 'correct two errors. modify change_5.txt' section of issue137 readme.txt for details.
Also revised iast version: temp_mw_issue137_iast_rev.zip
Can we document the intentional change?
A note was made in mw_printchange.txt file of csl-corrections repository: https://github.com/sanskrit-lexicon/csl-corrections/commit/b7ccd24d1988e8cda9105adcd18eda3d1c9ba1b0
(- - u u - -) to be changed as (- - ˘ ˘ - -)
This occurs under <L>82334<pc>435,3<k1>tanumaDyA
This note from readme.txt of issue137:
NOTE:
1. (- - u u - -) to be changed as (- - ˘ ˘ - -)
Instead change to (¯ ¯ ˘ ˘ ¯ ¯), as used 48 times elsewhere for meter
i.e., I used the unicode macron (\u00af) for long.
@funderburkjim
Found some interesting points reg. AND/OR grouping elements!
There are 6 single <L> elements in AND groups
(36310, 37336, 45103, 59037, 72383, 80300)
and a whopping 100+ single <L> elements in OR groups!
(5295.1, 5963, 6230, 9218, 13040, 13046, 13293, 16421, 16441, 19425, 21437, 21529, 29168, 29633, 29831, 46491, 46738, 49740, 52477, 53475, 57080, 58684.12, 62547, 71080, 91798, 95320, 96358, 97425, 98426, 99465, 99624, 110675, 115003, 115399, 116989, 120532, 129300, 135725, 139457, 141203, 144737, 145504, 148500, 148573, 154262, 157214, 158837, 159177, 166051, 167026, 169540, 180428, 183881, 186088, 186226, 186289, 186645, 188644, 188650, 188663, 191027, 192063, 193598, 194958, 195242, 195996, 196890, 200007, 203319, 203480, 205118, 205142, 205180, 205211, 206064, 206364, 206466, 208147, 210505, 210902, 213989, 216907, 219834, 220857, 223679, 231039, 231079, 237161, 239670, 239811, 239992, 245058, 246345, 247829, 247858, 247867, 248288, 250081, 250879, 252557, 252708, 256290, 259869, 260454, 261644, 262061)
Noticed that these are mostly with accent differences or hyphenation differences.
Would you like to correct this point, as you feel appropriate?
Also there are 6 <L> entries whose body portion is ending as 'or'
(1962, 9981, 96042, 156088, 169950, 171580)
and one entry with body-ending as 'of' (which is a typo for 'or')
(4074)
And there is one entry with body-ending as 'and' (95389)
These should be combined with the following entries appropriately and then to be made as "proper" grouped entries.
single L groups
It seems reasonable to retain such markup, as it identifies the headwords which have more than one accent variant.
Do you have a better way to do this markup?
There are 8 <L> entries ending with <ab>w.r.</ab> for, which need to be combined with the next entries properly.
(92603, 95205, 98434, 104490, 107642, 114508, 125521, 131402)
Have handled the additional 16 'L' references mentioned in two previous comments. See 'temp_change_or1' and 'temp_change_or2' in readme.
Handled the w.r. cases with a new info attribute: <info orwr="..."/>.
@Andhrabharati Ok to close this issue?
@funderburkjim
There are about 2000 more w.r. instances in the text; but it may be alright to close this issue for now (with a final update of iast).
This issue appears to have tackled quite many points at once.
We can come back to MW sometime later, after finishing the long-pending ls-cleaning in the PWG family (PWG, pwk, pwkvn and SCH). [I am thinking of giving out my 'resolutions' for all the 'unidentified' entities in these this time.]
Speaking of ls-resolutions, you are yet to 'finally' correct the RLM (in the MW) as mentioned recently, as at https://github.com/sanskrit-lexicon/MWS/issues/135#issuecomment-1208133347
Probably you may also consider changing the remaining [noticed that the count has now come down to 300+ from the earlier 800+] ṉ to ṃ; 3 of which are in the ls strings as napuṉs. and the remaining are in the main text at the s1, ns or ab (expansion) strings.
[The single Zend etym-string aiwyāoṉhana at <L>19258.1 may also be changed as above, as this language is considered as a sister language to Sanskrit.]
Noticed ~300 â instances, which should've been à within the 100+ <s> strings and in the corresp. meta lines.
Talking of the caret instances above, got reminded of another issue (#107) that might also be considered in the MW spree now.
There is one instance (line 295319) where √ is not followed by a space; and one instance (line 365916) where div n="to"/> is not preceded by the <.
tooltip altered. Good find! (headword kAkaciYcika).
About 5000 lines changed.
These address the points above starting at https://github.com/sanskrit-lexicon/MWS/issues/137#issuecomment-1235047261.
More detailed notes are found in the readme, starting at change_6: Extended Ascii changes in ls.
changes â -> ā, ê -> e, î -> ī, ô -> o, û -> ū, ṉ -> ṃ in 3 places:
<ls>Divyâd.</ls> -> <ls>Divyād.</ls>
<ls> in mw.txt
<s1 slp1="mAMsarohiRI">Māṉsarohiṇī</s1> -> <s1 slp1="mAMsarohiRI">Māṃsarohiṇī</s1>
Note that no changes were made in the <etym> elements. In particular, aiwyāoṉhana at <L>19258.1 was not changed.
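A replacement that touches some elements and leaves <etym> alone can be sketched as below. This is an illustration, not the change_6 script: `normalize` is a hypothetical helper, and restricting it to <ls> and <s1> elements is an assumption based on the two examples given above. The characters are written as Unicode escapes so that the mapping is unambiguous.

```python
import re

# â→ā, ê→e, î→ī, ô→o, û→ū, ṉ→ṃ (precomposed code points).
TABLE = str.maketrans({
    "\u00e2": "\u0101",  # â -> ā
    "\u00ea": "e",       # ê -> e
    "\u00ee": "\u012b",  # î -> ī
    "\u00f4": "o",       # ô -> o
    "\u00fb": "\u016b",  # û -> ū
    "\u1e49": "\u1e43",  # ṉ -> ṃ
})

# An <ls> or <s1> element with optional attributes and its text content.
ELEMENT = re.compile(r"<(ls|s1)\b([^>]*)>(.*?)</\1>")

def normalize(line):
    """Apply TABLE inside <ls> and <s1> content only; attributes and other
    elements (for example <etym>) are left unchanged."""
    def fix(match):
        tag, attrs, body = match.groups()
        return f"<{tag}{attrs}>{body.translate(TABLE)}</{tag}>"
    return ELEMENT.sub(fix, line)

if __name__ == "__main__":
    print(normalize("<ls>Divy\u00e2d.</ls>"))
    print(normalize("<etym>aiwy\u0101o\u1e49hana</etym>"))
```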
Noticed 300 â instances, which should've been à within the 100+ <s> strings and in the corresp. meta lines.
I'm not sure what was intended here --
Here's latest iast version of mw digitization: temp_mw_issue137_iast_rev2.zip
Recently noticed many (800+) instances of ([X]). I think @Andhrabharati previously also noticed these as needing change. From a small sample examination of print, I concluded these should be changed to [X]. See change_8.txt for these changes.
This ends my remarks regarding change_6 through change_8.
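For reference, the ([X]) to [X] change can be expressed as a single substitution. This is a sketch only, not the contents of change_8.txt; `unwrap` is a hypothetical name, and the pattern assumes that X itself contains no further parentheses or square brackets.

```python
import re

# A '(' immediately followed by a bracketed group and then ')'.
WRAPPED = re.compile(r"\((\[[^\[\]()]*\])\)")

def unwrap(line):
    """Drop the parentheses around a bracketed group: ([X]) becomes [X]."""
    return WRAPPED.sub(r"\1", line)

if __name__ == "__main__":
    print(unwrap("([<ls>MaitrS.</ls>; <ls>VS.</ls>])"))
```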
Glad that you are considering my above suggestions, @funderburkjim !
Looking at the mwauth corrections-
I think a good revision/re-look/vetting of all the ls-expansions is required sometime sooner. [I was just looking at the ? marked (or unlisted) ls-entries thus far in MW.]
As a glaring example, the Gaṇaratnāv. is not Gaṇaratnamahodadhi, but is Gaṇaratnāvalī!!
It is the "collection of Gaṇas to Pāṇini's gr. based on Gaṇaratnamahodadhi & other gr. & lex. works; composed in 1874 A. D. by Yajñeśvara Bhaṭṭa".
Should this be done now, or after completing the PWG ls-exercise? [Anyway, this should be dealt in another issue, but not here.]
I think @Andhrabharati previously also noticed these as needing change.
Yes, I had mentioned this earlier.
Noticed 300 â instances, which should've been à within the 100+ <s> strings and in the corresp. meta lines.
I'm not sure what was intended here --
Here's latest iast version of mw digitization: temp_mw_issue137_iast_rev2.zip
Pl. see under <L>550.2 in mw.txt (as example)
metaline <k2> akzitavya^
headline <s> akzitavya^
and the corresp. iast text
metaline <k2> akṣitavyâ
headline <s> akṣitavyâ
Here is the scan of the portion [now I have a very good scan of MW]
Do I make sense now, @funderburkjim ? [There are 108 such places in metalines.]
akzitavya^
OK, now I see your concern.
Using mw.txt (the slp1 version), my count is slightly different:
114 matches for "<k2>.*?\^"
In slp1, the spelling uses the ^ character as an accent. Which accent?
It is svarita: See frontmatter
Next, we have the question of how to represent, in displays, this svarita accent with diacritics. In the printed text, the svarita accent is represented by a backward 'grave' accent, and 'udAtta' by a forward 'acute' accent. There is no anudAtta mentioned or used.
The Cologne displays use a representation where svarita is represented by circumflex diacritic, udAtta by acute accent diacritic, and anudAtta by grave accent diacritic.
The iast version of MW which you are referring to also used the same 'Cologne' representation.
Thus, there is nothing that requires changing. Just remember that in IAST displays of MW, a Sanskrit word with circumflex-diacritic will appear in the MW printed text with a 'grave'-like diacritic.
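The two conventions can be put side by side in a small sketch. The slp1 markers (/ for udAtta, \ for anudAtta, ^ for svarita) and the diacritics are as described in this thread; the function `accent` and the scheme names "cologne" and "print" are hypothetical, and this is not the transcoder used by the Cologne displays.

```python
import unicodedata

# Combining diacritic for each slp1 accent marker, per convention.
COMBINING = {
    # Cologne displays: udAtta acute, anudAtta grave, svarita circumflex.
    "cologne": {"/": "\u0301", "\\": "\u0300", "^": "\u0302"},
    # MW print: udAtta acute, svarita grave; anudAtta is not used.
    "print": {"/": "\u0301", "^": "\u0300"},
}

def accent(vowel, marker, scheme="cologne"):
    """Return the vowel with the diacritic that the scheme uses for marker."""
    return unicodedata.normalize("NFC", vowel + COMBINING[scheme][marker])

if __name__ == "__main__":
    for scheme in ("cologne", "print"):
        print(scheme, accent("a", "^", scheme))
```

With the final vowel of akzitavya^ as input, the "cologne" scheme gives a with circumflex and the "print" scheme gives a with grave, which is the difference under discussion.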
<srs/> and svarita
In the printed text of MW a Sanskrit word often appears with a 'circumflex' diacritic. But this is NOT an accent. It is a special convention (described on the same front matter page mentioned above) for representing vowel-sandhi. See for instance, aMSAMSa
The representation in the Cologne digitization uses the empty xml tag <srs/>:
<s>aMSA<srs/>MSa</s>
Although MW describes 4 types of circumflex (representing short+short, short + long, etc.), the Cologne digitization does not distinguish among these types.
I have encountered a few cases where a vowel was coded with <srs/> but should have been coded with svarita. It seems likely that there are other such errors in the digitization.
good revision/re-look/vetting of all the ls-expansions
Definitely agree. You are the best person to do this. You could edit the file tooltip.txt and give me the resulting edited file for installation. Agree best to make another issue devoted to discussions arising during the review.
Thus, there is nothing that requires changing.
I would say otherwise-- There definitely is a need to do something here!
One year ago, (April 2021) while I was at MW work (for Cologne), these 100+ places were all properly converted/rendered as à in the metalines and the resp. headlines, in the IAST file you gave, as also the whole lot (127k) of other à throughout the text, as per the print matter. [The file is dated 4th April 2021] https://github.com/sanskrit-lexicon/MWS/issues/104#issuecomment-817359904
To make you see the difference more clearly, I am giving two examples now (comparing MW and PWG/pwk)--
vs. PWG & pwk
vs. pwkvn
And having à at these places makes the MW data tally with the original (sources) PWG/pwk data.
We don't have to reiterate that much of MW content is based on PWG family data, and they should be rendered in a similar fashion. [There should not be any second thought on this.]
As a glaring example, the Gaṇaratnāv. is not Gaṇaratnamahodadhi, but is Gaṇaratnāvalī!!
It is the "collection of Gaṇas to Pāṇini's gr. based on Gaṇaratnamahodadhi & other gr. & lex. works; composed in 1874 A. D. by Yajñeśvara Bhaṭṭa".
Just for info-- This work got printed after 100 years, in 1986.
Here is the title page and the list of works referred/cited therein--
[Probably @gasyoun might be interested to make a note of this info.]
@funderburkjim
Let's close this misc. corrections issue, with three more small corrections-
<div n="to"/><ab>[A-Z] (1100+ instances) as <div n="vp"/><ab>[A-Z] (2000+ instances presently), as all these denote vp type entities.
¦ , (19 instances) as , ¦ (5700+ instances presently)
... [three dots] (13 instances) as … [horiz. ellipsis] (no instance as of now)
[We can handle the remaining misc. corrections in another issue sometime later.]
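The three corrections listed above are simple enough to sketch as substitutions. This is only an illustration under assumptions: `apply_small_corrections` is a hypothetical helper, and the ellipsis rule replaces every run of exactly three dots it meets, so a longer run of dots would need manual attention.

```python
import re

def apply_small_corrections(line):
    """Apply the three small corrections proposed above to one line."""
    # 1. <div n="to"/> directly before <ab> plus a capital letter -> n="vp".
    line = re.sub(r'<div n="to"/>(?=<ab>[A-Z])', '<div n="vp"/>', line)
    # 2. Broken bar followed by a comma -> comma followed by broken bar.
    line = line.replace("¦ ,", ", ¦")
    # 3. Three ASCII dots -> horizontal ellipsis (U+2026).
    line = line.replace("...", "\u2026")
    return line

if __name__ == "__main__":
    print(apply_small_corrections('<div n="to"/><ab>P.</ab> x ¦ , y ... z'))
```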
Note IAST output for akzitavya^ in pwkvn:
And in mw:
Note the accent representation is the same in IAST.
For pwkvn:
For MW:
Note the Devanagari representation for svarita accent DIFFERS in MW and in pwkvn.
We have CHOSEN to make the Devanagari accent representation in PW, PWK, PWG consistent with the printed form of PWG, etc. Thus, the little vertical line over the vowel is used to represent svarita accent in PW, etc. (Similarly, udAtta is the little superscript devanagari 'u' in PW, etc.)
If you are wanting to compare MW with PW in terms of accents, then you should use either the slp1 representation or the IAST representation.
I still say no change is warranted at this time. If (as I suspect) you still disagree, I suggest you open a new issue devoted to this subject.
I doubt if you would be correcting these under a new issue, when not convinced about the point here itself; so I do not want to go that way.
If you don't like to bring these 100+ cases (â) within MW in line with the rest of 127k+ cases (à), which are all à in the print, you're the final judge as far as cologne data is concerned.
Thus, I leave the matter for now.
the rest of 127k+ cases (à), which are all à in the print,
126895 matches in 118274 lines for "a/" These are the instances (according to Cologne digitization) of the short vowel 'a' with udAtta accent. These are represented in print with an acute accent (e.g., under headword 'a'):
And, with output=iast in a Cologne display, they appear as a-with-acute-accent, á, not à
The a with grave accent (à) would be the Cologne iast representation of "a with anudAtta accent" (slp1 a\); there are none of these in MW.
Although I don't feel comfortable with changing the representation of svarita accent in Cologne mw displays, I'm not sure my view should prevail. I've opened another issue so the question of accent representation (especially in mw and in the PW family of dictionaries) will remain 'open' for some future consideration.
These mainly from above https://github.com/sanskrit-lexicon/MWS/issues/137#issuecomment-1236366820.
Also corrected several 'madA' entries to 'mada' -- some 'sub-entries' of 'mada' were incorrectly interpreted as feminine.
The details are in change_9.txt file of issue137 directory, and also mentioned in the readme at 'change_9' and following.
Here is the latest iast version of mw: temp_mw_issue137_iast_rev3.zip
Many varied improvements now made to the mw digitization and markup. Thanks to @Andhrabharati for his continued 'fresh look' at mw. Now closing the issue.
the rest of 127k+ cases (à), which are all à in the print,
It was my gross mistake to use the wrong character there.
This is iast version of mw digitization, with accent revised (so svarita accent = grave accent diacritic). For discussion, refer #140.
Fantastic; now the CDSL MW Roman text matches with the print.
Thanks a lot for relieving my worry, @funderburkjim !
As MW does not mark the Devanagari accents in the book, I am not that bothered about them in MW display and would leave the matter to the discretion of Jim, whether to match MW with PWG family or not.
However my final comment on the matter is to ask the team (@funderburkjim & @gasyoun) to just check the MW RV citations once with the corresponding linked RV text (courtesy: Marcis) and see if they notice any differences in Devanagari accents, and then compare the PWG RV citations with the RV links thereupon. [Probably, my point would be appreciated then.]
quote marks:
'e<srs/>hi mā yāsīr!'
to be changed as ‘e<srs/>hi mā yāsīr!’
the esoteric
to be changed as ‘the esoteric’
first father
' to be changed as ‘first father’
‘perhaps it is and is not and is not expressible in words'
to be changed as ‘perhaps it is and is not and is not expressible in words’
‘a ‘goat’, a derivation in the sense of, goat's flesh’
to be changed as ‘a goat’, a derivation in the sense of ‘goat's flesh’
‘the ‘<s1>Śabara</s1>s’ food’
to be changed as ‘the <s1>Śabara</s1>s' food’
apostrophe:
<s>-gate' hani</s>
to be changed as <s>-gate 'hani</s>
60 years ' cycle
to be changed as 60 years' cycle
multiplication mark:
[0-9]x [0-9] and 121 instances of [0-9] x [0-9] could be changed as the multiplication mark × (U+00D7).
prosody marking:
(- - u u - -)
to be changed as (- - ˘ ˘ - -)
miscellaneous: