sanskrit-lexicon / CORRECTIONS

Correction history for Cologne Sanskrit Lexicon

BUR Devanagari-IAST comparison #296

Closed funderburkjim closed 8 years ago

funderburkjim commented 8 years ago

The Burnouf dictionary also gives both a Devanagari and IAST form for the headwords. So it should be possible to do a comparison of these, as was done with BEN in #287, as a means to check both spellings.

funderburkjim commented 8 years ago

BUR uses 'w' in his IAST for the more usual 'v'; e.g.

image

Probably we should consider this a 'feature' of BUR, rather than a bug; and take this quirk into account when doing the Devanagari-IAST comparison. (i.e., not consider IAST 'hwal' a spelling error).

[When a comparison program is written, other peculiarities of IAST may emerge; and if so, should be similarly considered non-errors.]
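As a rough illustration of the comparison being proposed (not the program that was eventually written), the check could be sketched as below. It assumes each headword is available as a pair: the Devanagari form already in SLP1, and the IAST form as typed. The names `find_mismatches` and `toy_transcode` are hypothetical, and apart from 'hwal' the sample records are made up for the example.

```python
def toy_transcode(iast):
    """Toy stand-in for a real IAST-to-SLP1 transcoder.

    It only handles the BUR quirk discussed here: 'w' written for 'v'.
    """
    return iast.replace("w", "v")


def find_mismatches(records, to_slp1):
    """Return the (devanagari_slp1, iast) pairs whose spellings disagree.

    records  -- iterable of (devanagari_slp1, iast) pairs
    to_slp1  -- function converting the IAST field to SLP1
    """
    return [(deva, iast) for deva, iast in records if to_slp1(iast) != deva]


# Hypothetical sample data; only 'hwal' comes from the discussion above.
records = [
    ("hval", "hwal"),  # differs only by the w/v quirk, so not reported
    ("gam", "gan"),    # genuine disagreement, so reported
]

mismatches = find_mismatches(records, toy_transcode)
print(mismatches)
```

Folding the quirk into the transcoder, rather than special-casing it in the comparison, keeps any further peculiarities of BUR's IAST in one place.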

Do others concur?

gasyoun commented 8 years ago

not consider IAST 'hwal' a spelling error

Fully agree.

funderburkjim commented 8 years ago

The preparatory work has been done, and is in this issue-296prep directory.

645 cases are identified (see hwchk_iast1.org, use raw text or Emacs to view).

The IAST appears in our digitization bur.txt in a variant of the AS (Anglicized Sanskrit) coding (usually letters and numbers). Here are the main differences from the usual AS coding conventions. (The 'out' values are in the SLP1 transliteration of Devanagari.)

<e><s>INIT</s><in>r2</in><out>f</out></e>
<e><s>INIT</s><in>ç</in><out>S</out></e>
<e><s>INIT</s><in>s2</in><out>z</out></e>
<e><s>INIT</s><in>x</in><out>kz</out></e>
<e><s>INIT</s><in>ao</in><out>O</out></e>
<e><s>INIT</s><in>ae</in><out>E</out></e>
<e><s>INIT</s><in>n4</in><out>M</out></e>
<e><s>INIT</s><in>ch</in><out>C</out></e>
<e><s>INIT</s><in>w</in><out>v</out></e>

Two other conversions, noticed during case examination:

<e><s>INIT</s><in>l2i</in><out>x</out></e>
<e><s>INIT</s><in>l2</in><out>L</out></e>
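For readers who want to try these rules, here is a minimal sketch of how they might be applied. It assumes the rules are used as plain string substitutions, ignores the `<s>INIT</s>` state (all rules listed share it), and covers only the differences listed in this comment, not the full AS-to-SLP1 mapping. The function names `load_rules` and `make_transcoder` are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# The eleven rules listed in this comment, copied verbatim.
RULES_XML = """
<e><s>INIT</s><in>r2</in><out>f</out></e>
<e><s>INIT</s><in>ç</in><out>S</out></e>
<e><s>INIT</s><in>s2</in><out>z</out></e>
<e><s>INIT</s><in>x</in><out>kz</out></e>
<e><s>INIT</s><in>ao</in><out>O</out></e>
<e><s>INIT</s><in>ae</in><out>E</out></e>
<e><s>INIT</s><in>n4</in><out>M</out></e>
<e><s>INIT</s><in>ch</in><out>C</out></e>
<e><s>INIT</s><in>w</in><out>v</out></e>
<e><s>INIT</s><in>l2i</in><out>x</out></e>
<e><s>INIT</s><in>l2</in><out>L</out></e>
"""


def load_rules(xml_text):
    """Parse <e> entries into an {in: out} dictionary."""
    root = ET.fromstring("<rules>" + xml_text + "</rules>")
    return {e.findtext("in"): e.findtext("out") for e in root.iter("e")}


def make_transcoder(rules):
    """Build a function that applies the rules in a single left-to-right pass."""
    # Longest 'in' strings first, so 'l2i' is tried before 'l2'.
    keys = sorted(rules, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: rules[m.group(0)], text)


bur_to_slp1 = make_transcoder(load_rules(RULES_XML))
print(bur_to_slp1("hwal"))  # hval
```

A single pass matters here: the output 'x' produced from 'l2i' is never rescanned, so it is not turned into 'kz' by the separate rule for input 'x'.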

As time goes by, I'll examine and modify hwchk_iast1_edit.org for needed corrections, then harvest the results as standard form corrections. If anyone wants to do some of these, let me know so we can divide the work.

funderburkjim commented 8 years ago

All these cases have now been examined, and the corrections installed.

Here are some stats:

IAST-p  135   corrections to IAST which were judged to be due to an error in the printed text
DEVA-p   97   corrections to Devanagari
IAST-n   60   false positives of one kind or another
IAST-t  224   corrections to IAST which were judged to be due to typist error
DEVA-t  129   typist errors in Devanagari

TOTAL   645

gasyoun commented 8 years ago

consistently uses an 's' in IAST to represent the visarga. This appears to me to be an ambiguity in the IAST which the IAST-SLP1 transcoding cannot (and should not be able to) resolve. Yet, it was not considered to be an error that should be changed.

Fully agree with your choice. An oddity, good that it's documented now.