Closed funderburkjim closed 6 years ago
The text always uses the Latin alphabet with diacritics for Sanskrit words. Generally, the conventions of the text agree with modern IAST, but with the following differences:
There is some incompleteness in the conversion of 'sh' to ṣ. This conversion must be restricted to Sanskrit words, to avoid undesired conversions in English words such as 'should'. For the 'sh' conversion, the following assumptions were used:
I'm sure some 'sh' conversions in Sanskrit words were missed (for example, in words or abbreviations which are not in italics and don't have a diacritic).
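To illustrate the kind of restriction described above, here is a minimal sketch, not the code actually used for the conversion. It assumes that a word counts as Sanskrit if it sits inside an italic span (assumed here to be coded as `{%...%}`) or carries at least one diacritic; both rules are inferred from the remark about missed cases. The function names and the sample line are hypothetical.

```python
import re
import unicodedata

ITALIC = re.compile(r"\{%(.*?)%\}")   # assumed italic markup
WORD = re.compile(r"[^\W\d_]+")       # a run of letters

def has_diacritic(word):
    # True if some letter decomposes into a base letter plus a combining mark.
    return any(unicodedata.combining(ch)
               for ch in unicodedata.normalize("NFD", word))

def convert_sh(text):
    return text.replace("Sh", "Ṣ").replace("sh", "ṣ")

def convert_line(line):
    # Work on precomposed letters so that WORD sees whole words.
    line = unicodedata.normalize("NFC", line)
    # Rule 1 (assumption): everything inside an italic span is Sanskrit.
    line = ITALIC.sub(lambda m: "{%" + convert_sh(m.group(1)) + "%}", line)
    # Rule 2 (assumption): outside italics, only words with a diacritic qualify.
    def fix_word(m):
        word = m.group(0)
        return convert_sh(word) if has_diacritic(word) else word
    return WORD.sub(fix_word, line)

# Made-up sample: 'should' must survive, the other two words are converted.
print(convert_line("should {%Bhishma%} Kṛshṇa"))
```

A word with neither italics nor a diacritic is left alone by this sketch, which is exactly the class of missed conversions mentioned above.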
There are a few (40) cases where a vowel (with or without macron) also has a breve diacritic.
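The breve cases can be located mechanically. The sketch below is one possible way to do it, assuming the text is Unicode and the breve appears either as a precomposed letter or as the combining character U+0306; the function name and the sample string are hypothetical.

```python
import unicodedata

BREVE = "\u0306"    # combining breve
MACRON = "\u0304"   # combining macron
VOWELS = set("aeiouAEIOU")

def breve_vowels(text):
    """Return (base_vowel, has_macron) for every vowel that carries a breve."""
    found = []
    base, marks = None, []
    # The trailing space is a sentinel that flushes the last cluster.
    for ch in unicodedata.normalize("NFD", text) + " ":
        if unicodedata.combining(ch):
            marks.append(ch)
            continue
        if base in VOWELS and BREVE in marks:
            found.append((base, MACRON in marks))
        base, marks = ch, []
    return found

# Made-up sample: one vowel with macron and breve, one with breve only.
print(breve_vowels("Bh\u0101\u0306rata k\u0103"))
```

Running this over each line of inm.txt and counting the results would give a figure to compare against the 40 cases noted above.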
The digitization includes not only the main section of entries, but apparently all of the text. There are the following sections:
; TITLE
; FOREWARD
; PREFACE
; ABBREVIATIONS
; CONCORDANCE (33 pages)
; ENTRIES about 13000 headwords
; ADDITIONS AND CORRECTIONS (18 pages)
; POSTSCRIPT
Since all of the non-entry sections are digitized (part of inm.txt), it would be feasible to include them in the Front matter section.
There are digitized sections on abbreviations in the preface. These could provide the basis for <ab> markup that would facilitate tooltips for users.
There are at least two possible sources of additional headwords.
<div n="HI">
This markup appears 22 times within entries. For instance, under the headword DanadA:
<L>3353<pc>240-1<k1>DanadA<k2>DanadA
{@Dhanadā,@}¦ a mātṛ. § 615{%u%} (Skanda): IX, {@46<lang n="greek"></lang>,@} 2631.
<div n="HI">{@Dhanadeśvara, Dhanādhigoptṛ, Dhanādhipa,@}
<div n="lb">{@Dhanādhipati@}¦ = Kubera, q.v.
<LEND>
It appears that this is a typographically abbreviated form of four headwords. If these were recoded somehow as separate entries, then about 80-100 additional headwords would be added.
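As a rough sketch of how such grouped headwords might be pulled out for recoding, the function below collects the bold `{@...@}` spellings from a `<div n="HI">` line and its `<div n="lb">` continuation lines, stopping at the `¦` separator. It is an illustration under those assumptions, not the recoding procedure itself, and `hi_headwords` is a hypothetical name.

```python
import re

BOLD = re.compile(r"\{@(.*?)@\}")

def hi_headwords(entry_lines):
    """Return the spellings grouped under <div n="HI"> blocks of one entry."""
    names = []
    in_group = False
    for line in entry_lines:
        if line.startswith('<div n="HI">'):
            in_group = True
        elif not line.startswith('<div n="lb">'):
            in_group = False
        if not in_group:
            continue
        # Only the text before the broken bar belongs to the headword list.
        head, sep, _ = line.partition("¦")
        for span in BOLD.findall(head):
            names.extend(p.strip() for p in span.split(",") if p.strip())
        if sep:
            in_group = False
    return names

entry = [
    "<L>3353<pc>240-1<k1>DanadA<k2>DanadA",
    '{@Dhanadā,@}¦ a mātṛ. § 615{%u%} (Skanda): IX, {@46<lang n="greek"></lang>,@} 2631.',
    '<div n="HI">{@Dhanadeśvara, Dhanādhigoptṛ, Dhanādhipa,@}',
    '<div n="lb">{@Dhanādhipati@}¦ = Kubera, q.v.',
    "<LEND>",
]
print(hi_headwords(entry))
```

The spellings come out in the diacritic form of the text; producing k1/k2 keys such as DanadA would need a separate transliteration step, which is not shown here.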
In the additions and corrections section, the first, shorter part pertains to the Concordance, and the second, longer part pertains to the index (i.e. to what we have coded as headwords). The formatting of this second part would make it possible to add as new headwords all the entries, whether additions or corrections. There are about 950 such entry-like sections.
aBiBU original entry:
aBiBU entry correction
aBiprAya -- does not appear as headword in main index, but does appear in the additions and corrections:
<div n="X">
This markup can have X as
<F>
Indicates footnotes; about 30 instances. Recoded in the style adopted with KRM (#200).
<sup>
This is used for superscript text. General functions are:
<lang n="greek"></lang>
Many (9600) instances. However, at least some of these are one or two letters, used for some kind of indexing, rather than Greek words; here are two examples from the first page.
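One way to get a feel for how many of these tags hold short index letters rather than Greek words is to tally the tags by the length of their content. This is only a sketch; the function name is hypothetical, and it assumes each tag opens and closes on the same line.

```python
import re
from collections import Counter

GREEK = re.compile(r'<lang n="greek">(.*?)</lang>')

def greek_length_tally(lines):
    """Count <lang n="greek"> tags by the number of characters they contain."""
    tally = Counter()
    for line in lines:
        for content in GREEK.findall(line):
            tally[len(content.strip())] += 1
    return tally

# The first line is from the DanadA entry; the second is a made-up example.
sample = [
    'IX, {@46<lang n="greek"></lang>,@} 2631.',
    '<lang n="greek">αβ</lang>',
]
print(greek_length_tally(sample))
```

Tags with a length of 0, 1 or 2 would be the candidates for index letters; longer content is more likely to be an actual Greek word.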
<C n="N">
This markup occurs in 200 lines, and indicates columns in a complex tabular arrangement of text, such as genealogical relationships. For instance:
We currently have only a crude representation of this:
The converted form has now been installed at Cologne.
This issue is for comments regarding the conversion of the Cologne digitization inm.txt of the work Index to the Names in the Mahabharata.