sanskrit-lexicon / INM

Index to names in the Mahabharata

INM widely spaced text #5

Closed (funderburkjim closed 2 years ago)

funderburkjim commented 2 years ago

This topic was mentioned in the comments on spaced text.

In PW and PWG, I changed the markup {|X|} to <is>X</is>. In the HTML displays, this markup is rendered as <span style='letter-spacing:2px;'>X</span> (see basicdisplay.php; search for "is", currently at line 524).
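As a rough illustration of that display step, here is a minimal Python sketch of the same substitution. It is not the code in basicdisplay.php; the function name `render_spaced` and the regular expression are assumptions made for this example only.

```python
import re

# Hypothetical pattern: matches <is>X</is> on a single line.
# The real conversion is done in basicdisplay.php, not here.
IS_TAG = re.compile(r"<is>(.*?)</is>")


def render_spaced(text: str) -> str:
    """Replace each <is>X</is> with the letter-spaced span used in the displays."""
    return IS_TAG.sub(r"<span style='letter-spacing:2px;'>\1</span>", text)


if __name__ == "__main__":
    print(render_spaced("see <is>X</is> for details"))
```

The non-greedy `(.*?)` keeps two tagged words on one line from being merged into a single span.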

According to the notes for PW, PWK in the meta2 file, this <is> markup was used for text that represents IAST spellings of Sanskrit words and that also appears in text with extra space between characters.

There is also a variant markup of the form <is n="1">X</is>, which was introduced by me for non-Sanskrit words. For example, Prākrit in PW under anyagata.

Thus, if we restore the lost {|X|}, we should use the <is>X</is> and <is n="1">X</is> forms.

According to all_xmltags.txt, no other dictionaries use the <is> tag, so it is 'safe' to use it in INM for the same purpose it serves in PW and PWG.
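The restoration proposed above could be sketched as follows. This is only an illustrative Python sketch, not the script actually used: the function name `restore_spaced` and the `non_sanskrit` flag are hypothetical, and markup that spans a line break is deliberately left untouched because its handling is not described here.

```python
import re

# Hypothetical pattern for the legacy {|X|} markup, confined to one line.
LEGACY_SPACED = re.compile(r"\{\|(.*?)\|\}")


def restore_spaced(text: str, non_sanskrit: bool = False) -> str:
    """Convert {|X|} to <is>X</is>, or to <is n="1">X</is> for non-Sanskrit words."""
    open_tag = '<is n="1">' if non_sanskrit else "<is>"
    # '.' does not match a newline, so markup split across lines is not converted.
    return LEGACY_SPACED.sub(lambda m: f"{open_tag}{m.group(1)}</is>", text)


if __name__ == "__main__":
    print(restore_spaced("compare {|X|} here"))
    print(restore_spaced("compare {|X|} here", non_sanskrit=True))
```

Choosing between the two forms requires knowing whether the word is Sanskrit, which this sketch leaves to the caller.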

funderburkjim commented 2 years ago

The extract of these instances from an old INM version, mentioned in this comment, will be useful as a guide to where markup should be added. Note that this old version still has the AS (letter-number) representation of IAST.

I presume from comments in that issue that others think this 'spaced' markup should be restored in inm.txt ?

gasyoun commented 2 years ago

I presume from comments in that issue that others think this 'spaced' markup should be restored in inm.txt ?

Yes, restored.

Andhrabharati commented 2 years ago

Please restore these, @funderburkjim.

funderburkjim commented 2 years ago

This markup is now restored in csl-orig/v02/inm.txt. The work steps are in the spacedmarkup directory.

The displays for inm now show the spaced text, for example AdivaMSAvatAraRa.

Note 1: This markup has not been added to the revision of Andhrabharati's consolidated version (e.g., to the slp1 version inm_slp1_L2_02.txt or the equivalent devanagari version in inm_deva_L2_02.zip; see issue 4 comment).

Note 2: I decided not to use the <is n="1">X</is> variant mentioned in the comment above, so all markup is of the form <is>X</is>.

Note 3: An arcane minor point of interest is how the markup was handled when the text to be marked occurred at a line break. See the readme in spacedmarkup for elaboration.