sanskrit-lexicon / COLOGNE

Development of http://www.sanskrit-lexicon.uni-koeln.de/

jihvamuliya and upadhmaniya coding and display #59

Open funderburkjim opened 9 years ago

funderburkjim commented 9 years ago

In a few dictionaries, Sampada has found that data is missing where two visarga variants, jihvāmūlīya and upadhmānīya, are referenced. See, for example, under SikzA in SKD.

This note records the oddity, shows how these characters are coded and displayed, and lists some TODO items.

A reference for this is LIES (Linguistic Issues in Encoding Sanskrit) by Peter Scharf and Malcolm Hyman, downloadable from the publications section of Peter's web site at http://sanskritlibrary.org/tomcat/sl/-/pub/.

Another reference is http://en.wikipedia.org/wiki/Vedic_Sanskrit. Also, see http://unicode.org/charts/PDF/U1CD0.pdf.

Here's the coding:

| variant | SLP1 | IAST | Devanagari | Devanagari Used |
|---|---|---|---|---|
| jihvāmūlīya | Z | ẖ (\u1e96) | VEDIC SIGN JIHVAMULIYA (U+1CF5) | 'VEDIC SIGN ARDHAVISARGA' (U+1CF2) |
| upadhmānīya | V | ḫ (\u1e2b) | VEDIC SIGN UPADHMANIYA (U+1CF6) | 'VEDIC SIGN ARDHAVISARGA' (U+1CF2) |

The Devanagari is not shown, since only some fonts (siddhanta and prajna) will display any of these. The reason for the second Devanagari column is that the U+1CF2 Unicode character displays well with siddhanta and, according to Peter, is often used to display both variants. Here is an image of how it looks, which Peter describes as a pair of candras, one smiling above one frowning. [image]
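The coding in the table can be checked against Python's bundled Unicode database. This is only a minimal sketch: the names `VISARGA_VARIANTS`, `ARDHAVISARGA`, and `describe` are made up for illustration and are not part of the Cologne transcoder.

```python
import unicodedata

# Mapping copied from the table: SLP1 letter -> (IAST, dedicated Devanagari sign).
# The dictionary and function names are hypothetical, for illustration only.
VISARGA_VARIANTS = {
    "Z": ("\u1e96", "\u1cf5"),  # jihvamuliya
    "V": ("\u1e2b", "\u1cf6"),  # upadhmaniya
}

# The single sign listed in the "Devanagari Used" column for both variants.
ARDHAVISARGA = "\u1cf2"


def describe(slp1_letter):
    """Return the official Unicode names of the characters used for one SLP1 letter."""
    iast, deva = VISARGA_VARIANTS[slp1_letter]
    return {
        "iast": unicodedata.name(iast),
        "devanagari": unicodedata.name(deva),
        "devanagari_used": unicodedata.name(ARDHAVISARGA),
    }


if __name__ == "__main__":
    for letter in VISARGA_VARIANTS:
        print(letter, describe(letter))
```

Running it prints the Unicode character names, which makes it easy to confirm that the code points in the table are the intended ones.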

TODO:

  1. Make sure the transcoder files slp1_deva.xml and slp1_roman.xml code the display of Z and V as described above.
  2. Change the displays for the dictionaries where these Z and V occur to use siddhanta web font for Devanagari output, so the characters will be displayed as well as possible.

gasyoun commented 9 years ago

And it's totally left out in SLP1 as now, or what? In the current Cologne implementation.

funderburkjim commented 9 years ago

dictionaries and words with Z,V

| Dictionary | word | line in X.txt |
|---|---|---|
| AP | aGoza | 1762 |
| AP | ayoga | 19395 |
| AP | arDa | 20956 |
| AP90 | aGoza | 3633 |
| AP90 | ayoga | 27815 |
| MW72 | arDa | 19829 |
| MW72 | candra | 72006 |
| SKD | SikzA | 411103, 411246 |

funderburkjim commented 9 years ago

An alternate IAST Unicode coding of Z and V (not currently used in Cologne displays)

| slp1 | code point | character |
|---|---|---|
| Z | h U+0331 | h with COMBINING MACRON BELOW |
| V | h U+032C | h with COMBINING CARON BELOW |
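A small sketch of how this alternate coding behaves under Unicode normalization, which matters if text is ever compared or searched after being normalized. The constant and function names are hypothetical. The behaviour described relies on the canonical decompositions in the Unicode database as I understand them, so it is worth running before depending on it.

```python
import unicodedata

# Alternate codings from the table above (names are illustrative only).
ALT_Z = "h\u0331"  # h + COMBINING MACRON BELOW
ALT_V = "h\u032c"  # h + COMBINING CARON BELOW


def nfc(text):
    """Compose base letters and combining marks where Unicode defines a composition."""
    return unicodedata.normalize("NFC", text)


def code_points(text):
    """List the code points of a string in U+XXXX notation."""
    return ["U+%04X" % ord(ch) for ch in text]


if __name__ == "__main__":
    for label, value in (("Z", ALT_Z), ("V", ALT_V)):
        print(label, code_points(value), "->", code_points(nfc(value)))
```

Under NFC the Z sequence composes to U+1E96, the same code point used in the primary IAST coding, so for Z the two codings are canonically equivalent. The V sequence stays as two code points, because the precomposed U+1E2B decomposes to h plus a breve below (U+032E), not a caron below, so the alternate V coding is a distinct character sequence.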
funderburkjim commented 9 years ago

re 'And it's totally left out in SLP1 as now, or what? In the current Cologne implementation.'

  1. The transcoder files slp1_roman.xml and slp1_deva.xml for the AP, AP90, MW72, and SKD dictionaries contain the correct codings for Z and V as described above.
  2. The Roman Unicode displays show the IAST Unicode for Z and V correctly.
  3. The Devanagari Unicode displays typically don't properly display the VEDIC SIGN ARDHAVISARGA.

To remedy 3 requires use of an adequate web font (Siddhanta) for Devanagari. For a display to use a web font, the CSS file used by the display requires certain coding, and the Javascript of the display requires changes. Experience suggests that these changes sometimes have side effects that require other changes.

So, currently, I am choosing to defer making the Devanagari look right for the ardhavisarga character. Since the characters Z,V occur so rarely (only in the words listed above), this seems acceptable.

gasyoun commented 9 years ago

Sure, 8 words is not a big list, still we must document this issue. If documented, not that bad at all.

drdhaval2785 commented 3 years ago

I also saw some occurrences in vcp.txt. Is transcoder handling the jihvAmUlIya and upadhmAnIya properly @funderburkjim ?

Any transliteration schemes leading to lossy conversion?

funderburkjim commented 3 years ago

For Devanagari, transcoder handling properly, I think.

Doubt it for other transliterations, such as IAST (slp1_roman.xml), etc.
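One way to approach the lossy-conversion question is to check whether a mapping sends two different SLP1 letters to the same output character. The sketch below covers only the Z and V codings listed earlier in this thread, not the actual contents of slp1_deva.xml or slp1_roman.xml, which are not shown here. All names in it are hypothetical.

```python
# Two-letter fragments of the mappings described in this thread (illustrative only,
# not read from the real transcoder XML files).
SLP1_TO_IAST = {"Z": "\u1e96", "V": "\u1e2b"}
SLP1_TO_DEVA_DISTINCT = {"Z": "\u1cf5", "V": "\u1cf6"}
SLP1_TO_DEVA_USED = {"Z": "\u1cf2", "V": "\u1cf2"}  # ardhavisarga for both


def collisions(mapping):
    """Return each target character that is produced by more than one source letter."""
    by_target = {}
    for source, target in mapping.items():
        by_target.setdefault(target, []).append(source)
    return {t: srcs for t, srcs in by_target.items() if len(srcs) > 1}


def is_lossless(mapping):
    """A character mapping can be reversed only if no two sources share a target."""
    return not collisions(mapping)


if __name__ == "__main__":
    for name, mapping in (
        ("IAST", SLP1_TO_IAST),
        ("Devanagari (distinct signs)", SLP1_TO_DEVA_DISTINCT),
        ("Devanagari (ardhavisarga)", SLP1_TO_DEVA_USED),
    ):
        print(name, "lossless" if is_lossless(mapping) else collisions(mapping))
```

On these fragments, the IAST and distinct-sign Devanagari mappings are reversible. The ardhavisarga display coding is not, since both Z and V become U+1CF2 and cannot be told apart when converting back to SLP1.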