jihvamuliya and upadhmaniya coding and display

funderburkjim commented 9 years ago

In a few dictionaries, Sampada has found that data was missing where the two visarga variants, jihvāmūlīya and upadhmānīya, are referenced. See, for example, under SikzA in SKD.

This note is just to mention this oddity, and to indicate how these characters are coded, and how they are displayed. Also, to note some TODO items.

A reference for this is LIES (Linguistic Issues in Encoding Sanskrit) by Peter Scharf and Malcolm Hyman, downloadable from the publications section of Peter's web site at http://sanskritlibrary.org/tomcat/sl/-/pub/.

Another reference is http://en.wikipedia.org/wiki/Vedic_Sanskrit. Also, see http://unicode.org/charts/PDF/U1CD0.pdf.

Here's the coding:

variant	SLP1	IAST	Devanagari (dedicated code point)	Devanagari (code point used)
jihvāmūlīya	Z	ẖ (U+1E96)	VEDIC SIGN JIHVAMULIYA (U+1CF5)	VEDIC SIGN ARDHAVISARGA (U+1CF2)
upadhmānīya	V	ḫ (U+1E2B)	VEDIC SIGN UPADHMANIYA (U+1CF6)	VEDIC SIGN ARDHAVISARGA (U+1CF2)

The Devanagari glyphs are not shown here, since only some fonts (siddhanta and prajna) display any of these characters. The reason for the 2nd Devanagari column is that U+1CF2 displays well with siddhanta and, according to Peter, is often used to display both variants. Peter describes its glyph as a pair of candras, one smiling above one frowning.
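As a quick way to experiment with the coding in the table above, here is a minimal Python sketch. The names `VARIANTS` and `render_variants` are made up for illustration and are not part of the Cologne transcoder; the function touches only Z and V and passes every other character through unchanged, so it is not a full SLP1 transcoder.

```python
# Coding of the two visarga variants, copied from the table above.
# VARIANTS and render_variants are illustrative names, not Cologne code.
VARIANTS = {
    "Z": {  # jihvamuliya
        "iast": "\u1e96",       # LATIN SMALL LETTER H WITH LINE BELOW
        "deva": "\u1cf5",       # VEDIC SIGN JIHVAMULIYA
        "deva_used": "\u1cf2",  # VEDIC SIGN ARDHAVISARGA
    },
    "V": {  # upadhmaniya
        "iast": "\u1e2b",       # LATIN SMALL LETTER H WITH BREVE BELOW
        "deva": "\u1cf6",       # VEDIC SIGN UPADHMANIYA
        "deva_used": "\u1cf2",  # VEDIC SIGN ARDHAVISARGA
    },
}


def render_variants(slp1: str, target: str = "iast") -> str:
    """Replace Z and V with the chosen coding; leave all other characters as-is."""
    return "".join(
        VARIANTS[ch][target] if ch in VARIANTS else ch
        for ch in slp1
    )
```

For example, `render_variants("Z", "iast")` gives the single character U+1E96, while `render_variants("V", "deva_used")` gives U+1CF2.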

TODO:

Make sure the transcoder files slp1_deva.xml and slp1_roman.xml code the display of Z and V as described above.
Change the displays for the dictionaries where these Z and V occur to use siddhanta web font for Devanagari output, so the characters will be displayed as well as possible.

gasyoun commented 9 years ago

And it's totally left out in SLP1 as now, or what? In the current Cologne implementation.

funderburkjim commented 9 years ago

dictionaries and words with Z,V

Dictionary	word	line in X.txt
AP	aGoza	1762
AP	ayoga	19395
AP	arDa	20956
AP90	aGoza	3633
AP90	ayoga	27815
MW72	arDa	19829
MW72	candra	72006
SKD	SikzA	411103,411246
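A list like the one above could be regenerated with a small scan. The sketch below is only an illustration: it assumes that Sanskrit text in the digitization is SLP1 wrapped in `<s>...</s>`, which may not hold for every dictionary, so the pattern is a parameter. The function name `find_variant_lines` is hypothetical.

```python
import re
from typing import Iterable, Iterator, Pattern, Tuple

# Assumption: Sanskrit text is SLP1 inside <s>...</s>. Some digitizations
# may mark Sanskrit differently, so pass another pattern if needed.
SANSKRIT_SPAN = re.compile(r"<s>(.*?)</s>")


def find_variant_lines(
    lines: Iterable[str], span: Pattern[str] = SANSKRIT_SPAN
) -> Iterator[Tuple[int, str]]:
    """Yield (line number, Sanskrit span) for spans containing Z or V.

    Only text matched by `span` is checked, so capital Z or V in
    English text on the same line is ignored.
    """
    for lineno, line in enumerate(lines, start=1):
        for match in span.finditer(line):
            text = match.group(1)
            if "Z" in text or "V" in text:
                yield lineno, text
```

To run it over a file, open the file with `encoding="utf-8"` and pass the file object as `lines`; line numbers are 1-based, matching the table.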

funderburkjim commented 9 years ago

An alternate IAST Unicode coding of Z and V (not currently used in Cologne displays)

slp1	code point	character
Z	h U+0331	h with COMBINING MACRON BELOW
V	h U+032C	h with COMBINING CARON BELOW
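One point worth checking about the alternate coding is how it relates to the precomposed characters used in the displays. The sketch below uses only Python's standard `unicodedata` module; the names `PRECOMPOSED`, `ALTERNATE`, and `compare` are illustrative. According to the Unicode data, h + U+0331 is canonically equivalent to U+1E96, whereas h + U+032C is not equivalent to U+1E2B, whose decomposition uses U+032E (COMBINING BREVE BELOW).

```python
import unicodedata

PRECOMPOSED = {"Z": "\u1e96", "V": "\u1e2b"}  # coding used in the displays
ALTERNATE = {"Z": "h\u0331", "V": "h\u032c"}  # alternate coding from the table


def compare(slp1: str) -> dict:
    """Compare the precomposed and alternate IAST codings for Z or V."""
    pre, alt = PRECOMPOSED[slp1], ALTERNATE[slp1]
    return {
        # Unicode name of the combining mark in the alternate coding
        "alternate_mark": unicodedata.name(alt[1]),
        # Code points of the precomposed character after canonical decomposition
        "precomposed_decomposition": [
            f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", pre)
        ],
        # True if both codings normalize to the same string under NFC
        "canonically_equivalent": (
            unicodedata.normalize("NFC", alt) == unicodedata.normalize("NFC", pre)
        ),
    }
```

So a search or comparison that normalizes to NFC would treat the two Z codings as the same text, but would treat the two V codings as different.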

funderburkjim commented 9 years ago

re 'And it's totally left out in SLP1 as now, or what? In the current Cologne implementation.'

1. The transcoder files slp1_roman.xml and slp1_deva.xml for the AP, AP90, MW72, and SKD dictionaries contain the correct codings for Z, V as described above.
2. The Roman Unicode displays correctly display the Z, V Unicode for IAST.
3. The Devanagari Unicode displays typically don't properly display the VEDIC SIGN ARDHAVISARGA.

To remedy 3 requires use of an adequate web font (Siddhanta) for Devanagari. For a display to use a web font, the CSS file used by the display requires certain coding, and the Javascript of the display requires changes. Experience suggests that these changes sometimes have side effects that require other changes.

So, currently, I am choosing to defer making the Devanagari look right for the ardhavisarga character. Since the characters Z,V occur so rarely (only in the words listed above), this seems acceptable.

gasyoun commented 9 years ago

Sure, 8 words is not a big list, still we must document this issue. If documented, not that bad at all.

drdhaval2785 commented 3 years ago

I also saw some occurrences in vcp.txt. Is transcoder handling the jihvAmUlIya and upadhmAnIya properly @funderburkjim ?

Any transliteration schemes leading to lossy conversion?

funderburkjim commented 3 years ago

For Devanagari, transcoder handling properly, I think.

Doubt it for other transliterations, such as IAST (slp1_roman.xml), etc.
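On the question of lossy conversion, one simple check is whether a character mapping can be inverted: it can only if no two inputs share an output. The sketch below applies that check to the codings listed earlier in this thread. It looks only at those tables, not at the actual transcoder XML files, and the names (`is_lossless`, `DEVA_DEDICATED`, and so on) are illustrative.

```python
def is_lossless(mapping: dict) -> bool:
    """Return True if no two keys map to the same output, so the mapping can be inverted."""
    outputs = list(mapping.values())
    return len(outputs) == len(set(outputs))


# Codings for Z (jihvamuliya) and V (upadhmaniya) from the tables in this thread.
DEVA_DEDICATED = {"Z": "\u1cf5", "V": "\u1cf6"}     # dedicated Vedic signs
DEVA_ARDHAVISARGA = {"Z": "\u1cf2", "V": "\u1cf2"}  # both shown as ardhavisarga
IAST = {"Z": "\u1e96", "V": "\u1e2b"}               # h with line / breve below

results = {
    "deva_dedicated": is_lossless(DEVA_DEDICATED),
    "deva_ardhavisarga": is_lossless(DEVA_ARDHAVISARGA),
    "iast": is_lossless(IAST),
}
```

Under these tables, only the ardhavisarga coding is lossy: Devanagari output that uses U+1CF2 for both variants cannot be converted back to distinct Z and V.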

sanskrit-lexicon / COLOGNE

jihvamuliya and upadhmaniya coding and display #59