Rare Accent Coding Issues (ढ्य१ः॒᳠)

gasyoun commented 9 years ago

In http://www.sanskrit-lexicon.uni-koeln.de/scans/PWGScan/2013/web/webtc/indexcaller.php, under aMhati in PWG, the display does not match the scan: as per http://www.sanskrit-lexicon.uni-koeln.de/scans/PWGScan/2013/web/webtc/servepdf.php?page=1-0005 the accent should be above १. I wonder whether this is possible at all. Any clue?

funderburkjim commented 9 years ago

The scan does look like accents pertain to '1', not to visarga. There are actually two accents, svarita and anudatta.

Here's the coding in pwg.txt and pwg.xml at that point:

dU¸Dhya1Hªª¸    (ªª = svarita and ¸ = anudatta; that's not the usual comma for anudatta)

and in pw.xml, which uses slp1:

dU\Qya1H^\    (^ = svarita, \ = anudatta).

Plausibly, a correction should be entered for this (to put accents after '1' rather than 'H')

http://www.sanskrit-lexicon.uni-koeln.de/scans/PWGScan/2013/web/webtc1/help/accents.html explains the correspondence between the visual representation of accents in (a) the Unicode of the display and (b) the PWG scans. I find it confusing that the Unicode udatta looks like the scan svarita.

drdhaval2785 commented 9 years ago

Leave udAtta and svarita as of now. They have been confusing for ages. But putting an accent after H is severely odd. "Plausibly, a correction should be entered for this (to put accents after '1' rather than 'H')" is what I would do, and then close the issue :)

gasyoun commented 9 years ago

First it has to be done. @funderburkjim will not get there anytime soon, I guess. :camel:

drdhaval2785 commented 3 years ago

mA na^H samasya dU\Qya1^H\ pari^dvezaso aMha\tiH in pwg.txt. So the accent has moved after 1 and before H. Closing this hairsplitting issue.

Andhrabharati commented 2 years ago

It's very surprising to see the svara characters being wrongly rendered throughout the PWG text display at Cologne.

I am using the same example screenshot (given by gasyoun) at the top of this issue-

The book has the udAtta symbol [ ꣫] (1. अंहति꣫ and 2. अं꣫हति) at the HWs, and the regular udAtta sign [ ॑ ] throughout the text.

See the displayed string cited at the start of this aMhati word

  यू॒यम॒स्मान्न᳠यत॒ वस्यो॒ अच्छा॒ निर᳠ंह॒तिभ्यो᳠ मरुतो गृणा॒नाः ,

as against the actual Ṛgveda text

  यू॒यम॒स्मान्न॑यत॒ वस्यो॒ अच्छा॒ निरं॑ह॒तिभ्यो॑ मरुतो गृणा॒नाः। जु॒षध्वं॑ नो ह॒व्यदा॑तिं यजत्रा व॒यं स्या॑म॒ पत॑यो रयी॒णाम्॥

[This actual text string is taken from @gasyoun's contribution of rvlinks.]

Wonder how the Ṛgveda udAtta [ ॑ ] has become the Ṛgveda svarita [ ᳠ ] in here!!

[Marcis says he looks at (Cologne) PWG almost daily and he seemingly being "fond of" Ṛgveda, has never felt this "odd/strange".]

gasyoun commented 2 years ago

The answer might be in order of elements. Siddhanta font should not be the reason. I do not know how to fix it.

Andhrabharati commented 2 years ago

I do not know how to fix it.

@gasyoun

Normally the Vedic udātta is not marked in any text; and it’s the creative mind of Böhtlingk that got the superscript उ [ ꣫ ] to mark the same in his works.

One must give due respect to his invention and try to replicate the same to the extent possible. [And this is just what I did in my file. The character is the combining devanagari letter u (U+A8EB).]

0951 [ ◌॑ ] DEVANAGARI STRESS SIGN UDATTA = Vedic tone svarita to be used instead of 1CE0 [ ◌᳠ ] VEDIC TONE RIGVEDIC KASHMIRI INDEPENDENT SVARITA.

[As Dhaval has commented somewhere, there seems to be a confusion between Devanagari udātta sign (used for Vedic svarita accent) and Vedic udātta accent; both are quite different.]

It is as simple as this, to fix the matter!!

Hope @funderburkjim and @drdhaval2785 agree to this. [Sorry that I am posting in a closed issue, but this looked to be the appropriate place, instead of opening another issue.]

Andhrabharati commented 2 years ago

Felix Rau has described the matter (Böhtlingk's use of Vedic udātta) in his Vedic Accent and Lexicography (as a part of University of Cologne – Lazarus Project).

Andhrabharati commented 2 years ago

This document has a quick and concise summary about Vedic accent marks. 07396-vaidika.pdf

I would suggest that @funderburkjim and @gasyoun spend a little time (<10 min) to go through the document to know the details. [No one knows which information would be needed in future.]

funderburkjim commented 2 years ago

A review

Currently, all text (in the Cologne digitization) intended to be rendered in Devanagari is represented in SLP1 transliteration. An accented letter is represented by one of three symbols following the letter:

/ : udAtta
\ : anudAtta
^ : svarita

Note there are no special symbols for, e.g., Vedic udAtta, or Böhtlingk superscript 'u' udAtta, etc. These and the many other accent variations may or may not have representation in SLP1 (or in other transliteration schemes) as described in Peter Scharf's LIES text or in this document on accents.

As to the rendering into Devanagari of accents in the Cologne sanskrit-lexicon displays, this is governed by just one transcoding file, which is the same

- for all dictionaries, whether PWG or not
- for all text,
  - whether from Vedic sources or not,
  - whether headword text or not

This rendering into Devanagari is governed by one parameter file: slp1_deva.xml.

The lines of slp1_deva.xml pertaining to accents are:

<e> <s>INIT,SKT</s> <in>-</in> <out>-</out> <next>INIT</next></e>
<!-- anudAtta accent: 0952 -->
<e n='122'> <s>INIT,SKT</s> <in>\</in> <out>\u0952</out>  <cl>accent</cl></e>
<!-- udAtta accent : u0951 -->
<e n='123'> <s>INIT,SKT</s> <in>/</in> <out>\u0951</out>  <cl>accent</cl></e>
<!-- u1ce0 = vedic tone rigvedic kashmiri independent svarita.-->
<e n='124'> <s>INIT,SKT</s> <in>^</in> <out>\u1ce0</out> <cl>atom</cl> </e>

<!-- candrabindu, anunasika -->
<e> <s>INIT,SKT</s> <in>~</in> <out>\u0901</out> <next>INIT</next></e>
<!-- OM -->
<e> <s>INIT,SKT</s> <in>o~</in> <out>\u0950</out> <next>INIT</next> </e>
<!-- M~ destroys invertability, so dropped.
<e> <s>INIT,SKT</s> <in>M~</in> <out>\u0901</out> <next>INIT</next></e>
-->
<!-- 
< Z =  jihvamuliya = h with line below
< V =  upadhmaniya = h with breve below
-->
<e> <s>INIT,SKT</s> <in>Z</in> <out>\u1cf2</out> <next>INIT</next></e>
<e> <s>INIT,SKT</s> <in>V</in> <out>\u1cf2</out> <next>INIT</next></e>

<!-- Accents with visarga and anusvara
Such instances occur in current digitization of PWG.
Should this be coded (in slp1) as \H, \M or H\, M\.
The current display code (as of 01-08-2021) assumes 'accent after', e.g. H\, M\
(refer csl-websanlexicon/v02/makotemplates/web/utilities/transcoder/slp1_deva.xml)
We follow that here.
-->
<e n='122a'> <s>INIT,SKT</s> <in>H\</in> <out>\u0903\u0952</out>  <cl>accent</cl></e>
<e n='122b'> <s>INIT,SKT</s> <in>M\</in> <out>\u0902\u0952</out>  <cl>accent</cl></e>
<e n='123a'> <s>INIT,SKT</s> <in>H/</in> <out>\u0903\u0951</out>  <cl>accent</cl></e>
<e n='123b'> <s>INIT,SKT</s> <in>M/</in> <out>\u0902\u0951</out>  <cl>accent</cl></e>
<e n='124a'> <s>INIT,SKT</s> <in>H^</in> <out>\u0903\u1ce0</out> <cl>atom</cl> </e>
<e n='124b'> <s>INIT,SKT</s> <in>M^</in> <out>\u0902\u1ce0</out> <cl>atom</cl> </e>
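
For readers who want to see how these rules behave, here is a minimal Python sketch. It assumes the transcoder prefers the longest matching input (so `H\` wins over `\` alone); that strategy, and the names `ACCENT_RULES` and `transcode_accents`, are illustrative assumptions rather than the actual Cologne transcoder. Only the accent-related rules are modelled; all other SLP1 characters pass through unchanged.

```python
# Accent-related rules copied from the slp1_deva.xml excerpt above.
# The dict name, function name and longest-match strategy are assumptions.
ACCENT_RULES = {
    "\\": "\u0952",          # anudAtta
    "/": "\u0951",           # udAtta
    "^": "\u1ce0",           # svarita
    "H\\": "\u0903\u0952",   # visarga + anudAtta
    "M\\": "\u0902\u0952",   # anusvara + anudAtta
    "H/": "\u0903\u0951",
    "M/": "\u0902\u0951",
    "H^": "\u0903\u1ce0",
    "M^": "\u0902\u1ce0",
}


def transcode_accents(slp1: str) -> str:
    """Replace accent symbols (and accented H/M) with Devanagari code points."""
    out = []
    i = 0
    while i < len(slp1):
        pair = slp1[i:i + 2]
        if len(pair) == 2 and pair in ACCENT_RULES:
            out.append(ACCENT_RULES[pair])       # two-character rule wins
            i += 2
        elif slp1[i] in ACCENT_RULES:
            out.append(ACCENT_RULES[slp1[i]])    # single accent symbol
            i += 1
        else:
            out.append(slp1[i])                  # non-accent SLP1 left untouched
            i += 1
    return "".join(out)


print([hex(ord(c)) for c in transcode_accents("aH\\")])
```

Note that with 'accent before' spellings such as `a\H`, only the single-symbol rule fires and the `H` is left for the ordinary letter rules.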

funderburkjim commented 2 years ago

So, we could change u1ce0 to u0951 in slp1_deva.xml.

Then all text marked with a svarita accent would be rendered with u0951.

But then all text marked with an udAtta accent would also be rendered with u0951.

We would lose the 'invertibility' property, but maybe invertibility doesn't matter for the display.
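
The loss of invertibility can be checked mechanically. The sketch below is only an illustration (the helper name `find_collisions` and the two dicts are hypothetical): it reports any output code point produced by more than one SLP1 symbol.

```python
from collections import defaultdict


def find_collisions(rules):
    """Return outputs that are produced by more than one input symbol."""
    by_output = defaultdict(list)
    for symbol, output in rules.items():
        by_output[output].append(symbol)
    return {out: syms for out, syms in by_output.items() if len(syms) > 1}


# Accent rules as in slp1_deva.xml today.
current = {"\\": "\u0952", "/": "\u0951", "^": "\u1ce0"}
# Hypothetical variant with svarita re-pointed to U+0951.
proposed = {**current, "^": "\u0951"}

print(find_collisions(current))    # empty: every output has one source
print(find_collisions(proposed))   # '/' and '^' now share U+0951
```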

Also, other dictionaries (such as CAE) with text marked with a svarita accent would have their accented Devanagari display altered similarly.

Also, this one change would have no impact on the display of udAtta accent in PWG headwords.

It seems to me that the situation with accent-representation is similar to the case with Latin diacritics; namely, there are many different schemes that appear in published works, whether dictionaries or not. At least the current scheme used in the Cologne Sanskrit lexicon is consistent and unambiguous.

But, speaking personally, Devanagari accents have basically no utility to me. If I still have not mastered the vocabulary of even Hitopadesha, why worry about Devanagari or Vedic accents?

But I realize others may not view accents this way, and am willing to make changes to the display details to accommodate other views. I would like others to come to a consensus before proceeding with technical changes to slp1_deva.xml or elsewhere.

Andhrabharati commented 2 years ago

Note there are no special symbols for, e.g., Vedic udAtta, or Böhtlingk superscript 'u' udAtta, etc. These and the many other accent variations may or may not have representation in SLP1 (or in other transliteration schemes) as described in Peter Scharf's LIES text or in this document on accents.

Whatever the transliteration (which is just a convenient internal representation) used in typing the text, I would say one should see to it that the final output should tally with the print version, and follow the "standard" conventions in vogue.

See what LIES mentions in the analysis pages (p.46) about this, though ultimately it was abandoned in SLP1 notation.

Indeed, Böhtlingk and Roth, in their Sanskrit-Wörterbuch, and Whitney, in his Sanskrit Grammar, abandon the system and instead adapt to Devanāgarī the system used to mark accent in Roman script. They indicate only what they consider to be the “really accented syllables”: high pitch by means of an o above and an independent circumflex by a vertical line above (Whitney, 1889, 31).

When there is no SLP1 representation for the BR's Vedic 'u', I wonder why it was typed with a wrong character leading to the mess as seen here.

They could have used a spl. symbol to indicate such ones. There is one such in SKD text to denote a Vedic symbol which is there in very few fonts (Siddhanta has it in pvt. area), the same symbol which is also used to denote the small cap words in the original 'raw' file of PWG. This is still remaining in the SKD text, unconverted to unicode.

Anyway, I just expressed my view, as the result in this PWG is looking VERY ODD for my eyes.

You have your own rules and I should not go any further in dragging the issue.

Andhrabharati commented 2 years ago

One final comment and I will leave the topic.

As you had added some new characters into the Cologne "assumed" IAST set when I pointed out about some characters in MW text, you may add some new characters into the Cologne SLP1 set also, for cases like this.

(Any standard, rule or law can be updated/amended when really needed; they need not be rock-solid, fixed for ever.)

Thus it would have the 'invertibility' for you and also satisfy 'picky' guys like me who strive for perfection.

Andhrabharati commented 2 years ago

They could have used a spl. symbol to indicate such ones. There is one such in SKD text to denote a Vedic symbol which is there in very few fonts (Siddhanta has it in pvt. area), the same symbol which is also used to denote the small cap words in the original 'raw' file of PWG. This is still remaining in the SKD text, unconverted to unicode.

This is about the different characters encountered upon, while typing the texts. One can adopt/use some ad-hoc new symbol, hitherto unused, to key-in those characters. And this should be appropriately treated/handled in the actual usage of the text.

The character I am talking about is the ¤.

Andhrabharati commented 2 years ago

I would like others to come to a consensus before proceeding with technical changes to slp1_deva.xml or elsewhere.

From the earlier posts in here-

[@Andhrabharati]

He is the proposer for the change, and is FOR it.

[@gasyoun]

The answer might be in order of elements. Siddhanta font should not be the reason. I do not know how to fix it.

He seems to be FOR it, but apparently has no solution in mind to fix the matter.

[@funderburkjim]

But I realize others may not view accents this way, and am willing to make changes to the display details to accommodate other views.

He seems to have nothing AGAINST the proposal (except for losing the 'invertibility', for which also I had given a suggestion), and is willing to change.

[@drdhaval2785]

Leave udAtta and svarita as of now. They have been confusing for ages. But putting an accent after H is severely odd.

His earlier opinion is to leave the issue (as there is some confusion in the usage), but he also feels the "severely odd" cases are to be resolved.

Let's wait for his current opinion and conclude.

Andhrabharati commented 2 years ago

Accents with visarga and anusvara

Such instances occur in current digitization of PWG. Should this be coded (in slp1) as \H, \M or H\, M\.

AFAIK, the 3 general accents (udAtta, anudAtta and svarita) are to be marked after the vowels only. The dual letters (अनुस्वार & विसर्ग) accents are to be marked differently. The consonants do not have accents on them.

As such (\H, \M) is the proper way of marking, not (H\, M\).

The same has been mentioned earlier in this thread itself.

Plausibly, a correction should be entered for this (to put accents after '1' rather than 'H')

So why the doubt again?

funderburkjim commented 2 years ago

Revise Devanagari Accent display for PWG

It is possible to mimic the display of accents in PWG, without an 'invertibility' problem. The new slp1_deva1.xml transcoding file (instead of prior slp1_deva.xml) for PWG with Devanagari accents, now has

<!-- anudAtta accent: 0952 -->
<e n='122'> <s>INIT,SKT</s> <in>\</in> <out>\u0952</out>  <next>SKT</next></e>
<!-- udAtta accent : ua8eb -->
<e n='123'> <s>INIT,SKT</s> <in>/</in> <out>\ua8eb</out>  <next>SKT</next></e>
<!-- svarita accent: u0951 = devanagari stress sign udatta (used here for Vedic svarita) -->
<e n='124'> <s>INIT,SKT</s> <in>^</in> <out>\u0951</out> <next>SKT</next> </e>

Here is display of aMhati that results:

The comparison to scan is now quite close.

limitations

- This change is only implemented in the display shown above (e.g. it is not yet implemented in Basic Display or Advanced Search display).
- The change only affects PWG. Devanagari accent display is handled as before for other dictionaries.
- There are still issues in the few cases involving M and H. For example:
  - headword KAdoarRas

Request feedback on this revision before porting the change to Basic Display.

In the meantime, I'll try to get a handle on the M-H problems.
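
As a quick sanity check on the 'no invertibility problem' claim, the following Python sketch round-trips the three accent symbols through the PWG-specific mapping quoted above. The names `PWG_ACCENTS`, `to_deva_accents` and `to_slp1_accents` are hypothetical, and only the accent marks are converted; letters are left in SLP1.

```python
# Accent outputs as given in the slp1_deva1.xml excerpt above.
PWG_ACCENTS = {"\\": "\u0952", "/": "\ua8eb", "^": "\u0951"}
# Inverse table; it is well defined only because all three outputs differ.
INVERSE = {deva: slp1 for slp1, deva in PWG_ACCENTS.items()}


def to_deva_accents(text: str) -> str:
    """Convert SLP1 accent symbols to the PWG display code points."""
    return text.translate(str.maketrans(PWG_ACCENTS))


def to_slp1_accents(text: str) -> str:
    """Convert the PWG display code points back to SLP1 accent symbols."""
    return text.translate(str.maketrans(INVERSE))


sample = "dU\\Qya1^\\H"
assert len(INVERSE) == len(PWG_ACCENTS)              # no two symbols collide
assert to_slp1_accents(to_deva_accents(sample)) == sample
```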

Andhrabharati commented 2 years ago

@funderburkjim,

I can't believe my eyes! Guess BR (if they could see this) will also be happy now.

Which font is being used, still Siddhanta?

The same display correction is to be applied to PWK as well; PWG is not alone having such a marking.

Andhrabharati commented 2 years ago

BTW, though this is not visible to the outside world, pl. change  as 

Andhrabharati commented 2 years ago

If you use the Adishila font, no need to do anything extra for the M and H cases. They all are just fine in that font.

BTW, why is the accent before H in case of "1. aMhati" (after दूढ्य) and "KAdoarRas" (after नद्य) different? Is this what you mentioned/observed?

funderburkjim commented 2 years ago

Agree that spellings should not have accents (SLP1 / \ ^ ) after visarga, anusvara (H,M). Have changed the 17 such cases accordingly (such as dU\Qya1^H\ -> dU\Qya1^\H under headword aMhati). See above commit.

The changes are shown here
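
A correction of this kind can be sketched with a single regular expression. This is not the script used for the commit, just an illustration under the assumption that every H or M immediately followed by accent symbols should have those symbols moved in front of it; the function name is hypothetical.

```python
import re

# H or M followed by one or more SLP1 accent symbols (/ \ ^).
ACCENT_AFTER_HM = re.compile(r"([HM])([/\\^]+)")


def move_accents_before_hm(text: str) -> str:
    """Rewrite e.g. 'H\\' as '\\H' so accents follow the vowel, not H/M."""
    return ACCENT_AFTER_HM.sub(r"\2\1", text)


print(move_accents_before_hm("dU\\Qya1^H\\"))
```

In practice one would review each match by hand, since a pattern like this cannot tell Sanskrit text from other uses of the same characters.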

Using the same transcoding rules as yesterday, we now have an error in display of those 17 cases; for instance, dU\Qya1^\H displays improperly

I believe there are two ways to solve this Devanagari display problem:

- revise the transcoding rules (slp1_deva1.xml), keep using siddhanta font
- simplify the transcoding rules, and use adhishila font.

Next comments will discuss.

Andhrabharati commented 2 years ago

In this context, I would like to bring back to your notice-- the proposal (elsewhere) about using the Adishila font, which has no issue with the -ya conjuncts. [I do not wish to list out many other glyph issues that I had noticed later in the Siddhanta font, as this is not a forum to talk about that.]

You seemed to have agreed there, to try it out and asked about the font version to take.

funderburkjim commented 2 years ago

font example1 experiment

https://sanskrit-lexicon.github.io/PWG/misc/accentdisplay/example1.html

This web page shows

- a small number of spelling variations involving accents and visarga/anusvara.
- font choices of siddhanta, adhishila, and the default font
  - In Windows 10 desktop, the default font used for Devanagari is the Nirmala font. It would be a different font when viewed in a browser on other operating systems.
- For emphasis, each example includes the specific sequence of Unicode code points that is being rendered in the three fonts
  - The Unicode code point sequence is generated programmatically by transcoding rules applied to the slp1 spelling shown. The transcoding rules used are in slp1_deva2.xml, which uses no re-ordering tricks. Thus the ordering of Unicode code points is the same as the ordering of slp1 characters.
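
Anyone wishing to inspect such a code point sequence locally can do so with Python's standard `unicodedata` module. The helper name `describe` is hypothetical, and the sample string is just an illustration (अ + U+0951 + visarga), not taken from the example page.

```python
import unicodedata


def describe(text):
    """List each code point of text as 'U+XXXX NAME'."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"
        for ch in text
    ]


# Accent placed before the visarga, i.e. directly after the vowel.
for line in describe("\u0905\u0951\u0903"):
    print(line)
```

Comparing the lists for two spellings makes it obvious whether they differ only in the order of the accent and the visarga/anusvara.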

inference

From this small number of examples, one may conjecture:

- The display with siddhanta or default font is sensitive to whether accents appear before visarga/anusvara.
  - The display is correct when accents come after visarga/anusvara
    - in particular, the spelling (according to slp1 and unicode code point sequence) has to be wrong for the display to be right in siddhanta/default fonts.
- adhishila font displays any of the spellings (right or wrong) in a reasonable way. The right spelling is preferable; compare:
  - a/H (right spelling) and accent glyph placed over 'a' glyph
  - aH/ (wrong spelling) and accent glyph placed over visarga glyph.

From these few examples, the only quibble I have with adhishila font is that the A8EB COMBINING DEVANAGARI LETTER U glyph is too small.

Andhrabharati commented 2 years ago

It may be observed that even other characters are smaller, as compared with other fonts.

The reason being the Adishila font has quite many composite characters extending on either side vertically (top or bottom)-- not in many other fonts-- all to be fit into the same overall height, and thus the actual letter size looks a little smaller.

The simplest way out is to increase the font size property; as I saw a 4:3 ratio wrt other fonts looked well, say 24pt to 18pt. This much difference will let one guess how much "material" is "present" (packed) additionally at the top side and bottom side of the glyphs in this font.

Quite a big effort was put in creating this font, to make all characters flawlessly rendered, and with aesthetics!!

Andhrabharati commented 2 years ago

It is the famous Nirnayasagara's font (a foundry font over a century and a half old) that was the inspiring basis for this Adishila font. It (the NS font) WAS (and IS) a world class font by all standards.

funderburkjim commented 2 years ago

It might be desirable to replace all uses of siddhanta font with adhishila at sanskrit-lexicon web site.

But, I would like to do a more systematic comparison first.

One aspect of siddhanta font favored by @drdhaval2785 and @gasyoun was in the rendering of conjunct consonants. So I would like to do a side-by-side comparison of some large collection of conjunct consonants in each font.

Do you know of a good list of conjunct consonants to use for such a comparison?

drdhaval2785 commented 2 years ago

http://rb.vertimus.co.uk/sanskrit/conjuncts/index seems to have collected conjuncts from various sources.

Andhrabharati commented 2 years ago

Please go for the pure Adishila San, not the Adishila having the IAST and other Roman diacritics.

For this non-devanagari "material", you anyway are using another font.

Andhrabharati commented 2 years ago

I understand that a text file is available with @gasyoun, for testing the devanagari font glyphs.

But I guess, @funderburkjim can programmatically make a more exhaustive one, as I did myself. [I also play around with fonts and studying their features; and did make some fonts myself for our internal use.]

Andhrabharati commented 2 years ago

Just looked around and got this with the initial step of two consonants, while Sanskrit is known to have a max. of 5 consonants together in practice.

https://r12a.github.io/scripts/apps/conjunct_generator/?preset=deva

With this info, @funderburkjim should be able to generate a file for himself, to test the font rendering; probably he could make a web-page as well like this, for future use!!
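
A test file of this kind could be generated along the following lines. This is a rough sketch, not an existing tool: it simply joins every combination of the 33 basic Devanagari consonants with a virama (U+094D), so most of the strings it produces are not attested Sanskrit conjuncts. The names are hypothetical.

```python
from itertools import product

VIRAMA = "\u094d"
# The 33 basic consonants, ka through ha (an assumption about what to test).
CONSONANTS = list("कखगघङचछजझञटठडढणतथदधनपफबभमयरलवशषसह")


def conjuncts(n=2):
    """Yield every n-consonant cluster, joined by virama."""
    for combo in product(CONSONANTS, repeat=n):
        yield VIRAMA.join(combo)


two = list(conjuncts(2))
print(len(two), two[:5])
```

Raising `n` grows the list quickly (33 to the power n), so for three or more consonants it is probably better to filter against clusters that actually occur in the dictionary texts.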

funderburkjim commented 2 years ago

Common conjunct consonants

A list of about 150 common conjuncts from the first MW source referenced above by Dhaval: example3 display

- shows the conjuncts in siddhanta, adhishila, default fonts
- font-size adjusted upward 30% for adhishila
- the basic adhishila font set is being used

Based on this list, can you state a preference?

- siddhanta preferred for all dictionaries (what specific conjuncts do you prefer in siddhanta?)
- adhishila preferred for all dictionaries (what specific conjuncts do you prefer in adhishila?)
- no preference
- Not enough conjuncts are shown to decide
  - please provide guidance as to what additional conjuncts are needed

Feedback requested.

drdhaval2785 commented 2 years ago

I think the Siddhanta font is not being displayed. I see Adhishila and default, but Siddhanta is being rendered the same as default font on my mobile, on firefox and chrome both.

drdhaval2785 commented 2 years ago

[Screenshot from Chrome, 2021-08-12]

funderburkjim commented 2 years ago

Yes - I agree with you. Trying to find the error now.

funderburkjim commented 2 years ago

@drdhaval2785 -- try it now.
Problem was a misplaced semicolon in css.

drdhaval2785 commented 2 years ago

The following are the conjuncts which differ visually in both fonts. I have shown my preference against each conjunct with A for Adishila and S for Siddhanta. Adishila wins 13:5 against Siddhanta.

cCa:A
jja:S
Yca:A
YCa:A
Yja:A
qya:A
tna:S
bja:A
Sca:A
ktya:S
gBya:A
Nkya:S
cCya:A
ttra:A
ddya:A
dDya:A
dBya:A
arvva:S

drdhaval2785 commented 2 years ago

A fuller list of conjuncts is placed at https://github.com/sanskrit-lexicon/COLOGNE/issues/354 in case we want to compare look and feel of any fonts.

Andhrabharati commented 2 years ago

In this context, I would like to bring back to your notice-- the proposal (elsewhere) about using the Adishila font, which has no issue with the -ya conjuncts.

As I've been looking into the Bengal based prints (SKD, VCP) in Devanagari now, I MUST say that the -ya conjuncts in Siddhanta font are NOT faulty. But definitely (slightly) mis-leading the eye, they are.

It is just that the font creator took Bengali style glyphs as the basis. There are two variants, Bengal type and Bombay type, prevailing in Devanagari fonts.

Andhrabharati commented 2 years ago

In the list of conjuncts that @drdhaval2785 posted, I feel only the pna is slightly bad in Adishila, the "post na-form" should've been more protruding out downwards as in bhna and mna.

And I agree with @drdhaval2785 that tna should also have the similar form, that's the only odd-man-out.

funderburkjim commented 2 years ago

1700 Vacaspatyam conjuncts

This provides font comparison similar to example3 above.

example3a display

Does @gasyoun have an opinion ?

Andhrabharati commented 2 years ago

@funderburkjim,

Looking closely at the examples you gave, it appears what you take as the windows default font is actually Google's Noto Serif Devanagari.

The glyphs are not matching with Nirmala or any other windows fonts!!

funderburkjim commented 2 years ago

default font

On my computer os (Windows 10 Pro) , when viewing, say, example3a.html, the default font is being rendered with Nirmala UI. This is known by using the browser 'inspect' option.
(Right click on the default rendering of 'na', choose inspect. Then click 'computed' and scroll down. My system shows the following. Image from Chrome browser. Edge browser similar)

Andhrabharati commented 2 years ago

And here are the files I prepared this afternoon, considering all practical occurrences.

Sanskrit Conjuncts comparison.pdf

Surprisingly a new candidate looks to have won the competition, the Skt font, in the conjuncts. It is the Microsoft's specially made font for Sanskrit, coming with Devanagari optional Font package.

PWG accents comparison.pdf

And the Adishila is unmatched as far as the accents are concerned, beating even Microsoft and Google. What a special attention the creators might've taken, in making it!!

Pl. look at the pdf files at 100% or more size, not at a lesser size.

Andhrabharati commented 2 years ago

Here is the Sanskrit Text font by Microsoft.

sanskr.zip

Andhrabharati commented 2 years ago

default font

On my computer os (Windows 10 Pro) , when viewing, say, example3a.html, the default font is being rendered with Nirmala UI. This is known by using the browser 'inspect' option. (Right click on the default rendering of 'na', choose inspect. Then click 'computed' and scroll down. My system shows the following. Image from Chrome browser. Edge browser similar)

I just would like you to see what the Nirmala & Noto fonts actually look like, as I gave all of them in my file.

Andhrabharati commented 2 years ago

Looking at the debug (inspect) screenshot you gave, checked my system; it is my browser setting (set long back) having the Noto font as default that has deceived me.

Now selected the webpage to have its own font, and it changed the appearance as it is meant to be..

Sorry for my hasty comments.

Andhrabharati commented 2 years ago

In this context, I would like to bring back to your notice-- the proposal (elsewhere) about using the Adishila font, which has no issue with the -ya conjuncts.

As I've been looking into the Bengal based prints (SKD, VCP) in Devanagari now, I MUST say that the -ya conjuncts in Siddhanta font are NOT faulty. But definitely (slightly) mis-leading the eye, they are.

It is just that the font creator took Bengali style glyphs as the basis. There are two variants, Bengal type and Bombay type, prevailing in Devanagari fonts.

Looked inside the Siddhanta font.

It indeed has quite many glyphs and variants internally.

half -ya forms in Siddhanta

Now I know the issue! Instead of keeping the para-ya variants 3 & 4 [2nd line, elements 6 & 7] as defaults (which is the shape found in the printed books), the font creator somehow chose the variants 1 & 2 [2nd line, elements 4 & 5].

One can correct this by going inside the font to make it look as per the book, or even match the Bombay style, by choosing the primary para-ya [1st line, elements 8 & 9].

This makes it flawless.

@gasyoun,

Do you think the designer can be approached to change this, or can we do it for Cologne project ourselves?

Andhrabharati commented 2 years ago

Browsed through the list of @drdhaval2785, as rendered by @funderburkjim, and the summary is-

The following are with bad joiners (variants 3 & 4)- (zwya, rzwya); (qya, Rqya, qqya, rqya); (Qya, qQya, RQya); (Wya, RWya); (cCya, Cya, YCya, SCya); (Nya, lNya); (qdya); (qdvya); (qrya)
Rest are with the Bombay style joiners only, and esp. the following ones have the Bombay style variants 1 & 2 mentioned above- (wya, Rwya, wwya, rwya); (hRya); (ktya, rktya); (wdya); (hnya, dgnya); (zwrya, dDrya); (llya); (hvya, dDvya, ktvya, drvya)

gasyoun commented 2 years ago

Do you think the designer can be approached to change this, or can we do it for Cologne project ourselves?

Please write to him bayaryn@gmail.com. We can't fix it on our own. I remember him telling it was because of technical font limitations.

funderburkjim commented 2 years ago

Here are the change transactions for the above commit: changes_twoaccent.txt

Mostly transforming '\^' to '^\', and correcting a couple of '/\'. Now the only 2-accent spellings are '^\'; and there are 601 of them.

Here is a frequency tabulation of preceding character to ^\:

1^\ 356
3^\ 240
A^\ 1  [viSvasyA^\rTina\H in anapacyuta]
e^\ 1  [manu^de^\va\yurya\jYakA^maH# in devayu]
a^\ 2  [nApara^\M in praTama] [martya^\M in bAhutA]
u^\ 1  [yu\vAku\ SacI^nAM yu\vAku^\  in yuvAku]

When further looking at the following character (when preceding character is 1 or 3)

356 matches in 355 lines for "1\^\\[HMBDGNSYcdghjkmnrstvyz #:]" in buffer: pwg_1.txt
    21 of these  have following H;  4 have following M; 96 are followed by space or # (i.e. end of word)
240 matches for "3\^\\[HBGScdghjmprstvyz #]" in buffer: pwg_1.txt
   3 followed by H, 180 followed by space or # (i.e. end of word)
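
A tabulation like the one above can be reproduced with a few lines of Python. This is an independent sketch, not the editor-buffer search actually used; the function name is hypothetical and the sample lines are spellings quoted in this thread rather than the full pwg.txt.

```python
import re
from collections import Counter

# Any single character immediately followed by the two-accent sequence ^\
TWO_ACCENT = re.compile(r"(.)\^\\")


def preceding_char_counts(lines):
    """Count which character precedes each occurrence of ^\\ in the lines."""
    counts = Counter()
    for line in lines:
        counts.update(m.group(1) for m in TWO_ACCENT.finditer(line))
    return counts


sample = [
    "dU\\Qya1^\\H",
    "nApara^\\M",
    "yu\\vAku\\ SacI^nAM yu\\vAku^\\",
]
for char, n in preceding_char_counts(sample).most_common():
    print(f"{char}^\\ {n}")
```

Run over the whole dictionary file (read line by line), the per-character totals should add up to the overall count of two-accent spellings.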
