w3c / iip

Documenting gaps and requirements for support of Indic languages on the Web and in eBooks.
https://w3c.github.io/iip/

deva-gap: Look at glyph repertoires for fonts section #13

Open r12a opened 6 years ago

r12a commented 6 years ago

I'm creating this issue based on comments raised by @vivekpani during our telecon.

vivek: There is no definition of appropriate glyph set.

muthu: The challenge for font developers is to decide what all half and full forms to add in fonts.

It sounds like we should allude to this as an issue for use of the Web in the gap-analysis doc(s), and produce some guidance as part of the requirements document(s).

tiroj commented 6 years ago

Guidance for font developers or Web developers?

The glyph set of a Devanagari font is determined by the style of the typeface, the supported languages, and the ingenuity of the type designer. Certain glyphs can be anticipated, but I'd be wary of attempting to define standard glyph set guidelines that would be appropriate in all cases.

vivekpani commented 6 years ago

Sorry for the silence on this. Here are some of the reasons I suggested this as a gap. It seems significant enough for all Indian languages, not only the Devanagari-based ones. In fact, encoding and fonts are the two major gaps that need urgent attention. I will highlight the encoding concerns separately. I am not an expert on fonts, though I have worked with font developers extensively.

While Devanagari conjuncts have mostly been linearised, the same is not true of all other Indian scripts. When I studied Hindi in school and was later exposed to early prints, there were quite a few characters I was unable to recognise. Those are all alternate forms (which exist in nearly every script) and are therefore font specific. This is definitely the prerogative of the designer. But for easier and wider readability, certain forms have become the norm. Digital printing has been adopted in India over the past two decades. Before that, mechanical type was in use, and it always had a finite and fixed set of glyphs and conjuncts. That was possible because Indian languages have only a finite set of conjuncts.

Several conjuncts used to represent foreign words have been invented by designers in the absence of any guidelines, and have therefore posed readability and display challenges. As a common example, क and ट do not join in any Indian language, and no native words need such a conjunct. But such conjuncts have been introduced to represent words like doctor and tractor, which are in very common use. Since Devanagari is linearised in display, a stacked conjunct need not be created, and readability is not affected much. In most other scripts, however, different designers either create the conjunct or skip it, and the creations are new to native eyes and often confusing or unreadable. Similarly, words like fox or box make क join स, whereas natively क joins only ष (among स, श and ष) and forms the ligature क्ष.

Unless there are guidelines (if not standards), this will be unstoppable and will remain a compatibility nightmare on the web that grows with time. (I have tried to give very popular examples here, but the cases are numerous.)

The OpenType format is complex and hard for a designer to understand. Though this has been argued against right from the start, and simplified formats have had no popular reach, time has shown that the font industry, and the designs and styles of complex scripts, have struggled, and that the publishing industry has refused to migrate from legacy formats, primarily because its signature styles are not available. OpenType fonts for Indian scripts are very few, whereas the font designers and companies designing fonts were many and growing in the late 1990s and early 2000s. The only OpenType fonts that exist today are perhaps the ones created by resource-rich efforts from Google, Microsoft or the like. Unless this is addressed, I am not sure this gap is easy to close. I have been connected with this industry for more than two decades and, in spite of the best help, there is much struggle.

If the majority of the industry still manages digital publishing with legacy software, glyph set variation need not be left open. Setting a guideline for OpenType may make things easier for designers and help adoption. There were earlier attempts to standardise glyph sets and their legal joining possibilities, such as CDAC's release of ISFOC (Indian Scripts Font Code) as a standard. These were successful, but as standards moved on, such efforts were abandoned. At least their success establishes that standardisation helped growth and adoption.

The following is an article on this gap. https://www.linkedin.com/pulse/decimation-our-literature-year-2000-vivekanand-pani/
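To make the conjunct examples above concrete for readers less familiar with the encoding side: in Unicode, each of these clusters is stored as consonant + virama + consonant, whatever the font does with it. The snippet below is only a minimal sketch using Python's standard `unicodedata` module; the helper names `conjunct` and `describe` are made up for illustration and are not part of any library. It shows that क्ट, क्स and क्ष all share the same three-code-point structure, so whether the result appears as a ligature, a half form, a stack, or a visible virama depends on the glyphs and rules the font provides.

```python
import unicodedata

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA (halant)


def conjunct(first: str, second: str) -> str:
    """Build the encoded sequence for a two-consonant cluster (illustrative helper)."""
    return first + VIRAMA + second


def describe(text: str) -> list[str]:
    """List each code point of the text with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# The three clusters discussed above: क with ट, स and ष.
for second in ("ट", "स", "ष"):
    seq = conjunct("क", second)
    print(seq)
    for entry in describe(seq):
        print("   ", entry)
```

The encoded text is identical across fonts; only the rendering differs, which is why the question of which conjunct forms a font should contain is a font-level one rather than an encoding-level one.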