Closed GoogleCodeExporter closed 9 years ago
I just noticed that numbering the bugs in my report as Bug 1, Bug 2, etc. was
not a good idea, because those labels conflict with older bug reports. My Bug 1
and Bug 2 are shown in blue with a strike-through, which is perhaps used to
indicate solved issues. So we need to fix this. Can you suggest or implement a
way out?
K. Sethu
Original comment by skhome@gmail.com
on 23 Jan 2011 at 8:10
From here on I will refer to the bugs as 6th bug, 7th bug, etc. to avoid the
problem I mentioned in comment #2 above.
There are two screen-shot attachments to this message.
6th bug - See attached Screen shot file: bugs_6-7.png
--------------------------------------------------------------------------------
Column 19: the glyph for the consonant character ஶ needs to be added at U+0BB6.
7th bug - See attached Screen shot file: bugs_6-7.png
--------------------------------------------------------------------------------
Columns 24 and 25: க்ஷ (conjunct form) and க்‌ஷ (split form)
The code point sequence for the conjunct ligature of KSSA is {U+0B95 U+0BCD U+0BB7}
The code point sequence for the split form of KSSA is {U+0B95 U+0BCD U+200C U+0BB7}
In my tabulation, column 24 uses the conjunct form and column 25 uses the
split form, which has the ZWNJ (U+200C) between U+0BCD and U+0BB7.
As the screen shot shows, in SETT browser both forms appear split.
You need to add the ligature க்ஷ for the conjunct form in a vacant slot
and map the sequence U+0B95 U+0BCD U+0BB7 to it. After that change, it should
also be ensured that the sequence {U+0B95 U+0BCD U+200C U+0BB7} continues to be
mapped to the split form.
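The behaviour requested for the 7th bug can be sketched as below. This is only
an illustration, not the SETT browser's actual mapping code: the glyph name
"kssa_ligature" and the function to_glyphs are hypothetical, and other glyphs
are simply shown as their code points.

```python
ZWNJ = "\u200C"
KSSA_CONJUNCT = "\u0B95\u0BCD\u0BB7"           # KA + VIRAMA + SSA
KSSA_SPLIT = "\u0B95\u0BCD" + ZWNJ + "\u0BB7"  # same, with ZWNJ after the virama

# Hypothetical ligature table: sequence -> glyph name in the font.
LIGATURES = {KSSA_CONJUNCT: "kssa_ligature"}


def to_glyphs(text):
    """Replace known ligature sequences; emit other code points one by one."""
    glyphs = []
    i = 0
    while i < len(text):
        for seq, name in LIGATURES.items():
            if text.startswith(seq, i):
                glyphs.append(name)
                i += len(seq)
                break
        else:
            # No ligature starts here. ZWNJ has no visible glyph of its own;
            # its only effect is to prevent the ligature match above.
            if text[i] != ZWNJ:
                glyphs.append("U+%04X" % ord(text[i]))
            i += 1
    return glyphs


print(to_glyphs(KSSA_CONJUNCT))
print(to_glyphs(KSSA_SPLIT))
```

Because the ZWNJ sits between U+0BCD and U+0BB7, the split sequence never
matches the three-code-point ligature key and so stays in its split form.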
8th bug - See attached Screen shot file: bugs_8-10.png
--------------------------------------------------------------------------------
See under "Ligature for Srii/Shrii" (ஶ்ரீ / ஸ்ரீ )
The Grantha script equivalent of Sri is a ligature of a sequence of Unicode
code points. The sequence was changed in Unicode version 4.1 as follows:
before version 4.1 it was {U+0BB8 U+0BCD U+0BB0 U+0BC0}; from version 4.1
onwards, the current standard definition is {U+0BB6 U+0BCD U+0BB0 U+0BC0}.
The old ligature definition has not been deprecated in fonts. Most fonts still
have not included the current definition. Lohit Tamil, recent MS Latha, Arial
Unicode MS and a few others have included the current definition, but they
also retain the old one for backward compatibility.
Further, although Sri Lanka's SLS standard for Tamil and Tamil Nadu Govt's
Unicode implementation standard specify that the current standard be used in
key-maps, there are plenty of key-maps which have not made the switch.
So I recommend that, for the present, the older definition {U+0BB8 U+0BCD
U+0BB0 U+0BC0} continue to be mapped to the Sri ligature, and that the current
standard {U+0BB6 U+0BCD U+0BB0 U+0BC0} additionally be mapped to the same
ligature. (Note that this addition can be made even if the 6th bug mentioned
above is not rectified.)
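As a rough sketch of this recommendation, both sequences can simply be keys
pointing at one glyph. The glyph name "sri_ligature" and the function
lookup_ligature are hypothetical names used only for illustration.

```python
SRI_OLD = "\u0BB8\u0BCD\u0BB0\u0BC0"  # pre-4.1 sequence, starts with U+0BB8
SRI_NEW = "\u0BB6\u0BCD\u0BB0\u0BC0"  # current sequence, starts with U+0BB6

SRI_GLYPH = "sri_ligature"  # hypothetical glyph name in the font

# Both definitions map to the same ligature glyph.
LIGATURE_MAP = {
    SRI_OLD: SRI_GLYPH,
    SRI_NEW: SRI_GLYPH,
}


def lookup_ligature(sequence):
    """Return the ligature glyph for a sequence, or None if there is none."""
    return LIGATURE_MAP.get(sequence)


print(lookup_ligature(SRI_OLD) == lookup_ligature(SRI_NEW))
```

Since the mapping is keyed on the whole sequence, the new entry does not
depend on a standalone glyph existing for U+0BB6.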
9th bug - See attached Screen shot file: bugs_8-10.png
--------------------------------------------------------------------------------
There are 9 Tamil symbols to be added to DhanikaSETT.ttf in their respective
Unicode code point slots. They are:
ௐ (U+0BD0), ௳ (U+0BF3), ௴ (U+0BF4), ௵ (U+0BF5), ௶ (U+0BF6), ௷ (U+0BF7),
௸ (U+0BF8), ௹ (U+0BF9), ௺ (U+0BFA)
10th bug - See attached Screen shot file: bugs_8-10.png
--------------------------------------------------------------------------------
All the Tamil digits and numbers are covered by SETT browser except for the
Tamil digit zero at Unicode code point U+0BE6 and so it has to be added.
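The code points from the 9th and 10th bugs could be checked against a font's
coverage with something like the sketch below. The "supported" set here is
only an illustrative stand-in (Tamil digits one to nine plus the numbers ten,
hundred and thousand), not the real character map of DhanikaSETT.ttf, and the
function name missing_code_points is hypothetical.

```python
# Code points named in the 9th bug (nine symbols) and the 10th bug (digit zero).
REQUIRED = [0x0BD0] + list(range(0x0BF3, 0x0BFB)) + [0x0BE6]


def missing_code_points(supported, required=REQUIRED):
    """Return the required code points that are absent from `supported`."""
    return [cp for cp in required if cp not in supported]


# Illustrative coverage only: U+0BE7..U+0BF2.
supported = set(range(0x0BE7, 0x0BF3))

for cp in missing_code_points(supported):
    print("U+%04X" % cp)
```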
Hope my reports here are sufficient and clear for further actions.
K. Sethu
Original comment by skhome@gmail.com
on 25 Jan 2011 at 5:20
Attachments:
Thanks a lot to K. Sethu for reporting these Tamil rendering issues clearly in
detail.
First I have to mention that I have only a little knowledge of the Tamil
language, since I am Sinhalese. I can only read and write Tamil, so I needed
this browser to be tested by a Tamil language specialist from the beginning of
the project. It seems that you have studied my work well and have a good
understanding of what I've done. I thank you again for your contribution.
To implement Tamil rendering support in SETT Browser, I relied on an existing
mapping algorithm that maps characters between the 'Latha' Unicode font and
the 'Bamini' legacy font.
To see it, go to this URL and view the source of the page:
http://www.ucsc.cmb.ac.lk/ltrl/services/feconverter/?maps=t_u-b.xml
To create the DhanikaSETT.ttf custom font, I used the Tamil glyphs from the
'Bamini' Tamil font. Since I used those two existing resources, I assumed that
they covered all the ligatures in Tamil Unicode. From your report I now
understand that the 'Bamini' font does not have all the glyphs required to map
every ligature in Tamil Unicode, and that the existing mapping does not cover
all the Tamil ligatures either.
To fix all these rendering issues I need several things from a Tamil language
specialist. I hope you will help me with this.
1. A three-column table with the following columns:
* Column1: The Tamil ligatures missing in SETT Browser (no need to mention the Unicode value, just type the ligature)
* Column2: The sequence of symbols in a Tamil legacy font (e.g. Bamini) that maps to that particular ligature
* Column3: Whether all the symbols required for that ligature already exist in the DhanikaSETT.ttf font, or which ones have to be added
Note: A table like this will make fixing these bugs easier. You can prepare an
Excel table with these columns and attach it to this issue thread.
2. If you also agree that the 'Bamini' font doesn't have all the symbols
required to map all the Tamil ligatures, please suggest a better legacy font
(not a Unicode font) that contains all the required symbols.
Note: If you suggest an alternative font, please use that font for Column2 of
the table in No.1 above. Otherwise you can use the 'Bamini' font for that.
I will start working on this issue once I receive the missing-ligature table
from you. You can add further details to that table if you need to explain
something.
Looking forward to receiving the table from you.
Thanks again!
Dhanika Perera
Original comment by dhanikap...@gmail.com
on 25 Jan 2011 at 9:46
Dhanika >> //2. If you too accept that 'Bamini' font doesn't have all the
symbols required to map all the Tamil ligatures, please suggest me a better
legacy font (not a Unicode font) which consists of all the required symbols.//
Why not a Unicode font that is GPLed? You would only need to cull the missing
glyphs.
K. Sethu
Original comment by skhome@gmail.com
on 25 Jan 2011 at 11:19
A legacy font, rather than a Unicode font, was used to extract the glyphs for
the following reasons:
* When the symbols (parts of ligatures) from a legacy font are used, a limited number of font glyphs is enough to map all the ligatures.
* If the composite glyphs from a Unicode font were used instead, a large number of glyphs would have to be added to the DhanikaSETT.ttf font to represent every ligature uniquely. The font file would then grow, which would cause problems when the font is automatically downloaded to the browser.
* The free slots in several non-Tamil script ranges would also have to be used to hold all those glyphs, which would be a problem when SETT Browser extends its language support to other complex scripts in the future.
* It would also increase the number of lines of code in the mapping algorithm, which would slow down rendering.
Therefore I prefer a GPLed legacy font. I would be glad if you could suggest
such a font that has all the required glyphs. Thanks!
Original comment by dhanikap...@gmail.com
on 26 Jan 2011 at 4:16
After researching this issue in depth, it was decided to ignore it. The
reported unsupported characters are used only in advanced Tamil text, and
since this is a mobile web browser, it is assumed that it is not meant for
reading such text. All the Tamil characters used in normal Tamil text are
already available in this browser, so normal users will not be affected by
this issue.
Original comment by dhanikap...@gmail.com
on 21 Mar 2011 at 4:50
Original issue reported on code.google.com by
skhome@gmail.com
on 23 Jan 2011 at 8:02
Attachments: