Tarobish / Katibeh

Katibeh Arabic Font Project
http://tarobish.github.io/Katibeh/
SIL Open Font License 1.1

Broken Characters in Urdu & Uyghur #79

Closed Tarobish closed 8 years ago

Tarobish commented 8 years ago

Something is broken in the letters Dotless Yeh, Heh.fin and rnoon-ar.fina:

screen shot 2016-01-14 at 3 25 45 pm screen shot 2016-01-14 at 3 31 35 pm screen shot 2016-01-14 at 3 22 57 pm screen shot 2016-01-14 at 3 30 13 pm screen shot 2016-01-14 at 3 26 01 pm

graphicore commented 8 years ago

Where is this coming from? (A PDF loaded into InDesign is problematic; PDF is not made for this kind of thing.) Also, we don't do anything specific for Uyghur and just a tiny bit for Urdu.

Can you reproduce this in the live testing?

I will update the Generated documents in a minute.

Tarobish commented 8 years ago

Here is the sample from "persian-arabic2_Katibeh-Regular_18.pdf". I can also still see the extension parts here; I'm not able to type this :(

graphicore commented 8 years ago

copy and paste it from here: https://github.com/Tarobish/Katibeh/blob/master/Document-Sources/persian-arabic2.txt

graphicore commented 8 years ago

Uh, and the pic is missing. Which page is it on?

Tarobish commented 8 years ago

I think we have to leave this issue open; I can see the same problem from our source:

http://tarobish.github.io/Katibeh/html/live-testing.html#?eJyVUU1Lw0AQ/StLTgq2bEKsNWf/gZ6kl6JRCjVKCSiKIGmaj9aDIP0LRkpLiOm5v2Mme/OX+HZL8eLFwy4zs++9eTP7ZF3dBqHlWfeulNaBFfoP4clghMIoHCL3b04Hjz5Sx70LkQ/7wTUyRBfg+YbaC2ip0mbVlJwJWlDRrATVKuWYCo45p5pzQaWaqxlVnDeIBI/VnGMdcqKfxF7H7orvl3dxLI/2DZ2WVIBRCJ5SoYm04Jw3gj7VnKotFe0444gjwL8AVBlVOGvBE10BfgwBqL1RbeiTnZ0PaNb6oFAC03Glac9ToGBa2PIfepxoTWOMN9pypFWo5Lyty2YVVMGtStRsO7PWwjISzO4Ju603kjaJgf0227l9FQ4QEfTX5v4b0wvwL5f90PyXtDst6bSke+bYniM9+7DtyO659fwD+BwYJA==

Tarobish commented 8 years ago

It was the last page of "persian-arabic2_Katibeh-Regular_18.pdf".

Tarobish commented 8 years ago

screen shot 2016-02-04 at 1 18 50 pm screen shot 2016-02-04 at 1 16 07 pm

graphicore commented 8 years ago

Is this about right: http://tarobish.github.io/Jomhuria/#live?eyJ2YWx1ZSI6Itis24fardqv2Ygg2KrYp9qtINiz24fZhNin2YTZidiz2Ykg2K/blduL2LHZidqv25Ug2YPbldmE2q/bldmG2K/blSAoNjE4IOKAkyA5MDcpINiz24fYrNin24vYpyDZitin2LHZidiq2YnZviDYqNuV2LHar9uV2YYg2KrZiNmC2YLbh9iyINmK24jYsduI2LQg2YXbh9iy2YnZg9inINiz25DYs9iq2YnZhdmJ2LPZiSDYptin2LPYp9iz2YnYr9inIDY0MCDigJMg2YrZidmE2YkgMTAg2YrbiNix24jYtCDZhduH2LLZidmD2Kcg2LPbkNiz2KrZidmF2YnYs9mJ2YbZiSDYqNuV2LHZvtinINmC2YnZhNiv2YkuINio24fZhNin2LEg2KrbhtuL25XZhtiv2YnZg9mJ2obblSA6IDEuINmD24fahtin2LEg2YXbh9iy2YnZg9mJ2LPZiSDYjCAyLiDZgtuV2LTZgtuV2LEg2YXbh9iy2YnZg9mJ2LPZiSDYjCAiLCJiaWRpIjoicnRsIiwibGFuZyI6ImFyIn0=

selection_030

Tarobish commented 8 years ago

Yes! Exactly

graphicore commented 8 years ago

Can you please make me a live testing page with broken words? One line per word. It would really help me to investigate this further. Also, you say "Dotless Y" is broken, but I see several broken letters. I think I should know all broken letters eventually, to make sure I repair them all...

Tarobish commented 8 years ago

http://tarobish.github.io/Katibeh/html/live-testing.html#?eJwljj0OgkAQha9ithazbAgitTfQytgQRUOCaMgmGo0VIFltbDgDxsYgveeYcS/jLHbzvZ/MO7HVNpHMZ3uHc9ZnMjzIcZSSkMqYONxMomNIKJydJI6DZE1E14J6YVeFVpeYQ405KmhRzZMevHSlb9Cg+tJFAma6wtwAXqCT8Aq1CcATFX5M56EraP4RsgtdwpusDGpjtvoObZct/l9owjKQ3TRuuxYXFvem9sgXrm97g6EtZuz8A/Pxa3k=

There are three characters. One has to be "uniFBE8" and "uniFBE9" (alefMaksura-ar.medi and alefMaksura-ar.init); another one is "uniFEEA" or "uniFBA7" (heh-ar.fina or hehgoal-ar.fina).

screen shot 2016-02-08 at 11 29 56 am

سۇلالىسى دەۋرىگە كەلگەندە يارىتىپ بەرگەن مۇزىكا سېستىمىسى

graphicore commented 8 years ago

one it has to be "uniFBE8" and "uniFBE9" (alefMaksura-ar.medi and alefMaksura-ar.init)

These seem to be missing from the font. I can't find them.

another one its "uniFEEA" or "uniFBA7" (heh-ar.fina or hehgoal-ar.fina)

These exist in the font

graphicore commented 8 years ago

From your example, the letter marked in pink here:

selection_050

It is actually encoded as uni06D5 ARABIC LETTER AE (Uighur, Kazakh, Kirghiz)

Shall I replace it, just in fina, with heh-ar.fina (uniFEEA)? An article on Wikipedia about Kazakh gives some backing for that. I was wondering whether it also uses the initial and medial forms of Heh, but it seems it doesn't.
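For reference, the `uniXXXX` glyph names used in this thread can be mapped to their Unicode character names with Python's standard `unicodedata` module. This is only a sketch: `describe_glyph` is a hypothetical helper, not part of the Katibeh tooling, and it assumes names follow the "uni" plus four hex digits convention, optionally with a suffix such as `.fina`.

```python
import unicodedata

def describe_glyph(glyph_name):
    """Return the Unicode name for a glyph named like 'uni06D5' or 'uni0679.init'."""
    # Drop any suffix such as '.init', '.medi' or '.fina'.
    base = glyph_name.split(".", 1)[0]
    if not base.startswith("uni") or len(base) != 7:
        raise ValueError(f"not a uniXXXX glyph name: {glyph_name!r}")
    return unicodedata.name(chr(int(base[3:], 16)))

for name in ("uni06D5", "uniFEEA", "uniFBA7"):
    print(name, "->", describe_glyph(name))
```

The lookup only confirms which character a glyph name refers to; which glyph the font should show in final position is still a design decision made in the feature files.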

graphicore commented 8 years ago

I think that the hehgoal-ar.fina (uniFBA7) character that you mention is a red herring.

The isolated form of uniFBA7 is uni06C1 (06C1 ARABIC LETTER HEH GOAL). We have substitutions for it in init, medi and fina to uniFBA8, uniFBA9 and uniFBA7. So I think there's no indication that uni06C1 is broken.

Tarobish commented 8 years ago

Let's try it then :) But we do have them; here they are from the UFO:

screen shot 2016-02-08 at 7 33 27 pm

graphicore commented 8 years ago

Oh I see, I just pulled from GitHub and there they are.

graphicore commented 8 years ago

You added them today?

Tarobish commented 8 years ago

Great :) Let's finish it.

Tarobish commented 8 years ago

Yesterday :) I thought something might be wrong with the characters :) A bit of research and then fire :)

graphicore commented 8 years ago

Ah, so, unfortunately we can't use the features generated by Glyphs, so our feature files need to be hand-updated when new glyphs are added. That may also be why uniFEEA is not coming up.
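Because the feature files are maintained by hand, a small check can flag positional glyphs that no feature file mentions yet. The following is a minimal sketch, not the project's actual tooling: `unreferenced_positional_glyphs`, the suffix list and the sample feature text are all assumptions made for illustration.

```python
import re

POSITIONAL_SUFFIXES = (".init", ".medi", ".fina")  # assumed naming convention

def unreferenced_positional_glyphs(glyph_names, feature_text):
    """List positional glyphs that never appear as a token in the feature text."""
    # Treat runs of letters, digits, '.', '_' and '-' as candidate glyph names.
    tokens = set(re.findall(r"[A-Za-z0-9._-]+", feature_text))
    return sorted(
        name for name in glyph_names
        if name.endswith(POSITIONAL_SUFFIXES) and name not in tokens
    )

# Hypothetical data: a few glyph names and a tiny feature snippet.
glyphs = ["uni08A8", "uni08A8.fina", "uni08A8.init", "uni08A8.medi", "heh-ar.fina"]
features = "feature fina { sub heh-ar by heh-ar.fina; } fina;"
print(unreferenced_positional_glyphs(glyphs, features))
```

In practice the glyph list would come from the font source and the text from the real feature files; a plain token match like this can miss glyphs that are only referenced through a class name, so a hit is a prompt to look, not proof that something is missing.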

graphicore commented 8 years ago

Bad thing is when we miss making these updates.

Tarobish commented 8 years ago

Oh :( Should I do this, and how?

graphicore commented 8 years ago

What exactly? I am adding the features for this issue at this very moment. But if you add any new glyphs that need features, you should open a new issue to inform me.

If you added new glyphs in the past, it would be cool if you know which ones, so we can update our features.

The features for "uniFBE8" and "uniFBE9" are not created by Glyphs anyway, I just checked.

Tarobish commented 8 years ago

I mean adding the features, which you are already doing :) I didn't know that. Actually I do know which glyphs; please check these: uni08A8, uni08A8.fina, uni08A8.init, uni08A8.medi

graphicore commented 8 years ago

First things first:

selection_051

Tarobish commented 8 years ago

Awesome :) just perfect

graphicore commented 8 years ago

please check these uni08A8, uni08A8.fina, uni08A8.init, uni08A8.medi

They are also not exported by Glyphs. It also seems that we should decompose them, so that they will trigger some of our ligatures.

Tarobish commented 8 years ago

No, don't do that. Let them be the way they are.

graphicore commented 8 years ago

ok.

graphicore commented 8 years ago

What's up with this one from your initial description?

selection_053

Tarobish commented 8 years ago

Oh, I totally forgot that, let me check it.

Tarobish commented 8 years ago

Please just replace it with the initial form; we do have the characters.

graphicore commented 8 years ago

Which name/Unicode is the isolated form and which is the final form?

Tarobish commented 8 years ago

It has to be "uni0679.init" and "uni0679.medi" if I'm not mistaken (I'm not sure), and here are the isol and final: "uni06BB" and "uni06BB.fina". But it's risky; I couldn't even find the init and medi in "Noto".

graphicore commented 8 years ago

06BB ARABIC LETTER RNOON • Sindhi

There's also a letter 0679 ARABIC LETTER TTEH • Urdu

that looks alike: https://en.wikipedia.org/wiki/%E1%B9%ACe

Wikipedia says:

Some layout engines do not properly generate medial and final forms (which should look like ـٹـ and ﭨ) and will render the isolate form ٹ, without joining.

See the Wikipedia page for the example letters and some more info.

graphicore commented 8 years ago

If it's uni0679, it should be replaced by uni0679.medi, uni0679.fina and uni0679.init. But for the replacements I can also use whatever is the right glyph, like uni06BB.fina.

The same goes for uni06BB: I can replace it with what you suggest if you want me to.

graphicore commented 8 years ago

06BB has quite a good Wikipedia article: https://de.wikipedia.org/wiki/%E1%B9%86un

It says init should be FBA2, medi should be FBA3 and fina should be FBA1.

graphicore commented 8 years ago

Uh, that's the German Wikipedia :-D I didn't notice.

graphicore commented 8 years ago

Here is the stuff for 0679 TTEH:

FB67  ARABIC LETTER TTEH FINAL FORM ≈ 0679

FB68 ARABIC LETTER TTEH INITIAL FORM ≈ 0679

FB69 ARABIC LETTER TTEH MEDIAL FORM ≈ 0679
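The same mapping can be read out of the Unicode character database, which helps keep TTEH and RNOON apart. A minimal sketch using Python's standard `unicodedata` module follows; `presentation_forms` is a hypothetical helper written for this thread, and the result depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

def presentation_forms(base_cp):
    """Map form name -> presentation-form code point for one Arabic base letter."""
    forms = {}
    # Arabic Presentation Forms-A (U+FB50..U+FDFF) and -B (U+FE70..U+FEFF).
    for cp in list(range(0xFB50, 0xFE00)) + list(range(0xFE70, 0xFF00)):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep entries such as '<final> 0679': a form tag plus exactly one code point.
        if len(parts) == 2 and parts[0].startswith("<") and int(parts[1], 16) == base_cp:
            forms[parts[0].strip("<>")] = cp
    return forms

for base in (0x0679, 0x06BB):
    print(unicodedata.name(chr(base)))
    for form, cp in sorted(presentation_forms(base).items()):
        print(f"  {form:8} U+{cp:04X}")
```

These presentation-form code points only identify the shapes; the font still needs its own init, medi and fina substitutions from the base letters 0679 and 06BB.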

selection_054

Tarobish commented 8 years ago

Haha :))) Exactly, it was my mistake, because I found it next to ARABIC LETTER TTEH :) while I was looking for ARABIC LETTER RNOON :)

Tarobish commented 8 years ago

OK :) If you think it's right to replace them, please do.

graphicore commented 8 years ago

Here's RNOON from the Unicode PDF:

selection_056

graphicore commented 8 years ago

OK, can we wrap this up? What should be replaced by what?

Tarobish commented 8 years ago

In the picture we have the isol form; it has to be replaced by the init, so uni06BB is replaced by uniFB68. Later I'll check the PDF to see that everything is right.

graphicore commented 8 years ago

OK, so it's all about TTEH. That was not at all clear to me :-)

Tarobish commented 8 years ago

To be honest, me neither :) I don't get how two letters with the same shape can make different sounds :) It's like having the letters R and N with the same shape :)))))

graphicore commented 8 years ago

Yeah, it's kind of stupid. Maybe they don't even make a different sound, but that's not how Unicode works. I don't think there is really a system behind the Unicode encodings; they just try to do their best.

So we have "uni0679.init" and "uni0679.medi". Do you also have the fina form for it? Wikipedia says TTEH is derived from Ta (062A), so I think we could also make it by decomposition: uni066E.fina + uni0615

Tarobish commented 8 years ago

We already have it: "uni0679" and "uniFB67" for isol and final.

graphicore commented 8 years ago

I just found out. I'll use uniFB67, uniFB68 and uniFB69, and I guess you don't want to risk having ligatures of these?

Tarobish commented 8 years ago

Yes :) we don't want to risk it