n8willis / opentype-shaping-documents

Documentation of OpenType shaping behavior

Malayalam: Update script specific details #99

Closed rajeeshknambiar closed 3 years ago

rajeeshknambiar commented 4 years ago

The Halant/Virama character in Malayalam (് / chandrakkala) is actually above-base, unlike in Devanagari etc. Remove the reference to the below-base "vowel killer" sign.

n8willis commented 4 years ago

Gah; thought I had caught all of those! Thanks for raising this.

Do you feel positive that this is the only spot in the Malayalam document needing such a change? We propagated an initial seed text (for consistency) in the Indic2 langs. If anything else slipped through, it might be good to roll it into the same PR. I know it can be kind of numbing to read the whole thing; I just want to double check....

rajeeshknambiar commented 4 years ago

Give me a couple more days to go through the entire document once again.

rajeeshknambiar commented 4 years ago

@n8willis So, I have added a number of fixes (mostly removal of non-applicable text) till section 3. Please take a look and let me know what you think.

I'll try to finish the rest some-time-soon™.

rajeeshknambiar commented 4 years ago

Ah, I thought I had indicated that I proofread that part completely and that these changes are finished, but I never actually said so. @n8willis Can you take a look and merge if everything looks correct?

n8willis commented 4 years ago

No problem at all; will have a look now.™

rajeeshknambiar commented 4 years ago

> Image looks good. Did you by any chance save the commands / process used to generate it? We are trying to collect those in https://github.com/n8willis/opentype-shaping-documents/blob/master/images/malayalam/malayalam-image-generation-log.md (to make the images reproducible)

I can add the details. But as rightly noted there, NotoSerif doesn't have blwf lookup assigned to the below base la glyph (though it is present in conjuncts). blwf form is available in SMC fonts, which I used to generate the image. I guess using a different font in that document is all right?

n8willis commented 4 years ago

> I can add the details. But as rightly noted there, NotoSerif doesn't have blwf lookup assigned to the below base la glyph (though it is present in conjuncts). blwf form is available in SMC fonts, which I used to generate the image. I guess using a different font in that document is all right?

Yeah. It's hypothetically nicer to pull all the images in a page from one font for a consistent look, but when that's not possible, it's not possible. Only requirement is that we use a font that's open source / libre / etc so that there are no usage restrictions. [edit: no restrictions for people rebuilding the docs, I mean]

I am interested in your take on whether the no-lookup below-base La ought to be considered a Noto bug. There are a couple of spots where the feature being "demonstrated" is actually implemented in a different -- perhaps even confusingly different -- feature lookup ...because it happened that way in the font. I think those are all noted in the image-generation logs (they're supposed to be; I tried). In a lot of those cases it's probably impossible to know if it was done that way as a totally-understandable workaround, or it was a strange default in the font editor, or complete accident.

The images are in there because I'm of the opinion that the illustration ought to show the desired feature, for educational value, even if it's implemented in a strange or hacky workaround inside of the actual font file. We can always find better fonts and replace the images in the future -- but if this is a bug we could report against Noto and maybe get a fix for, that's even better.

rajeeshknambiar commented 4 years ago

> Yeah. It's hypothetically nicer to pull all the images in a page from one font for a consistent look, but when that's not possible, it's not possible. Only requirement is that we use a font that's open source / libre / etc so that there are no usage restrictions. [edit: no restrictions for people rebuilding the docs, I mean]

SMC fonts are under OFL, so they fit the bill well.

> I am interested in your take on whether the no-lookup below-base La ought to be considered a Noto bug. There are a couple of spots where the feature being "demonstrated" is actually implemented in a different -- perhaps even confusingly different -- feature lookup ...because it happened that way in the font. I think those are all noted in the image-generation logs (they're supposed to be; I tried). In a lot of those cases it's probably impossible to know if it was done that way as a totally-understandable workaround, or it was a strange default in the font editor, or complete accident.

I can’t speak for Noto, but it would be hard to consider not implementing a standalone blwf form of "La" a bug, especially when it forms the right shape when combined with consonants. The reason I say so is that we have had a number of bugs (and still have a few) caused by the blwf form of "La" being formed unconditionally, which is worked around by complex contextual substitution rules in the font. And in the new font shaping rules I have been working on for the past 8 months, I have decided not to implement blwf "La" either (though I noticed Noto's implementation only after you mentioned it).

That said, it could be treated as a bug if the combination "NBSP, ZWJ, Virama, La" doesn't form the blwf form of La, as that is explicitly mentioned in the OpenType specification (see section "Effect of ZWJ, ZWNJ and NBSP on Consonant Shaping" in https://docs.microsoft.com/en-us/typography/script-development/malayalam). Otherwise, for education and various other purposes, it becomes impossible to generate that glyph from a particular font.
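For anyone wanting to try that sequence against a font, here is a minimal Python sketch that only assembles the four code points named above. The function name is made up for illustration, and whether a given font and shaper actually produce the below-base form from this string is exactly the open question, so nothing here asserts a rendering result.

```python
import unicodedata

# Code points for the sequence named in the comment above.
NBSP = "\u00A0"    # NO-BREAK SPACE
ZWJ = "\u200D"     # ZERO WIDTH JOINER
VIRAMA = "\u0D4D"  # MALAYALAM SIGN VIRAMA (chandrakkala)
LA = "\u0D32"      # MALAYALAM LETTER LA


def standalone_blwf_la() -> str:
    """Return the test string NBSP, ZWJ, Virama, La (hypothetical helper)."""
    return NBSP + ZWJ + VIRAMA + LA


if __name__ == "__main__":
    # List each code point with its Unicode name, for pasting into a bug report.
    for ch in standalone_blwf_la():
        print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

The resulting string can be passed as the text argument to whatever shaping tool is being used to check the font.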

image

WDYT?

rajeeshknambiar commented 4 years ago

@n8willis The last couple of commits hopefully address your feedback. Please check (particularly, that I didn’t miss/remove any content) and let me know.

n8willis commented 4 years ago

> WDYT?

Well, I am not qualified to advise on whether or not you should open an issue on Noto — your time is yours of course, and I don't have the script knowledge. Not being able to generate that example feels odd, but whether or not it really matters is a judgement call.

From the perspective of this repo, it might be educational to read what sort of discussion resulted from such a bug report, considering that Noto often occupies a "go-to example" position due to its perception as being up-to-date technically and usually well-designed among FOSS fonts.

I suppose I would say that if you decide posing the issue to the Noto team just to see what happens sounds worth it, then either tag this PR in the issue, or vice-versa, so it would be possible to track the connection. But don't feel any pressure to spend more time on it than that.

rajeeshknambiar commented 4 years ago

> From the perspective of this repo, it might be educational to read what sort of discussion resulted from such a bug report, considering that Noto often occupies a "go-to example" position due to its perception as being up-to-date technically and usually well-designed among FOSS fonts.

Noto initially had blwf La implemented for both script tags, mlym and mlm2, and still has the substitution for mlym. Surprisingly, the hb-view option --script takes only an ISO 15924 script tag, not OpenType script tags, so hb-view only applies the mlm2 substitutions (if a font contains both script tags, that is).
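The effect described here can be modelled roughly as follows. This is a simplified sketch, not HarfBuzz's actual tag-selection code: the table and function names are invented for illustration, and the only behavior it encodes is the one observed above, namely that the newer tag wins when a font carries both.

```python
# Candidate OpenType script tags for an ISO 15924 script, newest shaping
# model first. Only the Malayalam entry discussed in this thread is listed.
OT_TAGS_FOR_ISO = {"Mlym": ["mlm2", "mlym"]}


def choose_script_tag(iso_script, font_script_tags):
    """Pick the first candidate OpenType tag that the font actually contains.

    Returns None when the font has none of the candidates.
    """
    for tag in OT_TAGS_FOR_ISO.get(iso_script, []):
        if tag in font_script_tags:
            return tag
    return None


if __name__ == "__main__":
    # A font with both tags: the mlym lookups are never reached.
    print(choose_script_tag("Mlym", {"mlym", "mlm2"}))
    # A font with only the old tag falls back to it.
    print(choose_script_tag("Mlym", {"mlym"}))
```

Under this model, a substitution that exists only under mlym is unreachable from an ISO 15924 tag whenever the font also declares mlm2.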

For the time being, I’ll add the image generation commands using SMC Rachana.

n8willis commented 4 years ago

> Surprisingly, hb-view option --script takes only ISO-15924 script tag, not OpenType script tags; so hb-view only applies mlm2 substitutions (if a font contains both script tags, that is).

Hmm. I think I may open an issue on HB about that.

n8willis commented 4 years ago

> > Surprisingly, hb-view option --script takes only ISO-15924 script tag, not OpenType script tags; so hb-view only applies mlm2 substitutions (if a font contains both script tags, that is).

> Hmm. I think I may open an issue on HB about that.

Okay; actually it turns out that there is a workaround (see here for the discussion). Evidently you can pass an override prefix that bypasses the ISO stuff and goes straight to HB internals. The syntax would be --script=x-hbscmlym.

But I haven't tested it yet; need to update this machine to the latest release anyway. And I should probably add that override info to the HB docs.
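As a note for reproducing this, the override string described above can be assembled like so. This is an untested sketch: the helper names are hypothetical, the `x-hbsc` prefix and the `--script=` placement are taken from this comment as written, and the code only builds an argument list without running hb-view or confirming that it accepts the value.

```python
def script_override(ot_tag):
    """Build the override value from a four-character OpenType script tag."""
    if len(ot_tag) != 4:
        raise ValueError("OpenType script tags are four characters long")
    return "x-hbsc" + ot_tag


def hb_view_argv(font_path, text, ot_tag):
    """Assemble (but do not execute) an hb-view command line."""
    return ["hb-view", f"--script={script_override(ot_tag)}", font_path, text]


if __name__ == "__main__":
    # Font file name is a placeholder; substitute a real path when testing.
    print(hb_view_argv("Font.ttf", "\u0D32", "mlym"))
```

Keeping the argument list as data makes it easy to paste the exact command into an image-generation log once a working form is found.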

rajeeshknambiar commented 4 years ago

> Okay; actually it turns out that there is a workaround (see here for the discussion). Evidently you can pass an override prefix that bypasses the ISO stuff and goes straight to HB internals. The syntax would be --script=x-hbscmlym.

I have harfbuzz-2.6.4, which doesn't seem to support either --script=x-hbsc... or --language=x-hbsc.... I'll try to update to 2.7.0 and check.

How about the remaining changes? Let me know once you review.

n8willis commented 3 years ago

> > Okay; actually it turns out that there is a workaround (see here for the discussion). Evidently you can pass an override prefix that bypasses the ISO stuff and goes straight to HB internals. The syntax would be --script=x-hbscmlym.

> I have harfbuzz-2.6.4, which doesn't seem to support either --script=x-hbsc... or --language=x-hbsc.... I'll try to update to 2.7.0 and check.

Sorry for taking so long; studies are taking up lots of mental cycles. I did continue trying to sort out why the override stuff in hb-view does not work here and unfortunately I can't make sense of it. If you want to have one more go at that bit of it, it would certainly be useful to figure out how to make the overrides work. But I would also 100% understand if you don't feel like it.

The latest I can tell you is that there's some more detail in HarfBuzz issue 2680, and it might be something that could be explored with Simon Cozens' "crowbar" tool: https://github.com/simoncozens/crowbar / http://www.corvelsoftware.co.uk/crowbar/

But I could not get the Noto blwf lookup to work for mlym.... I suppose that could also be because of some odd interaction of the other GSUB lookups in the font ... or because of a limitation in crowbar ... or of a limitation in harfbuzzjs -- but here again "time spent" is going way up and "what we get out of it" is shrinking.

rajeeshknambiar commented 3 years ago

> But I could not get the Noto blwf lookup to work for mlym.... I suppose that could also be because of some odd interaction of the other GSUB lookups in the font ... or because of a limitation in crowbar ... or of a limitation in harfbuzzjs -- but here again "time spent" is going way up and "what we get out of it" is shrinking.

I did try to manually build hb 2.7.0 and upgrade then, but that broke LibreOffice and I had to revert. Now I have the 2.7.2 system package, but I got quite busy. Will try to get to this soon-ish.

rajeeshknambiar commented 3 years ago

> I did try to manually build hb 2.7.0 and upgrade then, but that broke LibreOffice and had to revert. Now I have 2.7.2 system package, but got quite busy. Will try to get to this soon-ish.

I still couldn’t get --script=x-hbscmlym to work; I have asked at https://github.com/harfbuzz/harfbuzz/issues/495

n8willis commented 3 years ago

Well ... I just (belatedly) decided to extract the lookups as .fea code, thinking something hidden in there could be traced manually, and it gives us this:

lookup blwfBelowBaseFormsinMalaylamlookup7 {
    lookupflag 0;
    sub \lamlym \viramamlym by \lasubscriptmlym;
} blwfBelowBaseFormsinMalaylamlookup7;

lookup blwfBelowBaseFormsinMalaylamlookup8 {
    lookupflag 0;
    sub \lasubscriptmlym by \viramamlym \lamlym;
} blwfBelowBaseFormsinMalaylamlookup8;

So lookup 7 activates the subscript, and lookup 8 reverses it. I cannot fathom why it would be done that way....
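To trace that by hand, here is a small Python simulation of the two rules. It is a sketch under a strong assumption, namely that both lookups are applied unconditionally and in order to the whole glyph run, which may not match how a shaper really schedules them. The glyph names come from the extracted .fea above (plus a made-up `kamlym` base for context); the function names are invented.

```python
def apply_substitution(glyphs, pattern, replacement):
    """Replace each non-overlapping match of pattern, scanning left to right."""
    out, i, n = [], 0, len(pattern)
    while i < len(glyphs):
        if glyphs[i:i + n] == pattern:
            out.extend(replacement)
            i += n
        else:
            out.append(glyphs[i])
            i += 1
    return out


# (pattern, replacement) pairs mirroring lookup 7 and lookup 8 above.
LOOKUP_7 = (["lamlym", "viramamlym"], ["lasubscriptmlym"])
LOOKUP_8 = (["lasubscriptmlym"], ["viramamlym", "lamlym"])


def apply_blwf(glyphs):
    """Run lookup 7, then lookup 8, over a list of glyph names."""
    for pattern, replacement in (LOOKUP_7, LOOKUP_8):
        glyphs = apply_substitution(glyphs, pattern, replacement)
    return glyphs


if __name__ == "__main__":
    print(apply_blwf(["kamlym", "lamlym", "viramamlym"]))
```

In this model the subscript glyph never survives to the output, and the La and virama come back out in the opposite order from the one lookup 7 matched.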

Let's drop Noto Malayalam for SMC.

rajeeshknambiar commented 3 years ago

@n8willis Long overdue, but the blwf La image generation is now added. I also merged the master branch to fix a conflict. Please review at your convenience.

n8willis commented 3 years ago

Cool; thanks for this. Will take a look shortly.

On Wed, Apr 14, 2021 at 7:42 AM Rajeesh K Nambiar @.***> wrote:

> @n8willis https://github.com/n8willis Long due, but the blwf La image generation is now added. Also merged the master branch to fix a conflict. Please review at your convenience.


rajeeshknambiar commented 3 years ago

A gentle reminder.

n8willis commented 3 years ago

Yes; taking a look again now. 😬 Thanks for your patience; I had some deep untangling of Indic2 issues that I wanted to get pinned down.