Some character sequences get merged

adobe-fonts / source-code-pro

Monospaced font family for user interface and coding environments

https://adobe-fonts.github.io/source-code-pro/

SIL Open Font License 1.1


Some character sequences get merged #192

Open Pantalaim0n opened 6 years ago

Pantalaim0n commented 6 years ago

I've noticed some weird behaviour when typing a certain sequence of characters. They get squeezed to the width of one single character.

It happens when I type lowercase L, U+00B7 (middle dot), lowercase L. Using uppercase L for both letters also shows this issue.

I began to notice it when I enabled 'Show non-printable characters' in Netbeans, which uses U+00B7 to display the space character. (Screenshot: merged_netbeans)

I could also reproduce this issue in Notepad. (Screenshot: merged_notepad)

I haven't found any other sequence that behaves the same.

frankrolf commented 6 years ago

This is a character sequence common in Catalan (e.g. paral·lel), and the behavior is triggered here: https://github.com/adobe-fonts/source-code-pro/blob/b47764205aa316b2c172dc37921526dc45153e0a/Roman/familyGSUB.fea#L210-L211

It seems to be a bug in Netbeans: it merges the layer of printable characters with the layer of non-printable characters, so an OpenType feature gets applied across both.

moyogo commented 6 years ago

That `sub l periodcentered l by lcat;` rule and the one for `Lcat` are in lookup `GLYPH_COMPOSITION_LATIN_NONCONTEXTUAL`, which is in feature `ccmp` under `DFLT dflt`. The `ccmp` feature is on by default, so this substitution is active in all languages.

This seems like Netbeans is just applying what is defined in the font.
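To make the effect of that rule concrete, here is a rough Python model of a ligature substitution. It is only a sketch: the glyph names come from the comment above, while `LIGATURES` and `apply_ligatures` are hypothetical names, and a real shaping engine does far more than this. The point is that three glyphs collapse into one, and in a monospaced font that one glyph occupies a single cell.

```python
# Toy model of a ligature substitution such as `sub l periodcentered l by lcat;`.
# Glyph names follow the thread; the table and function names are hypothetical.
LIGATURES = {
    ("l", "periodcentered", "l"): "lcat",
    ("L", "periodcentered", "L"): "Lcat",
}


def apply_ligatures(glyphs):
    """Replace each matching glyph run with its single ligature glyph, left to right."""
    glyphs = list(glyphs)
    out = []
    i = 0
    while i < len(glyphs):
        for seq, lig in LIGATURES.items():
            if tuple(glyphs[i:i + len(seq)]) == seq:
                out.append(lig)
                i += len(seq)
                break
        else:
            # No rule matched at this position: keep the glyph unchanged.
            out.append(glyphs[i])
            i += 1
    return out


shaped = apply_ligatures(["l", "periodcentered", "l"])
print(shaped)       # ['lcat']
print(len(shaped))  # 1 glyph, so 1 cell in a monospaced font
```

Mixed case (`l`, `periodcentered`, `L`) is left alone in this model, since the comment above only mentions the two same-case rules.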

moyogo commented 6 years ago

While Netbeans could avoid the issue here by not using U+00B7 to display the space character, it’s a bit odd to expect that. In some cases one may put U+00B7 between two Ls and not expect a Catalan geminated L.

frankrolf commented 6 years ago

> This seems like Netbeans is just applying what is defined in the font.

That is correct, but by inserting U+00B7 into the character sequence, Netbeans is also effectively breaking any other sequence-based OT feature.
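A small Python sketch of that point, under the assumption that the editor shows whitespace by swapping spaces for U+00B7 in the text it hands to the shaper (`show_whitespace` and `triggers_lcat` are hypothetical helpers, not anything from Netbeans):

```python
MIDDLE_DOT = "\u00b7"


def show_whitespace(line):
    """Hypothetical model of an editor that replaces spaces with U+00B7 before shaping."""
    return line.replace(" ", MIDDLE_DOT)


def triggers_lcat(text):
    """True if the text contains a sequence the font's l/L + middle dot rule would match."""
    return ("l" + MIDDLE_DOT + "l") in text or ("L" + MIDDLE_DOT + "L") in text


source = "call log"
displayed = show_whitespace(source)

print(triggers_lcat(source))     # False
print(triggers_lcat(displayed))  # True
```

The source text never contained the Catalan sequence; it only appears once the whitespace marker is inserted between a word ending in `l` and a word starting with `l`.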

What would you suggest? I can see the pragmatic approach (ccmp) being more accessible to Catalan users, rather than hiding the feature behind language tagging, which we’ve done for other fonts.
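For comparison, the language-tagged alternative could be modelled roughly like this in Python. This is an illustration of the trade-off only, not how the font or any shaper implements it: `shape` is a hypothetical function, and the only real identifier is `CAT ` (with trailing space), the OpenType language system tag for Catalan.

```python
CATALAN = "CAT "  # OpenType language system tag for Catalan (note the trailing space)

RULES = {
    ("l", "periodcentered", "l"): "lcat",
    ("L", "periodcentered", "L"): "Lcat",
}


def shape(glyphs, language="dflt"):
    """Apply the geminated-L substitution only when the run is tagged as Catalan."""
    glyphs = list(glyphs)
    if language != CATALAN:
        # Untagged text (the common case in code editors) is left untouched.
        return glyphs
    out, i = [], 0
    while i < len(glyphs):
        replacement = RULES.get(tuple(glyphs[i:i + 3]))
        if replacement is not None:
            out.append(replacement)
            i += 3
        else:
            out.append(glyphs[i])
            i += 1
    return out


run = ["l", "periodcentered", "l"]
print(shape(run))           # ['l', 'periodcentered', 'l']
print(shape(run, CATALAN))  # ['lcat']
```

Under this model the editor case stops merging, but Catalan users only get the geminated L when the application actually tags the text as Catalan, which is the accessibility cost mentioned above.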

escobera commented 3 years ago

This issue is happening with VS Code as well.

frankrolf commented 3 years ago

Can you bring that up with VS Code?

escobera commented 3 years ago

Someone already did here: https://github.com/microsoft/vscode/issues/106583

Thanks