ryanoasis / nerd-fonts

Iconic font aggregator, collection, & patcher. 3,600+ icons, 50+ patched fonts: Hack, Source Code Pro, more. Glyph collections: Font Awesome, Material Design Icons, Octicons, & more
https://NerdFonts.com

[Suggestion] Fix invalid code points for some glyphs #365

Closed delphinus closed 1 year ago

delphinus commented 4 years ago

Summary

The current builds overwrite some code points that the Unicode Consortium does not allow for custom glyphs. I want to change this by fixing font_patcher.

Problem Detail

Unicode defines Private Use Areas, and the Consortium itself has decided never to assign characters in these areas. So we can add glyphs there as we like.

The areas cover these code points: U+E000..U+F8FF, U+F0000..U+FFFFD, U+100000..U+10FFFD.
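The three ranges above can be checked mechanically. A minimal Python sketch follows; the names `PRIVATE_USE_AREAS` and `is_private_use` are illustrative and are not part of font_patcher.

```python
# Private Use Areas as listed above (inclusive ranges).
PRIVATE_USE_AREAS = [
    (0xE000, 0xF8FF),      # Private Use Area in the Basic Multilingual Plane
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (Plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (Plane 16)
]


def is_private_use(codepoint: int) -> bool:
    """Return True if the code point lies inside one of the PUA ranges."""
    return any(lo <= codepoint <= hi for lo, hi in PRIVATE_USE_AREAS)


print(is_private_use(0xF500))  # inside the BMP PUA
print(is_private_use(0xF900))  # first code point after the BMP PUA
```

As a cross-check, Python's `unicodedata.category(chr(cp))` returns `"Co"` for private-use code points.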

But Nerd Fonts overwrites more code points than these. font_patcher writes glyphs as shown below.

| Font Name | source | current | plan 1 | plan 2 |
|---|---|---|---|---|
| Seti-UI + Custom | E4FA-E52E | E5FA-E62E | | |
| Devicons | E600-E6C5 | E700-E7C5 | E630-E6F6 | E700-E7C5 (not changed) |
| Powerline Symbols | E0A0-E0B3 | | | |
| Powerline Extra Symbols | E0A3-E0D4 | | | |
| Pomicons | E000-E00A | | | |
| Font Awesome | F000-F2E0 | | | |
| Font Awesome Extension | E000-E0A9 | E200-E2A9 | | |
| Power Symbols | 23FB-2B58 | | | |
| Material | F001-F847 | F500-FD46 | E700-EF47 | F500-F8FF,E800-EC47 |
| Weather Icons | F000-F0EB | E300-E3EB | | |
| Font Logos (Font Linux) | F100-F11C | F300-F31C | | |
| Octicons | F000-F105 | F400-F505 | | |
| Octicons | 2665-2665 | | | |
| Octicons | 26A1-26A1 | | | |
| Octicons | F27C-F27C | F4A9-F4A9 | | |

The problem is that font_patcher writes the Material glyphs into the range U+F500..U+FD46. This range overlaps blocks that must not be used for this purpose.

| area | name |
|---|---|
| U+F900..U+FAFF | CJK Compatibility Ideographs |
| U+FB00..U+FB4F | Alphabetic Presentation Forms |
| U+FB50..U+FDFF | Arabic Presentation Forms-A |
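To make the overlap concrete, the following sketch intersects the current Material range with the three blocks listed above. The helper names `overlap` and `collisions` are made up for this illustration; the ranges are the ones quoted in this issue.

```python
# Ranges taken from the tables above (inclusive).
MATERIAL = (0xF500, 0xFD46)
BLOCKS = {
    "CJK Compatibility Ideographs": (0xF900, 0xFAFF),
    "Alphabetic Presentation Forms": (0xFB00, 0xFB4F),
    "Arabic Presentation Forms-A": (0xFB50, 0xFDFF),
}


def overlap(a, b):
    """Return the inclusive intersection of two ranges, or None if disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def collisions(glyph_range):
    """Map each affected block name to the part of it that glyph_range covers."""
    hits = {}
    for name, block in BLOCKS.items():
        hit = overlap(glyph_range, block)
        if hit is not None:
            hits[name] = hit
    return hits


for name, (lo, hi) in collisions(MATERIAL).items():
    print(f"{name}: U+{lo:04X}..U+{hi:04X} ({hi - lo + 1} code points)")
```

Running the same check on a range that stops at U+F8FF returns no collisions, which is the property both plans below aim for.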

Suggestion

So I suggest two plans to solve this.

plan 1

  1. Move Devicons just after Seti-UI + Custom.
  2. Move Material to U+E700..U+EF47.

plan 2

  1. Split Material and move it into U+F500..U+F8FF and U+E800..U+EC47.

I prefer plan 2 because it has less impact on the current builds. What do you think?

ryanoasis commented 4 years ago

Thanks a lot @delphinus . I appreciate the thought you put into this.

To be completely honest I wasn't careful enough to be sure the ranges remained within the Private Use Areas. We should definitely make sure to stay within the PUAs going forward :blush:

I think I like plan 2 or some variation of it as well. I would prefer plan 1 for something that wouldn't be a major release (e.g. 2.x.x).

We need to decide if we are following semver strictly or not, if strictly then this would technically be a breaking change either way and would require us to version as 3.x but at the same time it would "feel" wrong to bump to version 3 if this was the only major change.

Good plans and suggestions here and I am mostly in agreement with you. I am wondering about versioning and what exactly the ranges should/could be :smiley:

delphinus commented 4 years ago

Thanks for agreeing. The change will indeed be a breaking one. But for a v3.0.0 release, the “CHANGES” list might be much shorter than the one for v2.0.0. 🤔

ryanoasis commented 4 years ago

Yeah, your recommendation makes complete sense and obviously some changes need to happen.


But for a v3.0.0 release, the “CHANGES” list might be much shorter than the one for v2.0.0. 🤔

Sorry, I didn't quite follow, can you elaborate?

delphinus commented 4 years ago

I see there are a lot of changes in v1.1.0 → v2.0.0. But if you use v3.0.0 this time, v2.0.0 → v3.0.0 will have 1 diff (this issue) only. I was just curious. ;)

ryanoasis commented 4 years ago

I get you now. I am thinking 3.0 would have this change and many others: an update to Material Design, plus other icon additions and programming fonts. I would like to do a 2.1 release soon; the one after that might be the big 3.0 major release.

aaronbell commented 4 years ago

Can you provide an update regarding this issue? That the Material Design set overrides non-PUA codepoints is a significant issue.

ryanoasis commented 4 years ago

@aaronbell Unfortunately the update I am going to give is not what you want to hear: There really are no updates on any forward momentum on moving the Material Design codepoints. However, I am back at trying to get a release out, after that this is likely one of the top priorities. This will be a breaking change as far as Nerd Fonts goes so it would be a 3.0 release. I agree it is a significant issue.

Hope that helps.

aaronbell commented 4 years ago

@ryanoasis Thanks Ryan. Too bad. Unfortunately, I will not be adding full Nerd Fonts support to Cascadia Code until v3.0 is released as I am unwilling to include Material Design icons in their current location. International interoperability takes precedence.

ryanoasis commented 4 years ago

@aaronbell I completely understand that position. Absolutely, international support takes precedence, as it should. Letting code points seep outside of the PUA was a big mistake on my part.

While there has been no momentum of the fix in terms of code I did start to group tasks under a new 3.0 milestone and there definitely is a pressure to make it right.

Thanks for your valuable input and straightforwardness.

trallnag commented 2 years ago

How difficult is it to move glyphs to another location? I guess the font patcher needs to be adjusted?

Finii commented 2 years ago

https://github.com/ryanoasis/nerd-fonts/issues/365#issuecomment-519779578 (@ryanoasis)

I think I like plan 2 or some variation of it as well. I would prefer plan 1 for something that wouldn't be a major release (e.g. 2.x.x).

I believe changing the range two times is not something anybody wants. (If I read that correctly.)

Maybe how this is rolled out needs to be formally decided. The changes themselves are trivial.

I think there are additional possible plans:

Plan 1

Devicons move to E630 - E6F6
Material move to E700 - EF47
Codicons needs to vacate EA60 - EBEB

Pro: Material in one block
Con: Material displaces Codicons (will they be useful with VS?)
Con: No space for future expansion of Seti + Custom
Con: No smooth transition a la Plan Plus possible

Plan 2

Material split and move to F500 - F8FF and E800 - EC47
Codicons needs to vacate EA60 - EBEB

Pro: A lot Material codepoints unchanged
Con: Material displaces Codicons (will they be useful with VS?)
Con: Material is split (is that really an issue?)

Plan 3

Material split and move to F500 - F8FF and E900 - ED47
Codicons needs to vacate EA60 - EBEB

Pro: A lot Material codepoints unchanged, only one digit changes in moved codepoints
Con: Material displaces Codicons (will they be useful with VS?)
Con: Material is split (is that really an issue?)

Plan 4

Material move to FF500 - FFD46

Pro: Material in one block, only one digit changes in moved codepoints
Con: Use of codepoints above FFFF (is that really an issue?)

Plan 5

Material move to F0001 - ...

Pro: Material in one block, on new original codepoints
Pro: Lots of space for Material expansion
Con: Use of codepoints above FFFF (is that really an issue?)
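The two hard constraints on Plans 1 to 4 (stay inside a Private Use Area, and the clash with Codicons at EA60 - EBEB) can be double-checked with a short script. This is only a sketch: the function names are invented here, and Plan 5 is left out because its end code point is not given above.

```python
# Private Use Areas and the Codicons block, inclusive ranges from this thread.
PUA = [(0xE000, 0xF8FF), (0xF0000, 0xFFFFD), (0x100000, 0x10FFFD)]
CODICONS = (0xEA60, 0xEBEB)

# Material destinations per plan, as listed above.
PLANS = {
    "Plan 1": [(0xE700, 0xEF47)],
    "Plan 2": [(0xF500, 0xF8FF), (0xE800, 0xEC47)],
    "Plan 3": [(0xF500, 0xF8FF), (0xE900, 0xED47)],
    "Plan 4": [(0xFF500, 0xFFD46)],
}


def inside_pua(r):
    """True if the whole range fits into a single Private Use Area."""
    return any(lo <= r[0] and r[1] <= hi for lo, hi in PUA)


def intersects(a, b):
    """True if two inclusive ranges share at least one code point."""
    return max(a[0], b[0]) <= min(a[1], b[1])


def check(plans):
    """Summarize, per plan, PUA compliance and the Codicons conflict."""
    return {
        name: {
            "in_pua": all(inside_pua(r) for r in ranges),
            "hits_codicons": any(intersects(r, CODICONS) for r in ranges),
        }
        for name, ranges in plans.items()
    }


for name, result in check(PLANS).items():
    print(name, result)
```

Under these inputs all four plans stay inside a PUA, and only Plan 4 avoids the Codicons block.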

Additional Plan Plus

No matter which plan is decided on, I believe we should act on it now, and not wait until the next release. Specifically, it would be beneficial if the NEW destination codepoints were additionally filled with the glyphs right now, so people have a chance to adjust their setups, rather than only in release 2.2.0 or, even worse, 3.0.0.

In a second step, after a (major, see semver) release, the obsolete codepoints can be dropped.

That would result in a smoother transition path. It also means that (at least part of) Material exists two times in the patched fonts.
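The two-step rollout can be pictured on a toy code point table. The sketch below is purely illustrative: the glyph names, the example code points, and the functions `transition_release` and `major_release` are hypothetical and do not reflect how font_patcher is implemented.

```python
def transition_release(cmap, moves):
    """Step 1: add every moved glyph at its new code point and keep the old one."""
    out = dict(cmap)
    for old, new in moves.items():
        if new in out and out[new] != cmap[old]:
            raise ValueError(f"U+{new:04X} is already occupied")
        out[new] = cmap[old]
    return out


def major_release(cmap, moves):
    """Step 2: drop the obsolete code points after the major release."""
    return {cp: glyph for cp, glyph in cmap.items() if cp not in moves}


# Hypothetical data: two glyphs sitting outside the PUA and a placeholder target.
cmap = {0xF900: "material-example-a", 0xF901: "material-example-b"}
moves = {0xF900: 0xF0001, 0xF901: 0xF0002}

step1 = transition_release(cmap, moves)  # glyphs exist twice
step2 = major_release(step1, moves)      # only the new code points remain

print(sorted(f"U+{cp:04X}" for cp in step1))
print(sorted(f"U+{cp:04X}" for cp in step2))
```

During step 1 every moved glyph is reachable under both code points, which is the duplication mentioned above.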

Edit: Mention overlooked Codicons and add Plan 5

Finii commented 2 years ago

@wismill https://github.com/ryanoasis/nerd-fonts/pull/609#issuecomment-978943902 also mentions Plan Plus.

earboxer commented 2 years ago

@Finii

It also means that (at least part of) Material exists two times in the patched fonts.

If we're okay with Material existing twice over, we could take the newest version of Material Design Icons and put the glyphs at the codepoints they currently use upstream (0F0001 - 0F19C3). This has the benefit of being easy to update (they're adding icons very frequently), of being a continuous block, and of course of having all our largest icon packs in their canonical locations (I haven't done a history check, but I think @Templarian is not likely to do the code point shifting thing again).

Finii commented 2 years ago

I just noticed, thanks to @delphinus , that the original table in the top is outdated. Thanks for the updated table in https://github.com/delphinus/homebrew-sfmono-square/issues/67

After we included Codicons (#705) (0xEA60 - 0xEBEB) Plan 1 & 2 & 3 became impossible (or at least... we would need to move Codicons first :unamused:)


I will update my Plan List above accordingly.

With this maybe @earboxer's idea https://github.com/ryanoasis/nerd-fonts/pull/772#issuecomment-1023173171 for 0xF0001 gets rather interesting. I will list this as Plan 5 above.

Edit: Before only Plan 2 and 3 were mentioned, but 1, 2, and 3 are affected!

Arnie97 commented 2 years ago

Plan 4 Pro: Material in one block, only one digit changes in moved codepoints

Apple SF Symbols already occupies the first 3,300+ codepoints in Plane 16 (Supplementary Private Use Area-B, U+100000-U+10FFFF). So if Plan 4 were chosen, we would soon need another block once the Material icons exhaust Plane 15 (which ends at U+FFFFF). Plan 5 looks more future-proof.
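The headroom argument can be put into numbers. The sketch below counts how many code points remain in Supplementary Private Use Area-A from each plan's start and compares that with the upstream Material range quoted earlier in this thread (0F0001 - 0F19C3). The function name `room` is made up for the illustration.

```python
PUA_A_LAST = 0xFFFFD  # last code point of Supplementary Private Use Area-A


def room(start: int) -> int:
    """Number of code points from start to the end of Supplementary PUA-A."""
    return PUA_A_LAST - start + 1


# Upstream Material range mentioned earlier in the thread (inclusive).
upstream_size = 0xF19C3 - 0xF0001 + 1

print("Plan 4 room:", room(0xFF500))       # 2814
print("Plan 5 room:", room(0xF0001))       # 65533
print("Upstream Material:", upstream_size)  # 6595
```

With these figures the upstream set would not fit behind FF500 at all, while a block starting at F0001 leaves ample space.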

aaronbell commented 1 year ago

So... it has been two years since my last inquiry on this. Is there a finalized decision on a location for the Material Design icons that doesn't override other Unicode slots?

Finii commented 1 year ago

@aaronbell Unfortunately .. no. Ryan's comments above were the last time he has been seen here, regrettably. Recently I started to push on with releases: the 2.2.0 that Ryan initiated and, at the moment, 2.3.0, which is intended to update a lot of the source symbols and maybe fonts. Ryan envisioned the codepoint change for 3.0.0, breaking as it is.

TL;DR:

I believe most arguments point to Material at its original location, which is F0001 - F1AF0 (currently).

-=> Plan 5 a.k.a #773

Secretly I wonder ... is it really worth to add another 7,000 glyphs? Where will it end? ;-)

If you think this is the way to go, I will implement it with those codepoints.

Well, excuse me,

Unrelated, maybe I can ask you @aaronbell for some information?

People over at Fontforge (once) believed that Windows has an additional length limit on the font FamilyName (writeup by me here). But I can not find that anywhere on Microsoft's Typography websites, and additionally I use fonts with longer names on Windows (10) with no problems. Is that really an issue, or was that like a Windows 3.1 problem? Or Windows 7, or MS-Word 5 with a too-narrow font-name pulldown?


Below the line (here) is just data I collected to come to the conclusion:

But, as @earboxer suggested, for a smoother codepoint transition we (Nerd Fonts) could introduce the 'new codepoints' in addition to the current ones, so that 2.3.0 contains both old and new codepoints. For that reason, now is exactly the right moment to decide on that. I would expect 2.3.0 in October.

Please let me (again) tabularize data:

Glyph set original location, now and after updating (sorted by current destination)

| Glyph set | current start | current length | update start | update length | update codepoint stable? | comment | current destination | source |
|---|---|---|---|---|---|---|---|---|
| Pomicons | E000 | 10 | - | | | no update, [4] | E000 | https://github.com/gabrielelana/pomicons |
| Font Awesome Ext | E000 | 170 | - | | | no update, [3] | E200 | https://github.com/AndreLZGava/font-awesome-extension |
| Weather | F000 | 236 | F000 | 222 | probably | | E300 | https://github.com/erikflowers/weather-icons |
| Seti + our | E4FA | 59 | E4FA | ~175 | yes | [0] | E5FA | https://github.com/jesseweed/seti-ui |
| Devicons | E600 | 198 | E600 | ~500 | no | [1] | E700 | https://github.com/devicons/devicon |
| Codicons | EA60 | 396 | EA60 | 430 | yes | | EA60 | https://github.com/microsoft/vscode-codicons |
| Font Awesome | F000 | 737 | ? | 213 + 515 ? | no | [2] | F000 | https://github.com/FortAwesome/Font-Awesome |
| Font Logos | F300 | 48 | - | - | yes | already updated | F300 | https://github.com/Lukas-W/font-logos |
| Octicons | F000 | 262 | ? | 515 | no | | F400 | https://github.com/primer/octicons |
| Material | F001 | 2119 | F0001 | 6896 | unknown | | F500 | https://github.com/Templarian/MaterialDesign-Font |

[0] Codepoints allocated by us
[1] Update scattered and lots of icons unusable for fonts
[2] Current release scattered and split into multiple font files
[3] Maybe obsolete, check extension glyphs for duplicates
[4] Maybe obsolete? Codepoints clash sometimes with original font's ligatures etc
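The table can also be used to look for destination blocks that would run into each other after an update. The sketch below rests on assumptions that are not stated in the table: each set stays at its current destination, occupies one contiguous block, and the approximate lengths ("~175", "~500") are taken at face value. Where the update length is missing or uncertain, the current length is used.

```python
from itertools import combinations

# (glyph set, current destination, glyph count)
SETS = [
    ("Pomicons", 0xE000, 10),
    ("Font Awesome Ext", 0xE200, 170),
    ("Weather", 0xE300, 222),
    ("Seti + our", 0xE5FA, 175),   # "~175" in the table
    ("Devicons", 0xE700, 500),     # "~500" in the table
    ("Codicons", 0xEA60, 430),
    ("Font Awesome", 0xF000, 737), # update length uncertain, current length used
    ("Font Logos", 0xF300, 48),
    ("Octicons", 0xF400, 515),
    ("Material", 0xF500, 2119),    # current block at F500
]


def blocks(sets):
    """Turn (name, start, count) into (name, first, last) inclusive blocks."""
    return [(name, start, start + count - 1) for name, start, count in sets]


def overlaps(sets):
    """Return every pair of glyph sets whose blocks share a code point."""
    found = []
    for (n1, s1, e1), (n2, s2, e2) in combinations(blocks(sets), 2):
        if max(s1, s2) <= min(e1, e2):
            found.append((n1, n2))
    return found


for name, first, last in blocks(SETS):
    print(f"{name:18} U+{first:04X}..U+{last:04X}")
print("overlapping:", overlaps(SETS))
```

Under these assumptions, the only pair flagged is the updated Octicons block and the Material block at F500.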

Edit: Add link to PR

aaronbell commented 1 year ago

@Finii Ah well, that explains that.

Secretly I wonder ... is it really worth to add another 7,000 glyphs? Where will it end? ;-)

Might want to ask Unicode about their decision to include Emoji :). I fear there will always be new icons to add or symbols that people want. All you can do is go along with it, or say, "NO MORE!"

People over at Fontforge (once) believed that Windows has an additional length limit on the font FamilyName.

I don't think there's any documentation on it on the Microsoft typography website (it isn't really spec related). There's some info / investigation here worth reading: https://github.com/googlefonts/fontbakery/issues/2179 tldr: It is primarily a legacy issue in certain applications and situations, but appears to crop up in unexpected places, so the recommendation is to keep less than 29 characters long to avoid any problems.

2.3.0

Wow! It looks like there are a lot of new glyphs being added across the full Nerd Fonts set. I expect that if these glyph sets continue to expand, it'll make things challenging to keep them separate. Not to mention needing to organize and create a 'master' version of the codepoints. Phew!

If I understand the chart that you've included, the current plan (plan 5) is to essentially move Material design to F0001 where it can expand freely as necessary. It sounds like you're also planning to preserve the existing location for the time being until the breaking 3.0.0 change.

Anyway, that plan works for me. I've been circling back to investigating native support for Nerd Fonts again (finally), and having a Unicode-compliant solution is great.

I look forward to the finalized locations for everything! Let me know if I can be of assistance.

Finii commented 1 year ago

@aaronbell

Thank you for the information on name length. This is very much appreciated.

2.3.0

Plan 5 is essentially adding the current Material Design Icons at their native codepoints (i.e. F0001 - F1AF0). The current / old Material Design Icons (renamed to 'legacy') are kept in the problematic regions and will be removed with 3.0.0. This should make the transition of codepoints easier for the users. I worked on the relevant PR #773 today; there is a small scale-translate problem I'd like to solve before merging. (I.e., you understood that perfectly right.)

a lot of new glyphs

Material is exploding, but now has room to 'do its thing'. I'm not really sure it makes sense to add it at all. Today's terminal emulators often do a good job with glyph rescaling, and putting the Material Design Desktop font somewhere and relying on font fallback should be a good solution for most people.

From the other sets it is only Octicons and Devicons that really grow. Both without stable codepoints over updates :unamused: But I seem to have started a codepoint discussion with Devicons; their web-user centric view has to be expanded ;)

Finii commented 1 year ago

@aaronbell

tldr: It is primarily a legacy issue in certain applications and situations, but appears to crop up in unexpected places, so the recommendation is to keep less than 29 characters long to avoid any problems.

I'm so relieved that Cascadia Code also violates that, and not only us ;-} For example CascadiaCodePL-ExtraLightItalic.ttf (2111.01) has 34 chars in ID.4.

And while we have code to limit the length of ID.1 and ID.2 (albeit half broken), we do not use the same abbreviations in ID.4, and almost all fonts (also the 'Windows Compatible' ones) have very long full names. No one ever complained, so maybe we can ignore the MS-Word-2011s and IE9s out there. :grimacing:

(But sorry this should not be discussed in this issue.)

Finii commented 1 year ago

Related #813

ppwwyyxx commented 1 year ago

@glepnir this is the nerd-font bug that causes https://github.com/kovidgoyal/kitty/issues/5415

Finii commented 1 year ago

@glepnir this is the nerd-font bug that causes kovidgoyal/kitty#5415

@ppwwyyxx Yes. We try to correct that long standing issue (place symbols where Chinese and other glyphs should be) soon (everything has been prepared for the fix already). Sorry that it arose at all.

But maybe a question about kitty. From what @kovidgoyal writes it seems kitty has a workaround that replaces the (erroneous) symbols with the correct Chinese glyphs?

If so, where do the glyphs come from? How does kitty decide if that is a legitimate Chinese font with Nerd Font symbols patched in (a future version that leaves the Chinese glyphs intact) and uses the font-encoded glyphs, or use some other fallback font/glyph? Or is this a setting? Often kitty users also raise Issues here, so I would like to understand that part of kitty a bit better.

Thank you :-)

kovidgoyal commented 1 year ago

In kitty glyphs come from fonts, which font is chosen depends on the system font libraries (fontconfig/CoreText). First the main font specified for kitty is tried, if that does not have the glyph, then the system is queried for a fallback. This is the same as in most applications.

However, kitty has a feature called symbol_map which allows users to instruct it to load glyphs for the specified code points from a particular font. Many kitty users use this feature to work with Nerd Font symbols.
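For readers unfamiliar with the option: a symbol_map entry in kitty.conf takes a comma-separated list of code points or ranges followed by a font name. The helper below, with the made-up name `symbol_map_line`, formats such an entry. The font family "Symbols Nerd Font" and the chosen ranges are assumptions for the example, not a recommendation from this thread.

```python
def symbol_map_line(ranges, font_family):
    """Format one kitty.conf symbol_map line for inclusive code point ranges."""
    spec = ",".join(
        f"U+{lo:04X}" if lo == hi else f"U+{lo:04X}-U+{hi:04X}"
        for lo, hi in ranges
    )
    return f"symbol_map {spec} {font_family}"


# Example: route the Codicons block and one single code point to a symbols font.
print(symbol_map_line([(0xEA60, 0xEBEB), (0xF4A9, 0xF4A9)], "Symbols Nerd Font"))
```

With such a mapping the listed code points are rendered from the named font regardless of what the main font contains at those positions.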

Finii commented 1 year ago

@kovidgoyal Wow, thank you for the instantaneous answer!

Again let me apologize that we introduced this issue for a lot of users at all. I hoped to get the corrected fonts out by the end of this year (i.e. v3.0.0) but at the moment I'm a bit backlogged.

kovidgoyal commented 1 year ago

On Thu, Nov 17, 2022 at 12:21:28AM -0800, Fini wrote:

@kovidgoyal Wow, thank you for the instantaneous answer!

Again let me apologize that we introduced this issue for a lot of users at all. I hoped to get the corrected fonts out by the end of this year (i.e. v3.0.0) but at the moment I'm a bit backlogged.

No worries, we are all busy :)

delphinus commented 1 year ago

@Finii #773 only adds Material Icons to the new places, but it still has the original glyphs on invalid code points (out of PUA), it seems.

So #773 has not solved this issue. It should only be considered complete once all glyphs at the original locations have been removed, don't you think?

Finii commented 1 year ago

Yes, according to plan this will come (glyphs removed) with v3.0.0, this was just one necessary intermediate step.

JanDeDobbeleer commented 1 year ago

Now that this is moving outside the Basic Multilingual Plane, it would be useful to also have the UTF-16 notation for these icons in the cheat sheet next to the hex value, as the hex value can't be used directly in JSON or other text files. For example, the hex value of nf-md-folder is f024b, which, unlike before, can't be written as \uf024b. From a user perspective that's not very accessible, so having the UTF-16 notation (\udb80\ude4b) in the sheet would be very useful.
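The UTF-16 notation follows from the standard surrogate pair calculation. A small sketch, with the helper name `utf16_escape` chosen for this example, reproduces the value quoted above and cross-checks it against Python's own JSON encoder:

```python
import json


def utf16_escape(codepoint: int) -> str:
    """Return the JSON-style \\uXXXX escape(s) for a code point."""
    if codepoint < 0x10000:
        # Inside the Basic Multilingual Plane a single escape is enough.
        return f"\\u{codepoint:04x}"
    offset = codepoint - 0x10000
    high = 0xD800 + (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
    return f"\\u{high:04x}\\u{low:04x}"


print(utf16_escape(0xF024B))     # \udb80\ude4b
print(json.dumps(chr(0xF024B)))  # "\udb80\ude4b"
```

Code points inside the Basic Multilingual Plane, such as the old F500 block, still need only a single `\uXXXX` escape.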

Finii commented 1 year ago

Release is in repo, release as packages pending.

github-actions[bot] commented 8 months ago

This issue has been automatically locked since there has not been any recent activity (i.e. last half year) after it was closed. It helps our maintainers focus on the active issues. If you have found a problem that seems similar, please open a new issue, complete the issue template with all the details necessary to reproduce, and mention this issue as reference.