Feature Request: Specifying 8-bit Character Encoding (ASCII or ISO 8859-x) used when generating glyphs

This is a feature request that is indispensable when working with international (Latin) languages.

The current implementation of FreeFontConverter converts the characters at Unicode code points 0x20 (Space) to 0xFF into bitmaps in a header file.

ASCII

Many English-language applications do not need glyphs outside the ASCII range (0x20 ~ 0x7F), so I propose adding a runtime argument specifying ASCII (replacing CHARMAP_LAST_CHAR by 0x7F). This cuts the number of generated glyphs from 224 to 96, roughly halving the memory footprint.

Example:

$ freeFontConverter --font=OpenSansRegular.ttf --encoding=ASCII
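To illustrate the effect on the glyph range, here is a minimal Python sketch of how an `--encoding` value could select the last character code. The names `char_range`, `glyph_count` and `LAST_CHAR_BY_ENCODING` are hypothetical and not part of FreeFontConverter; only the 0x20 start and the 0x7F / 0xFF end points come from the text above.

```python
FIRST_CHAR = 0x20  # Space, the first glyph the converter emits

# Hypothetical table: which --encoding values shorten the range.
LAST_CHAR_BY_ENCODING = {
    "ASCII": 0x7F,
}
DEFAULT_LAST_CHAR = 0xFF  # full 8-bit range for every other encoding


def char_range(encoding):
    """Return (first, last) character codes for the given encoding name."""
    last = LAST_CHAR_BY_ENCODING.get(encoding.upper(), DEFAULT_LAST_CHAR)
    return FIRST_CHAR, last


def glyph_count(encoding):
    """Number of glyphs that would be rendered for the given encoding."""
    first, last = char_range(encoding)
    return last - first + 1


if __name__ == "__main__":
    for name in ("ASCII", "8859-15"):
        first, last = char_range(name)
        print(f"{name}: 0x{first:02X}..0x{last:02X}, {glyph_count(name)} glyphs")
```

The actual saving in bytes depends on the bitmap size of each glyph, so the glyph count is only an approximation of the footprint.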

Character Encoding

When working with 8-bit character sets, a range of encodings exists that convert an 8-bit numeric character value into a (Unicode) glyph, supporting the requirements of a wide range of languages. The lower half (0x00 ~ 0x7F) is mapped to the standard ASCII character set, while the upper half (0x80 ~ 0xFF) varies according to the encoding.

Without support for such character encodings, the upper half of the character set (using Unicode code points 0x80 ~ 0xFF) will be ISO/IEC 8859-1 (equality mapping), which excludes support for a lot of languages/translations (see below).

I therefore propose adding a runtime argument for specifying an 8-bit character encoding, and mapping each 8-bit character code to a (16-bit) Unicode code point before generating the bitmapped font.

Example:

$ freeFontConverter --font=OpenSansRegular.ttf --encoding=8859-15
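The mapping step can be sketched in Python, whose standard codecs already cover the ISO/IEC 8859 family. This is only an illustration of the idea, not the converter's implementation: `codec_for` and `build_codepoint_table` are hypothetical names, and the `8859-n` spelling of the argument is taken from the example above.

```python
from typing import Dict, Optional


def codec_for(encoding: str) -> str:
    """Translate an '8859-n' argument into a Python codec name (assumed spelling)."""
    if encoding.startswith("8859-"):
        return "iso8859_" + encoding[len("8859-"):]
    return encoding.lower()


def build_codepoint_table(codec: str, first: int = 0x20,
                          last: int = 0xFF) -> Dict[int, Optional[int]]:
    """Map each 8-bit character code to the Unicode code point of its glyph.

    Codes that the encoding leaves undefined map to None.
    """
    table: Dict[int, Optional[int]] = {}
    for code in range(first, last + 1):
        try:
            table[code] = ord(bytes([code]).decode(codec))
        except UnicodeDecodeError:
            table[code] = None
    return table


if __name__ == "__main__":
    table = build_codepoint_table(codec_for("8859-15"))
    for code in (0x41, 0xA4, 0xBC):
        print(f"0x{code:02X} -> U+{table[code]:04X}")
```

The font renderer would then look up the glyph for the mapped code point instead of the raw character code, while the generated header keeps indexing by the 8-bit code.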

The most commonly used (universal) character encoding for Latin languages is ISO/IEC 8859-15 (superseding ISO 8859-1), which should be used as the default value. A pseudo-ASCII mapping could be generated by using only character codes 0x20 ~ 0x7F of the ISO/IEC 8859-15 mappings.

I propose adding support for ISO/IEC 8859-15, and possibly the remainder of the ISO/IEC 8859 encodings.
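To see how little separates the current behaviour (8859-1, equality mapping) from the proposed default, the following Python sketch lists the character codes whose glyphs differ between the two encodings. It relies on Python's built-in `iso8859_1` and `iso8859_15` codecs; the function name `differing_codes` is made up for this example.

```python
def differing_codes(codec_a, codec_b):
    """Return (code, char_a, char_b) for every upper-half code that differs."""
    diffs = []
    for code in range(0x80, 0x100):
        a = bytes([code]).decode(codec_a)
        b = bytes([code]).decode(codec_b)
        if a != b:
            diffs.append((code, a, b))
    return diffs


if __name__ == "__main__":
    for code, a, b in differing_codes("iso8859_1", "iso8859_15"):
        print(f"0x{code:02X}: U+{ord(a):04X} -> U+{ord(b):04X} ({b})")
```

Only eight positions change, so switching the default to 8859-15 would leave every other glyph in the generated header identical.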

Commonly Used 8-bit Character Encodings

ISO/IEC 8859 8-bit character encodings
- Mapping files specifying the 8-bit to Unicode Glyph mappings for 8859-n encodings
- 8859-1 Latin-1, Western European
  Covering most Western European languages
  - Danish (partial)
  - Dutch (partial)
  - English
  - Faeroese
  - Finnish (partial)
  - French (partial)
  - German
  - Icelandic
  - Irish
  - Italian
  - Norwegian
  - Portuguese
  - Rhaeto-Romanic
  - Scottish Gaelic
  - Spanish
  - Catalan
  - Swedish
- 8859-2 Latin-2, Central European
  Supports those Central and Eastern European languages that use the Latin alphabet
  - Bosnian
  - Polish
  - Croatian
  - Czech
  - Slovak
  - Slovene
  - Serbian
  - Hungarian
- 8859-3 Latin-3, South European
  - Turkish
  - Maltese
  - Esperanto
- 8859-4 Latin-4, North European
  - Estonian
  - Latvian
  - Lithuanian
  - Greenlandic
  - Sami
- 8859-5 Latin/Cyrillic
  Covers mostly Slavic languages that use a Cyrillic alphabet
  - Belarusian
  - Bulgarian
  - Macedonian
  - Russian
  - Serbian
  - Ukrainian
- 8859-6 Latin/Arabic
  - Arabic
- 8859-7 Latin/Greek
  - Greek
- 8859-8 Latin/Hebrew
  - Hebrew (as used in Israel)
- 8859-9 Latin-5, Turkish
  Largely the same as ISO/IEC 8859-1, replacing the rarely used Icelandic letters with Turkish ones
  - Turkish
- 8859-10 Latin-6, Nordic
  A rearrangement of Latin-4, considered more useful for Nordic languages
  - Danish
  - Finnish
  - Norwegian
  - Swedish
- 8859-11 Latin/Thai
  - Thai
- ~~8859-12 Latin/Devanagari~~ The work on making a part of 8859 for Devanagari was officially abandoned in 1997
- 8859-13 Latin-7, Baltic Rim
  Added some characters for Baltic languages which were missing from Latin-4 and Latin-6
  - Estonian
  - Latvian
  - Lithuanian
- 8859-14 Latin-8, Celtic
  - Celtic
  - Gaelic
  - Breton
- 8859-15 Latin-9
  A revision of 8859-1 that removes some little-used symbols, replacing them with the euro sign € and the letters Š, š, Ž, ž, Œ, œ, and Ÿ
  - Danish
  - Dutch
  - English
  - Estonian
  - Faeroese
  - Finnish
  - French
  - German
  - Icelandic
  - Irish
  - Italian
  - Norwegian
  - Portuguese
  - Rhaeto-Romanic
  - Scottish Gaelic
  - Spanish
  - Catalan
  - Swedish
- 8859-16 Latin-10, South-Eastern European
  - Albanian
  - Croatian
  - Hungarian
  - Italian
  - Polish
  - Romanian
  - Slovene

Microsoft Code Pages (CPnnn, Windows/DOS)
Mapping files for all encodings
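
If the converter were to read such mapping files rather than hard-code tables, a parser could look like the Python sketch below. It assumes the layout commonly used by published mapping tables: one entry per line with the 8-bit code and the Unicode code point as hexadecimal fields separated by whitespace, and `#` starting a comment. Both that layout and the name `parse_mapping` are assumptions for illustration; the embedded sample is made up and not copied from a real file.

```python
def parse_mapping(text):
    """Parse mapping-file text into {8-bit code: Unicode code point}."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # code listed without a Unicode assignment
        mapping[int(fields[0], 16)] = int(fields[1], 16)
    return mapping


# Made-up sample in the assumed layout.
SAMPLE = """\
# 8-bit code, Unicode code point, character name
0x41\t0x0041\t# LATIN CAPITAL LETTER A
0xA4\t0x20AC\t# EURO SIGN
"""

if __name__ == "__main__":
    for code, codepoint in sorted(parse_mapping(SAMPLE).items()):
        print(f"0x{code:02X} -> U+{codepoint:04X}")
```

Reading the tables from files would let the same code path serve both the ISO/IEC 8859 encodings and the Microsoft code pages.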

rochaferraz / FreeFontConverter