Open belegdol opened 2 years ago
Doing so would require quite a bit of work. The lack of a real font server is an issue on Linux. It pushes stuff like font substitution for missing characters into applications, and drives up memory consumption and startup time because every application needs logic for choosing appropriate fonts for fallback.
That aside, the whole OSD font module interface needs reworking to render text runs. It’s currently impossible to render languages like Thai or Khmer (Cambodian) in MAME at all.
Thanks for explaining. I don't suppose http://sdlpango.sourceforge.net/ is of any help? It seems there have been some attempts to port it to SDL2: https://sourceforge.net/p/sdlpango/bugs/7/.
While I do not know what is preventing Thai or Khmer text from being rendered, it appears that there has been some work in SDL2 regarding rendering ligatures and substitutions: https://github.com/libsdl-org/SDL_ttf/issues/62
Given that a comprehensive solution is not on the horizon, would switching to DejaVu Sans be a possibility? I tried a bunch of fonts installed on my system, and apart from Adobe Source Code Pro it was the only one that had all the symbols needed for vgmplay. Source Code Pro is shipped as OpenType on Fedora, though, so it does not work as a simple substitution.
How widespread is DejaVu Sans in Linux distributions, and what’s its coverage like for Latin script languages? Various things in MAME no longer use plain ASCII for the main description.
FWIW, the best way to address issues with vgmplay specifically would probably be to embed SVG in the layout, now that we support that.
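Re: coverage, see below. DejaVu Sans covers more overall (5822 code points versus 2327 for Liberation Sans), although Liberation Sans does better in a few blocks such as Hebrew and Phonetic Extensions. I used https://github.com/abelcheung/font-coverage to generate the data.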
=== Liberation Sans
Basic Latin (U+0020-U+007F) => 95 / 95 / 0
Latin-1 Supplement (U+00A0-U+00FF) => 96 / 96 / 0
Latin Extended-A (U+0100-U+017F) => 128 / 128 / 0
Latin Extended-B (U+0180-U+024F) => 208 / 208 / 0
IPA Extensions (U+0250-U+02AF) => 96 / 96 / 0
Spacing Modifier Letters (U+02B0-U+02FF) => 80 / 80 / 0
Combining Diacritical Marks (U+0300-U+036F) => 112 / 112 / 0
Greek and Coptic (U+0370-U+03FF) => 135 / 127 / 0
Cyrillic (U+0400-U+04FF) => 256 / 256 / 0
Cyrillic Supplement (U+0500-U+052F) => 48 / 24 / 0
Hebrew (U+0590-U+05FF) => 88 / 87 / 0
Phonetic Extensions (U+1D00-U+1D7F) => 128 / 128 / 0
Phonetic Extensions Supplement (U+1D80-U+1DBF) => 64 / 64 / 0
Combining Diacritical Marks Supplement (U+1DC0-U+1DFF) => 64 / 13 / 0
Latin Extended Additional (U+1E00-U+1EFF) => 256 / 247 / 0
Greek Extended (U+1F00-U+1FFF) => 233 / 233 / 0
General Punctuation (U+2000-U+206F) => 111 / 57 / 0
Superscripts and Subscripts (U+2070-U+209F) => 42 / 22 / 0
Currency Symbols (U+20A0-U+20CF) => 33 / 23 / 0
Combining Diacritical Marks for Symbols (U+20D0-U+20FF) => 33 / 1 / 0
Letterlike Symbols (U+2100-U+214F) => 80 / 9 / 0
Number Forms (U+2150-U+218F) => 60 / 7 / 0
Arrows (U+2190-U+21FF) => 112 / 8 / 0
Mathematical Operators (U+2200-U+22FF) => 256 / 18 / 0
Miscellaneous Technical (U+2300-U+23FF) => 256 / 4 / 0
Box Drawing (U+2500-U+257F) => 128 / 40 / 0
Block Elements (U+2580-U+259F) => 32 / 8 / 0
Geometric Shapes (U+25A0-U+25FF) => 96 / 24 / 0
Miscellaneous Symbols (U+2600-U+26FF) => 256 / 21 / 0
Latin Extended-C (U+2C60-U+2C7F) => 32 / 21 / 0
Supplemental Punctuation (U+2E00-U+2E7F) => 94 / 1 / 0
Modifier Tone Letters (U+A700-U+A71F) => 32 / 9 / 0
Latin Extended-D (U+A720-U+A7FF) => 193 / 7 / 0
Alphabetic Presentation Forms (U+FB00-U+FB4F) => 58 / 48 / 0
Combining Half Marks (U+FE20-U+FE2F) => 16 / 4 / 0
Specials (U+FFF0-U+FFFF) => 5 / 1 / 0
Unicode coverage = 2327 / 149476 = 1.56
=== DejaVu Sans
Basic Latin (U+0020-U+007F) => 95 / 95 / 0
Latin-1 Supplement (U+00A0-U+00FF) => 96 / 96 / 0
Latin Extended-A (U+0100-U+017F) => 128 / 128 / 0
Latin Extended-B (U+0180-U+024F) => 208 / 208 / 0
IPA Extensions (U+0250-U+02AF) => 96 / 96 / 0
Spacing Modifier Letters (U+02B0-U+02FF) => 80 / 63 / 0
Combining Diacritical Marks (U+0300-U+036F) => 112 / 93 / 0
Greek and Coptic (U+0370-U+03FF) => 135 / 135 / 0
Cyrillic (U+0400-U+04FF) => 256 / 256 / 0
Cyrillic Supplement (U+0500-U+052F) => 48 / 38 / 0
Armenian (U+0530-U+058F) => 91 / 86 / 0
Hebrew (U+0590-U+05FF) => 88 / 54 / 0
Arabic (U+0600-U+06FF) => 256 / 165 / 0
NKo (U+07C0-U+07FF) => 62 / 54 / 0
Thai (U+0E00-U+0E7F) => 87 / 1 / 0
Lao (U+0E80-U+0EFF) => 83 / 65 / 0
Georgian (U+10A0-U+10FF) => 88 / 83 / 0
Unified Canadian Aboriginal Syllabics (U+1400-U+167F) => 640 / 404 / 0
Ogham (U+1680-U+169F) => 29 / 29 / 0
Phonetic Extensions (U+1D00-U+1D7F) => 128 / 106 / 0
Phonetic Extensions Supplement (U+1D80-U+1DBF) => 64 / 38 / 0
Combining Diacritical Marks Supplement (U+1DC0-U+1DFF) => 64 / 6 / 0
Latin Extended Additional (U+1E00-U+1EFF) => 256 / 252 / 0
Greek Extended (U+1F00-U+1FFF) => 233 / 233 / 0
General Punctuation (U+2000-U+206F) => 111 / 107 / 0
Superscripts and Subscripts (U+2070-U+209F) => 42 / 42 / 0
Currency Symbols (U+20A0-U+20CF) => 33 / 26 / 0
Combining Diacritical Marks for Symbols (U+20D0-U+20FF) => 33 / 7 / 0
Letterlike Symbols (U+2100-U+214F) => 80 / 75 / 0
Number Forms (U+2150-U+218F) => 60 / 55 / 0
Arrows (U+2190-U+21FF) => 112 / 112 / 0
Mathematical Operators (U+2200-U+22FF) => 256 / 256 / 0
Miscellaneous Technical (U+2300-U+23FF) => 256 / 65 / 0
Control Pictures (U+2400-U+243F) => 39 / 2 / 0
Enclosed Alphanumerics (U+2460-U+24FF) => 160 / 10 / 0
Box Drawing (U+2500-U+257F) => 128 / 128 / 0
Block Elements (U+2580-U+259F) => 32 / 32 / 0
Geometric Shapes (U+25A0-U+25FF) => 96 / 96 / 0
Miscellaneous Symbols (U+2600-U+26FF) => 256 / 189 / 0
Dingbats (U+2700-U+27BF) => 192 / 174 / 0
Miscellaneous Mathematical Symbols-A (U+27C0-U+27EF) => 48 / 9 / 0
Supplemental Arrows-A (U+27F0-U+27FF) => 16 / 16 / 0
Braille Patterns (U+2800-U+28FF) => 256 / 256 / 0
Supplemental Arrows-B (U+2900-U+297F) => 128 / 6 / 0
Miscellaneous Mathematical Symbols-B (U+2980-U+29FF) => 128 / 13 / 0
Supplemental Mathematical Operators (U+2A00-U+2AFF) => 256 / 74 / 0
Miscellaneous Symbols and Arrows (U+2B00-U+2BFF) => 253 / 35 / 0
Latin Extended-C (U+2C60-U+2C7F) => 32 / 31 / 0
Georgian Supplement (U+2D00-U+2D2F) => 40 / 38 / 0
Tifinagh (U+2D30-U+2D7F) => 59 / 55 / 0
Supplemental Punctuation (U+2E00-U+2E7F) => 94 / 7 / 0
Yijing Hexagram Symbols (U+4DC0-U+4DFF) => 64 / 64 / 0
Lisu (U+A4D0-U+A4FF) => 48 / 48 / 0
Cyrillic Extended-B (U+A640-U+A69F) => 96 / 33 / 0
Modifier Tone Letters (U+A700-U+A71F) => 32 / 20 / 0
Latin Extended-D (U+A720-U+A7FF) => 193 / 77 / 0
Private Use Area (U+E000-U+F8FF) => 0 / 0 / 96
Alphabetic Presentation Forms (U+FB00-U+FB4F) => 58 / 58 / 0
Arabic Presentation Forms-A (U+FB50-U+FDFF) => 631 / 108 / 0
Variation Selectors (U+FE00-U+FE0F) => 16 / 16 / 0
Combining Half Marks (U+FE20-U+FE2F) => 16 / 4 / 0
Arabic Presentation Forms-B (U+FE70-U+FEFF) => 141 / 141 / 0
Specials (U+FFF0-U+FFFF) => 5 / 5 / 0
Old Italic (U+10300-U+1032F) => 39 / 35 / 0
Tai Xuan Jing Symbols (U+1D300-U+1D35F) => 87 / 87 / 0
Mathematical Alphanumeric Symbols (U+1D400-U+1D7FF) => 996 / 117 / 0
Arabic Mathematical Alphabetic Symbols (U+1EE00-U+1EEFF) => 143 / 74 / 0
Domino Tiles (U+1F030-U+1F09F) => 100 / 100 / 0
Playing Cards (U+1F0A0-U+1F0FF) => 82 / 59 / 0
Miscellaneous Symbols and Pictographs (U+1F300-U+1F5FF) => 768 / 12 / 0
Emoticons (U+1F600-U+1F64F) => 80 / 64 / 0
Unicode coverage = 5822 / 149476 = 3.89
$ diff -u liberation.txt dejavu.txt
--- liberation.txt 2024-05-26 13:25:48.184324659 +0200
+++ dejavu.txt 2024-05-26 13:25:26.328369116 +0200
@@ -1,40 +1,75 @@
-=== Liberation Sans
+=== DejaVu Sans
Basic Latin (U+0020-U+007F) => 95 / 95 / 0
Latin-1 Supplement (U+00A0-U+00FF) => 96 / 96 / 0
Latin Extended-A (U+0100-U+017F) => 128 / 128 / 0
Latin Extended-B (U+0180-U+024F) => 208 / 208 / 0
IPA Extensions (U+0250-U+02AF) => 96 / 96 / 0
-Spacing Modifier Letters (U+02B0-U+02FF) => 80 / 80 / 0
-Combining Diacritical Marks (U+0300-U+036F) => 112 / 112 / 0
-Greek and Coptic (U+0370-U+03FF) => 135 / 127 / 0
+Spacing Modifier Letters (U+02B0-U+02FF) => 80 / 63 / 0
+Combining Diacritical Marks (U+0300-U+036F) => 112 / 93 / 0
+Greek and Coptic (U+0370-U+03FF) => 135 / 135 / 0
Cyrillic (U+0400-U+04FF) => 256 / 256 / 0
-Cyrillic Supplement (U+0500-U+052F) => 48 / 24 / 0
-Hebrew (U+0590-U+05FF) => 88 / 87 / 0
-Phonetic Extensions (U+1D00-U+1D7F) => 128 / 128 / 0
-Phonetic Extensions Supplement (U+1D80-U+1DBF) => 64 / 64 / 0
-Combining Diacritical Marks Supplement (U+1DC0-U+1DFF) => 64 / 13 / 0
-Latin Extended Additional (U+1E00-U+1EFF) => 256 / 247 / 0
+Cyrillic Supplement (U+0500-U+052F) => 48 / 38 / 0
+Armenian (U+0530-U+058F) => 91 / 86 / 0
+Hebrew (U+0590-U+05FF) => 88 / 54 / 0
+Arabic (U+0600-U+06FF) => 256 / 165 / 0
+NKo (U+07C0-U+07FF) => 62 / 54 / 0
+Thai (U+0E00-U+0E7F) => 87 / 1 / 0
+Lao (U+0E80-U+0EFF) => 83 / 65 / 0
+Georgian (U+10A0-U+10FF) => 88 / 83 / 0
+Unified Canadian Aboriginal Syllabics (U+1400-U+167F) => 640 / 404 / 0
+Ogham (U+1680-U+169F) => 29 / 29 / 0
+Phonetic Extensions (U+1D00-U+1D7F) => 128 / 106 / 0
+Phonetic Extensions Supplement (U+1D80-U+1DBF) => 64 / 38 / 0
+Combining Diacritical Marks Supplement (U+1DC0-U+1DFF) => 64 / 6 / 0
+Latin Extended Additional (U+1E00-U+1EFF) => 256 / 252 / 0
Greek Extended (U+1F00-U+1FFF) => 233 / 233 / 0
-General Punctuation (U+2000-U+206F) => 111 / 57 / 0
-Superscripts and Subscripts (U+2070-U+209F) => 42 / 22 / 0
-Currency Symbols (U+20A0-U+20CF) => 33 / 23 / 0
-Combining Diacritical Marks for Symbols (U+20D0-U+20FF) => 33 / 1 / 0
-Letterlike Symbols (U+2100-U+214F) => 80 / 9 / 0
-Number Forms (U+2150-U+218F) => 60 / 7 / 0
-Arrows (U+2190-U+21FF) => 112 / 8 / 0
-Mathematical Operators (U+2200-U+22FF) => 256 / 18 / 0
-Miscellaneous Technical (U+2300-U+23FF) => 256 / 4 / 0
-Box Drawing (U+2500-U+257F) => 128 / 40 / 0
-Block Elements (U+2580-U+259F) => 32 / 8 / 0
-Geometric Shapes (U+25A0-U+25FF) => 96 / 24 / 0
-Miscellaneous Symbols (U+2600-U+26FF) => 256 / 21 / 0
-Latin Extended-C (U+2C60-U+2C7F) => 32 / 21 / 0
-Supplemental Punctuation (U+2E00-U+2E7F) => 94 / 1 / 0
-Modifier Tone Letters (U+A700-U+A71F) => 32 / 9 / 0
-Latin Extended-D (U+A720-U+A7FF) => 193 / 7 / 0
-Alphabetic Presentation Forms (U+FB00-U+FB4F) => 58 / 48 / 0
+General Punctuation (U+2000-U+206F) => 111 / 107 / 0
+Superscripts and Subscripts (U+2070-U+209F) => 42 / 42 / 0
+Currency Symbols (U+20A0-U+20CF) => 33 / 26 / 0
+Combining Diacritical Marks for Symbols (U+20D0-U+20FF) => 33 / 7 / 0
+Letterlike Symbols (U+2100-U+214F) => 80 / 75 / 0
+Number Forms (U+2150-U+218F) => 60 / 55 / 0
+Arrows (U+2190-U+21FF) => 112 / 112 / 0
+Mathematical Operators (U+2200-U+22FF) => 256 / 256 / 0
+Miscellaneous Technical (U+2300-U+23FF) => 256 / 65 / 0
+Control Pictures (U+2400-U+243F) => 39 / 2 / 0
+Enclosed Alphanumerics (U+2460-U+24FF) => 160 / 10 / 0
+Box Drawing (U+2500-U+257F) => 128 / 128 / 0
+Block Elements (U+2580-U+259F) => 32 / 32 / 0
+Geometric Shapes (U+25A0-U+25FF) => 96 / 96 / 0
+Miscellaneous Symbols (U+2600-U+26FF) => 256 / 189 / 0
+Dingbats (U+2700-U+27BF) => 192 / 174 / 0
+Miscellaneous Mathematical Symbols-A (U+27C0-U+27EF) => 48 / 9 / 0
+Supplemental Arrows-A (U+27F0-U+27FF) => 16 / 16 / 0
+Braille Patterns (U+2800-U+28FF) => 256 / 256 / 0
+Supplemental Arrows-B (U+2900-U+297F) => 128 / 6 / 0
+Miscellaneous Mathematical Symbols-B (U+2980-U+29FF) => 128 / 13 / 0
+Supplemental Mathematical Operators (U+2A00-U+2AFF) => 256 / 74 / 0
+Miscellaneous Symbols and Arrows (U+2B00-U+2BFF) => 253 / 35 / 0
+Latin Extended-C (U+2C60-U+2C7F) => 32 / 31 / 0
+Georgian Supplement (U+2D00-U+2D2F) => 40 / 38 / 0
+Tifinagh (U+2D30-U+2D7F) => 59 / 55 / 0
+Supplemental Punctuation (U+2E00-U+2E7F) => 94 / 7 / 0
+Yijing Hexagram Symbols (U+4DC0-U+4DFF) => 64 / 64 / 0
+Lisu (U+A4D0-U+A4FF) => 48 / 48 / 0
+Cyrillic Extended-B (U+A640-U+A69F) => 96 / 33 / 0
+Modifier Tone Letters (U+A700-U+A71F) => 32 / 20 / 0
+Latin Extended-D (U+A720-U+A7FF) => 193 / 77 / 0
+Private Use Area (U+E000-U+F8FF) => 0 / 0 / 96
+Alphabetic Presentation Forms (U+FB00-U+FB4F) => 58 / 58 / 0
+Arabic Presentation Forms-A (U+FB50-U+FDFF) => 631 / 108 / 0
+Variation Selectors (U+FE00-U+FE0F) => 16 / 16 / 0
Combining Half Marks (U+FE20-U+FE2F) => 16 / 4 / 0
-Specials (U+FFF0-U+FFFF) => 5 / 1 / 0
-Unicode coverage = 2327 / 149476 = 1.56
+Arabic Presentation Forms-B (U+FE70-U+FEFF) => 141 / 141 / 0
+Specials (U+FFF0-U+FFFF) => 5 / 5 / 0
+Old Italic (U+10300-U+1032F) => 39 / 35 / 0
+Tai Xuan Jing Symbols (U+1D300-U+1D35F) => 87 / 87 / 0
+Mathematical Alphanumeric Symbols (U+1D400-U+1D7FF) => 996 / 117 / 0
+Arabic Mathematical Alphabetic Symbols (U+1EE00-U+1EEFF) => 143 / 74 / 0
+Domino Tiles (U+1F030-U+1F09F) => 100 / 100 / 0
+Playing Cards (U+1F0A0-U+1F0FF) => 82 / 59 / 0
+Miscellaneous Symbols and Pictographs (U+1F300-U+1F5FF) => 768 / 12 / 0
+Emoticons (U+1F600-U+1F64F) => 80 / 64 / 0
+Unicode coverage = 5822 / 149476 = 3.89
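For anyone who would rather compute the per-block differences than read them off the diff, here is a minimal Python sketch that parses two reports in the format above and lists the blocks where the counts differ. It assumes the first number on each line is the number of assigned code points in the block and the second is how many the font covers; that reading fits the data but is my interpretation, not something taken from the tool's documentation. The function names `parse_report` and `compare` are made up for this example.

```python
import re

# Matches report lines such as
#   "Arrows (U+2190-U+21FF) => 112 / 8 / 0"
# Header lines ("=== ...") and the "Unicode coverage" summary do not match.
LINE_RE = re.compile(
    r"^(?P<block>.+?) \(U\+[0-9A-F]+-U\+[0-9A-F]+\) => "
    r"(?P<total>\d+) / (?P<covered>\d+) / (?P<extra>\d+)$"
)


def parse_report(text):
    """Map block name -> (total, covered) for every block line in a report.

    Assumption: column 1 is the block's assigned code points, column 2 is
    the number covered by the font. The third column is ignored here.
    """
    blocks = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            blocks[m["block"]] = (int(m["total"]), int(m["covered"]))
    return blocks


def compare(a, b):
    """Yield (block, covered_in_a, covered_in_b, total) where the counts differ.

    A block missing from one report is treated as zero coverage in that font.
    """
    for block in sorted(set(a) | set(b)):
        total = (a.get(block) or b.get(block))[0]
        covered_a = a.get(block, (total, 0))[1]
        covered_b = b.get(block, (total, 0))[1]
        if covered_a != covered_b:
            yield block, covered_a, covered_b, total


# A few lines copied from the two reports above.
LIBERATION = """=== Liberation Sans
Cyrillic (U+0400-U+04FF) => 256 / 256 / 0
Hebrew (U+0590-U+05FF) => 88 / 87 / 0
Arrows (U+2190-U+21FF) => 112 / 8 / 0
"""
DEJAVU = """=== DejaVu Sans
Cyrillic (U+0400-U+04FF) => 256 / 256 / 0
Hebrew (U+0590-U+05FF) => 88 / 54 / 0
Ogham (U+1680-U+169F) => 29 / 29 / 0
Arrows (U+2190-U+21FF) => 112 / 112 / 0
"""

if __name__ == "__main__":
    for row in compare(parse_report(LIBERATION), parse_report(DEJAVU)):
        print("%s: %d vs %d of %d" % row)
```

Feeding it the full `liberation.txt` and `dejavu.txt` instead of the excerpts should give the same information as the diff, one block per line.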
Re: availability, it does not seem too bad either:
Hello, sdlmame defaults to the Liberation Sans font, which, among other things, lacks some of the symbols needed for vgmplay. Inspired by Fedora 36 switching to Google Noto fonts, I checked whether Google Noto Sans could provide the needed symbols. It does, but the symbols are part of Google Noto Symbols2:
Would it be possible to expand the font code so that it combines Noto Sans and Noto Sans Symbols2? The other option would be to switch to the DejaVu font, but as distros are moving away from it, it does not seem like the best choice.
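To make the "combine two fonts" idea concrete, here is a rough Python sketch of a per-character fallback chain: each character goes to the first font in an ordered list that covers it, and consecutive characters that land on the same font are grouped into runs. This is only an illustration of the idea, not how MAME's font code works. The font names `primary` and `symbols` and their coverage sets are made up for the example and do not reflect what any real font covers. A per-character lookup like this also does no shaping, so it would not help with scripts that need proper text-run rendering.

```python
from itertools import groupby


def pick_font(codepoint, chain):
    """Return the name of the first font in the chain covering codepoint.

    chain is an ordered list of (font_name, set_of_codepoints) pairs.
    Returns None when no font in the chain covers the code point.
    """
    for name, coverage in chain:
        if codepoint in coverage:
            return name
    return None


def split_runs(text, chain):
    """Split text into (font_name, substring) runs.

    Consecutive characters resolved to the same font are grouped together,
    so a renderer could draw each run with a single font.
    """
    tagged = ((pick_font(ord(ch), chain), ch) for ch in text)
    return [
        (font, "".join(ch for _, ch in group))
        for font, group in groupby(tagged, key=lambda pair: pair[0])
    ]


# Hypothetical coverage sets, for illustration only.
CHAIN = [
    ("primary", set(range(0x20, 0x7F))),  # printable ASCII
    ("symbols", {0x25B6}),                # one arbitrary symbol
]

if __name__ == "__main__":
    print(split_runs("Play \u25b6 now", CHAIN))
```

In a real implementation the coverage sets would come from the fonts' character maps, and a run whose font is `None` would need some visible placeholder glyph.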