Closed gusbemacbe closed 10 months ago
The fonts by George Douros include ancient scripts from Greece, such as Linear A and Linear B, but hyperglot did not detect that these fonts support those scripts.
I fixed this and opened a pull request that adds support for the languages hyperglot could not detect, in reference to #110.
I improved the following code:
#!/usr/bin/env python
import os

import yaml
from hyperglot.parse import parse_font_chars
from hyperglot.language import Language, Orthography
from hyperglot.languages import Languages

# Load the languages.yml file (closing the file handle afterwards)
with open('scripts/yaml/languages.yml', encoding='utf-8') as yaml_file:
    languages = yaml.load(yaml_file, Loader=yaml.FullLoader)

# Font name
font_name = "Fontes monoespaçadas – Código aberto – Unifont"


def find_font_file(path):
    """Walk all folders under path and return the first matching font file."""
    for root, dirs, files in os.walk(path):
        for file in files:
            if file.endswith((".ttf", ".otf", ".woff")) and font_name in file:
                return os.path.join(root, file)
    return None


def test_languages():
    # Path to the font file
    font_file = find_font_file('.')
    if font_file is None:
        raise FileNotFoundError('No font file matching "{}"'.format(font_name))

    # Parse the font file
    chars = parse_font_chars(font_file)
    Langs = Languages()
    supported = Langs.supported(chars, includeAllOrthographies=True,
                                includeHistorical=True,
                                includeConstructed=True)

    # Write a Markdown file
    output_path = 'Apoio linguístico por fonte/{}.md'.format(font_name)
    with open(output_path, 'w', encoding='utf-8') as f:
        f.write('#### {}\n\n'.format(font_name))
        f.write('* Idiomas com alfabeto latino:\n')
        for lang in languages:
            if lang["iso"] in supported["Latin"]:
                if lang["iso"] == "jpn":
                    f.write("\t* Japonês (*romaji*)\n")
                else:
                    f.write("\t* {}\n".format(lang["por"]))

        # Only written if the font has Greek script
        if "Greek" in supported:
            f.write('\n* Idiomas com alfabeto grego:\n')
            for lang in languages:
                if lang["iso"] in supported["Greek"]:
                    f.write("\t* {}\n".format(lang["por"]))

        # Only written if the font has all three Japanese scripts
        if "Kanji" in supported and "Hiragana" in supported and "Katakana" in supported:
            f.write('\n* Idiomas com sílabas japoneses:\n')
            for lang in languages:
                if (lang["iso"] in supported["Kanji"]
                        or lang["iso"] in supported["Hiragana"]
                        or lang["iso"] in supported["Katakana"]):
                    f.write("\t* {}\n".format(lang["por"]))


if __name__ == '__main__':
    test_languages()
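One detail of the script above may matter for the behaviour discussed in this thread: the Japanese section is only written when Kanji, Hiragana, and Katakana all appear as keys in supported. The following is a minimal sketch of a more tolerant lookup, assuming supported is a mapping of script name to a mapping keyed by ISO code, which is the shape the script relies on. The helper name isos_for_scripts and the sample data are hypothetical and not part of hyperglot.

```python
def isos_for_scripts(supported, scripts):
    """Return the set of ISO codes supported in any of the given scripts.

    Scripts that are missing from `supported` are skipped rather than
    raising KeyError or suppressing the scripts that are present.
    """
    found = set()
    for script in scripts:
        found.update(supported.get(script, {}))
    return found


# Hypothetical sample, shaped like the dict the script above indexes into.
sample_supported = {
    "Latin": {"por": {}, "jpn": {}},
    "Hiragana": {"jpn": {}},
}

japanese_scripts = ("Kanji", "Hiragana", "Katakana")
print(sorted(isos_for_scripts(sample_supported, japanese_scripts)))  # ['jpn']
print(sorted(isos_for_scripts(sample_supported, ("Greek",))))        # []
```

Whether a language should be listed when only some of its scripts are covered is a design choice; this sketch lists it as soon as any one of the scripts reports it.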
It checked that Japanese is in the list of languages with Latin script and inserted it in the list.
I also tested another font with Japanese script (“Unifont”) and it didn't detect the Japanese language.
I cannot confirm this with the CLI for the Unifont file included in the test zip from your PR. Perhaps your script uses different defaults for the language support detection? Running hyperglot path/to/Unifont.ttf does list Japanese for Latin, Katakana, Hiragana, and Kanji.
The font “Aussan” does not have Japanese script.
Are you saying the Aussan font does not have Japanese support detected when it should, or that it has support detected but shouldn't? Could you post the output of using parse_font_chars on that file?
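A raw character list can be long, so a summary per Unicode block may be easier to post. The sketch below counts how many characters fall into the Hiragana, Katakana, and CJK Unified Ideographs blocks. It assumes the output of parse_font_chars is an iterable of single-character strings; the function name count_chars_per_block and the sample list are hypothetical stand-ins.

```python
# Code point ranges of the Unicode blocks most relevant to Japanese.
JAPANESE_BLOCKS = {
    "Hiragana": (0x3040, 0x309F),
    "Katakana": (0x30A0, 0x30FF),
    "CJK Unified Ideographs": (0x4E00, 0x9FFF),
}


def count_chars_per_block(chars, blocks=JAPANESE_BLOCKS):
    """Count how many characters in `chars` fall inside each named block."""
    counts = {name: 0 for name in blocks}
    for char in chars:
        if len(char) != 1:
            continue  # skip anything that is not a single code point
        code_point = ord(char)
        for name, (start, end) in blocks.items():
            if start <= code_point <= end:
                counts[name] += 1
    return counts


# Stand-in for the list that parse_font_chars("path/to/font.ttf") would return.
sample_chars = ["a", "b", "あ", "い", "カ", "日"]
print(count_chars_per_block(sample_chars))
```

With hyperglot installed, sample_chars could be replaced by the result of parse_font_chars on the font in question. A count of zero in all three blocks would suggest the font carries no Japanese characters, so any Japanese entry would come from the Latin orthography alone.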
Good morning!
I am building a Python script that reads my favourite languages from a YAML file, checks whether a given font supports them, and then generates a Markdown file listing the supported languages.
The font “Aussan” does not have Japanese script.
The script found Japanese in the list of languages supported with Latin script and inserted it in the list.
I also tested another font that has Japanese script (“Unifont”), and the script did not detect the Japanese language.
languages.yml:
generate-language-support-list.py:
Output: