Closed aaronbell closed 1 month ago
I don't understand this. 0x0065 is an integer in Python. What code were you using for the merge_ufos call?
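The distinction in question can be seen in a few lines of plain Python. This is only an illustration and does not call ufomerge: a hex literal typed in source code is already an int, while the same characters read from a text file are a str.

```python
# A hex literal written in source code is just another spelling of an int.
literal = 0x0065
print(type(literal).__name__, literal)   # int 101

# The same characters read from a text file arrive as a str.
from_file = "0x0065"
print(type(from_file).__name__)          # str

# The two never compare equal, so a lookup keyed on integers misses.
print(literal == from_file)              # False

# An explicit conversion bridges the gap; base 16 accepts the "0x" prefix.
print(int(from_file, 16) == literal)     # True
```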
I'm just doing a direct call like:
merge_ufos(
    currentFont,
    extensionFont,
    codepoints=glyphSet,
    layout_handling="closure",
    existing_handling="skip",
)
Where glyphSet is defined using a whitelist file that contains hex values:
glyphSet = []
with open('sources/'+lang+"/whitelist.txt") as f:
    glyphSet = f.read().splitlines()
and whitelist.txt is a set of hex values:
0x003A
0x003B
0x003C
0x003D
0x003E
0x003F
0x0040
0x0041
0x0042
0x0043
0x0044
0x0045
0x0046
0x0047
And merge_ufos can't deal with that because it already assumes they are converted to integers by the CLI code:
def parse_cp(cp):
    if (
        cp.startswith("U+")
        or cp.startswith("u+")
        or cp.startswith("0x")
        or cp.startswith("0X")
    ):
        return int(cp[2:], 16)
    return int(cp)
No such code exists downstream in __init__.py, so if I bypass CLI.py and call merge_ufos directly, I have to do the conversion myself.
You do, and I'm happy with that. The merge_ufos documentation says:
codepoints: A list of Unicode codepoints as integers.
You passed a list of Unicode codepoints as strings. ufomerge is not to blame here. :-)
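For anyone calling the library directly, the conversion can be done on the caller's side before the list is handed over. The following is a minimal sketch, not part of ufomerge: parse_cp mirrors the CLI helper quoted earlier in this thread, and load_codepoints is a hypothetical helper name introduced here for illustration.

```python
from pathlib import Path


def parse_cp(cp: str) -> int:
    """Parse 'U+0041', '0x0041', or a decimal string into an int."""
    cp = cp.strip()
    if cp[:2].lower() in ("u+", "0x"):
        return int(cp[2:], 16)
    return int(cp)


def load_codepoints(path) -> list[int]:
    """Hypothetical helper: read one codepoint per line, skipping blank lines."""
    lines = Path(path).read_text().splitlines()
    return [parse_cp(line) for line in lines if line.strip()]
```

The resulting list of integers could then be passed as codepoints=load_codepoints('sources/' + lang + '/whitelist.txt') in the merge_ufos call shown earlier, which matches what the documentation asks for.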
Fair enough. Though I wonder, then, why the CLI allows cp to be something else? Seems to me that the CLI codepoints should behave the same as merge_ufos codepoints.
Because command line utilities are intended for end users, and libraries are intended for Python developers. Different audiences.
When trying to use merge_ufos directly, I found that the codepoints option doesn't work when provided with a codepoint (like 0x0065). It appears that this is due to the unicode values being stored as integers in ufoLib2 versus hex values. When using the CLI, there's code to parse the unicode values into integers. But in merge_ufos, there's no such code.
IMO, there should be a check in merge_ufos / subset_ufos that determines if the unicode values are in integer form, or hex (and if so parse them).