jsbeckr closed this issue 3 years ago
This is a fairly common encoding issue. python-tcod expects codepoints to be in Unicode, but the REXPaint tool saves them as EASCII (CP437).
You can use tcod.FONT_LAYOUT_ASCII_INROW to force libtcod to use EASCII, but then you'll no longer be able to pass Unicode strings such as "░▒▓" to print functions. You can also decode the codepoints from CP437 to Unicode after the REXPaint file is loaded.
Thanks for your quick answer!
tcod.FONT_LAYOUT_ASCII_INROW does work, but as you said it's unfortunate that I can't print Unicode chars anymore. I tried to figure out how to decode the CP437 codepoints to Unicode.
From my_console.tiles I get tuples like (23, fg, bg) where 23 is the Unicode codepoint. I'm not sure how to get from a Unicode to a CP437 codepoint. Or am I approaching this wrong?
It's actually hard to do since the codepoints < 32 are special and most codecs such as the ones Python has will refuse to touch them. "".encode("cp437") should work for the higher codes. "\xNN" also works as expected when working with EASCII codepoints.
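To make the point about the low codes concrete, here is a small check using only Python's built-in cp437 codec (no tcod involved). The variable names are just for illustration.

```python
# Bytes 0x80-0xFF decode to the expected shade and box-drawing glyphs.
high = bytes([0xB0, 0xB1, 0xB2]).decode("cp437")
print(high)

# Bytes below 0x20 pass through as control characters, not as the
# CP437 graphics (heart, diamond, ...) that REXPaint draws there.
low = bytes([0x03]).decode("cp437")
print(repr(low))

# Going the other way, the heart glyph is not in the codec's table.
try:
    "♥".encode("cp437")
    heart_encodes = True
except UnicodeEncodeError:
    heart_encodes = False
print(heart_encodes)
```

So the built-in codec is fine for codes 32 and up, and the first 32 glyphs need a hand-written table.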
I usually use these two pages as references when making a custom codec:
https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/CP437.TXT
https://en.wikipedia.org/wiki/Code_page_437
Maybe I should add a way to access the LAYOUT_CP437 codec from Python.
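In the meantime, one way to avoid typing out all 256 entries is to start from Python's own cp437 codec and only override the low glyphs. This is a rough sketch, not what libtcod does internally; LOW_GLYPHS and table are names made up for this example, and the values are taken from the Wikipedia page above.

```python
# Unicode codepoints for the graphics CP437 shows at codes 1-31.
LOW_GLYPHS = [
    0x263A, 0x263B, 0x2665, 0x2666, 0x2663, 0x2660, 0x2022, 0x25D8,
    0x25CB, 0x25D9, 0x2642, 0x2640, 0x266A, 0x266B, 0x263C, 0x25BA,
    0x25C4, 0x2195, 0x203C, 0x00B6, 0x00A7, 0x25AC, 0x21A8, 0x2191,
    0x2193, 0x2192, 0x2190, 0x221F, 0x2194, 0x25B2, 0x25BC,
]
assert len(LOW_GLYPHS) == 31

# Let the built-in codec decode every byte value once; this gives the
# right answer for codes 32-255 and control characters for 0-31.
table = [ord(ch) for ch in bytes(range(256)).decode("cp437")]

# Replace the control characters at 1-31 with the graphic glyphs.
# Code 0 is left as 0 (empty tile).
table[1:32] = LOW_GLYPHS

print(hex(table[3]), hex(table[0xB0]))
```

table[i] is then the Unicode codepoint for CP437 code i, and it can be wrapped in a NumPy array if you want to index it with a whole console at once.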
I guess I will go the CSV route for now to stay in unicode land. :)
I can give an example on how to decode REXPaint files into Unicode if you want.
Once the REXPaint file is loaded as a console you can convert it. Here's an example:
import tcod.console
import numpy as np
# Lookup table: the array index is the CP437 code, the value is the Unicode codepoint.
cp437 = np.array(
[
0x0000, 0x263A, 0x263B, 0x2665, 0x2666, 0x2663, 0x2660, 0x2022,
0x25D8, 0x25CB, 0x25D9, 0x2642, 0x2640, 0x266A, 0x266B, 0x263C,
0x25BA, 0x25C4, 0x2195, 0x203C, 0x00B6, 0x00A7, 0x25AC, 0x21A8,
0x2191, 0x2193, 0x2192, 0x2190, 0x221F, 0x2194, 0x25B2, 0x25BC,
0x0020, 0x0021, 0x0022, 0x0023, 0x0024, 0x0025, 0x0026, 0x0027,
0x0028, 0x0029, 0x002A, 0x002B, 0x002C, 0x002D, 0x002E, 0x002F,
0x0030, 0x0031, 0x0032, 0x0033, 0x0034, 0x0035, 0x0036, 0x0037,
0x0038, 0x0039, 0x003A, 0x003B, 0x003C, 0x003D, 0x003E, 0x003F,
0x0040, 0x0041, 0x0042, 0x0043, 0x0044, 0x0045, 0x0046, 0x0047,
0x0048, 0x0049, 0x004A, 0x004B, 0x004C, 0x004D, 0x004E, 0x004F,
0x0050, 0x0051, 0x0052, 0x0053, 0x0054, 0x0055, 0x0056, 0x0057,
0x0058, 0x0059, 0x005A, 0x005B, 0x005C, 0x005D, 0x005E, 0x005F,
0x0060, 0x0061, 0x0062, 0x0063, 0x0064, 0x0065, 0x0066, 0x0067,
0x0068, 0x0069, 0x006A, 0x006B, 0x006C, 0x006D, 0x006E, 0x006F,
0x0070, 0x0071, 0x0072, 0x0073, 0x0074, 0x0075, 0x0076, 0x0077,
0x0078, 0x0079, 0x007A, 0x007B, 0x007C, 0x007D, 0x007E, 0x007F,
0x00C7, 0x00FC, 0x00E9, 0x00E2, 0x00E4, 0x00E0, 0x00E5, 0x00E7,
0x00EA, 0x00EB, 0x00E8, 0x00EF, 0x00EE, 0x00EC, 0x00C4, 0x00C5,
0x00C9, 0x00E6, 0x00C6, 0x00F4, 0x00F6, 0x00F2, 0x00FB, 0x00F9,
0x00FF, 0x00D6, 0x00DC, 0x00A2, 0x00A3, 0x00A5, 0x20A7, 0x0192,
0x00E1, 0x00ED, 0x00F3, 0x00FA, 0x00F1, 0x00D1, 0x00AA, 0x00BA,
0x00BF, 0x2310, 0x00AC, 0x00BD, 0x00BC, 0x00A1, 0x00AB, 0x00BB,
0x2591, 0x2592, 0x2593, 0x2502, 0x2524, 0x2561, 0x2562, 0x2556,
0x2555, 0x2563, 0x2551, 0x2557, 0x255D, 0x255C, 0x255B, 0x2510,
0x2514, 0x2534, 0x252C, 0x251C, 0x2500, 0x253C, 0x255E, 0x255F,
0x255A, 0x2554, 0x2569, 0x2566, 0x2560, 0x2550, 0x256C, 0x2567,
0x2568, 0x2564, 0x2565, 0x2559, 0x2558, 0x2552, 0x2553, 0x256B,
0x256A, 0x2518, 0x250C, 0x2588, 0x2584, 0x258C, 0x2590, 0x2580,
0x03B1, 0x00DF, 0x0393, 0x03C0, 0x03A3, 0x03C3, 0x00B5, 0x03C4,
0x03A6, 0x0398, 0x03A9, 0x03B4, 0x221E, 0x03C6, 0x03B5, 0x2229,
0x2261, 0x00B1, 0x2265, 0x2264, 0x2320, 0x2321, 0x00F7, 0x2248,
0x00B0, 0x2219, 0x00B7, 0x221A, 0x207F, 0x00B2, 0x25A0, 0x00A0,
]
)
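# Reverse lookup: Unicode codepoint -> CP437 code.
cp437_encode = {v: i for i, v in enumerate(cp437)}
# Small demo console holding the four card suits as Unicode codepoints.
console = tcod.console.Console(5, 1)
console.tiles["ch"][0, :4] = ord("♥"), ord("♦"), ord("♣"), ord("♠")
print(console.tiles["ch"])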
# Encode Unicode -> CP437
console.tiles["ch"] = np.vectorize(cp437_encode.__getitem__)(console.tiles["ch"])
print(console.tiles["ch"])
# Decode CP437 -> Unicode
console.tiles["ch"] = cp437[console.tiles["ch"]]
print(console.tiles["ch"])
Output:
[[9829 9830 9827 9824 32]]
[[ 3 4 5 6 32]]
[[9829 9830 9827 9824 32]]
Oh wow nice! Works perfectly. 👍
Maybe this should be the default behavior for Consoles loaded via tcod.console_from_xp?
The REXPaint spec actually does support Unicode, but it doesn't track what encoding it was saved with. So after you decode into Unicode you can save that back as a Unicode .xp file, then you can load that without needing to decode it again. An .xp file saved as Unicode can't be loaded by the REXPaint program, which expects CP437 again.
The ideal way to handle this is to allow a character mapping to be passed to the load and save functions, similar to opening a file with a codec in Python. I can't use Python's existing codecs since they don't convert the 0-31 characters.
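As a rough illustration of the idea, a mapping-aware helper could look something like this. remap_codepoints is a hypothetical function and not part of python-tcod; it works on plain nested lists so it can run without tcod, and the tiny suits-only mapping stands in for a full CP437 table.

```python
def remap_codepoints(rows, mapping=None):
    """Return a copy of rows with each codepoint passed through mapping.

    Codepoints missing from the mapping are left unchanged, and
    mapping=None means no conversion, much like reading a file
    without asking for a codec.
    """
    if mapping is None:
        return [list(row) for row in rows]
    return [[mapping.get(cp, cp) for cp in row] for row in rows]


# Partial CP437 -> Unicode mapping covering only the card suits.
cp437_suits = {3: 0x2665, 4: 0x2666, 5: 0x2663, 6: 0x2660}
# The reverse mapping is what a save function would take.
unicode_to_cp437 = {v: k for k, v in cp437_suits.items()}

loaded = [[3, 4, 5, 6, 32]]  # what a CP437 .xp file would contain
decoded = remap_codepoints(loaded, cp437_suits)
encoded = remap_codepoints(decoded, unicode_to_cp437)
print(decoded)
print(encoded)
```

The caller picks the mapping, so the same load and save code would handle both CP437 files written by REXPaint and files already stored as Unicode.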
The documentation now includes examples on how to convert the encoding between CP437 and Unicode when loading or saving REXPaint files.
I'm using REXPaint and load the resulting .xp files with tcod.console_from_xp("data/images/title.xp"). When I blit the resulting console it looks different than the REXPaint image. Example (left: REXPaint, right: tcod)
I'm using a custom font (it's a CP437 one), but it also happens with the standard font.