Open uwodb opened 4 years ago
The entire Japanese character set is quite large, so they came up with a custom encoding to optimize it.
Check out the decompiled code here to get a general idea of how it works: https://github.com/diabpsx/skeleton/blob/master/JAP_1998_05_29/DIABPSX/PSXSRC/KANJI.CPP
The .OUT file is the font data, which contains only the characters used in the text. The .JAP language file is an optimized Shift-JIS variant that points to indices in that file.
The .LGH file is the one you want. It's the regular Shift-JIS encoded Japanese text before it was optimized. My guess is they left these on the disc by accident since they aren't used, which is funny because it defeats the purpose of reducing file sizes.
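Since the .LGH file is described as regular Shift-JIS, reading it should need nothing more than a standard codec. A minimal sketch, assuming the file holds only Shift-JIS text with no extra header or record structure (that part is not confirmed anywhere in this thread); `decode_lgh` is a made-up helper name:

```python
def decode_lgh(raw: bytes) -> str:
    """Decode the raw bytes of a .LGH file as Shift-JIS text.

    Assumption: the whole file is plain Shift-JIS with no binary header.
    """
    # errors="replace" substitutes U+FFFD for any byte sequence that is not
    # valid Shift-JIS instead of raising, so unexpected data stays visible.
    return raw.decode("shift_jis", errors="replace")
```

To try it, read the file in binary mode (`open(path, "rb").read()`) and pass the bytes in. If some characters come out as U+FFFD, the `cp932` codec (Microsoft's Shift-JIS superset) is worth trying instead.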
All work related to the PSX version is being continued at the aforementioned link. I'm pretty sure this repo is dead since I left.
WOW! This is what I'm looking for :) Thank you for your help.
There was a set of about 60 beta Japanese builds for the PSX sold at auction a few years ago. Don't forget to tell your friends that help is needed and ask if they know anything about the matter. Those discs could contain the source code or original assets, so I don't have to spend countless hours reversing this crap. We could have Kanji support sooner ;)
My goal is Korean & Japanese language support in devilutionX, and I wanted to read the PSX version's Japanese text. Thank you :) https://github.com/diasurgical/devilution/issues/1762 It seemed to me that they want to keep the original, so I gave up on using TTF and am using my own customized bitmap font.
Please see https://github.com/diasurgical/devilutionX/issues/66 for the latest status regarding translation and font handling in DevilutionX.
Thankfully we have fonts for both Korean and Japanese! See here: https://d2mods.info/forum/viewtopic.php?t=55894
Diablo 2 has pixel mapped fonts ready for color transforming, in the same sizes as d1! So 42, 30, 24, and 16 pixels! Diablo 2 uses utf8 iirc rather than shift jis.
Really? I didn't know that Diablo 2 used a bitmap font. :(
Warcraft (1994): bitmap, ASCII
Warcraft II (1995): bitmap, ASCII
Diablo (1996): bitmap, ASCII
Starcraft (1998): bitmap, ASCII
Diablo II (2000): bitmap?, Unicode?
Warcraft III (2002): TTF, Unicode
I see...
D1 uses TTF for some UI elements.
Diablo 2 has the standard ascii bitmap font for most languages. It has a unicode font for japanese, korean, and chinese. There is also an ascii Russian and Polish font. On top of that it has some extra fonts like a small font and formal font for typing+UI.
Diablo 1 had a combination of pixel fonts and TTF. The developers outsourced the UI to Blizzard South who in turn got lazy and decided to use proprietary TTF for formal fonts instead of coming up with a proper font like they did in Diablo 2.
See below an example of things you can do with the 6pt small font from D2.
Added font dumping tool.
So just as I feared, it looks like there isn't a way to directly translate the Japanese text files back. Since the special characters (0x8000+) just point to pixel data, one would need either a Japanese keyboard or some sort of OCR that can remap them to standard SJIS/Unicode.
Example: If the file has the character 0x8265, it references an address in MAINTXT.OUT which has a bit-based pixel font. Here is the output from the tool, with the pound symbol representing the pixels:
----- Id 34 (0x8265) -----
#
#
########
#
#
# #####
## #
# #
#
#
#####
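For anyone who wants to reproduce a dump like this, here is a rough sketch of the idea, not the actual tool. Almost everything in it is a guess: the only things taken from this thread are that Id 34 corresponds to 0x8265 and that the codes listed further down appear to be spaced by multiples of 0x12. From that I am assuming 18 bytes per glyph, 12x12 pixels at 1 bit per pixel, a base code of 0x8001, and most-significant-bit-first order. `glyph_id` and `render_glyph` are hypothetical names.

```python
GLYPH_BASE = 0x8001   # assumed: text code of glyph id 0
GLYPH_BYTES = 18      # assumed: bytes per glyph (12 * 12 bits / 8)
GLYPH_SIZE = 12       # assumed: glyph width and height in pixels


def glyph_id(code: int) -> int:
    """Map a text code such as 0x8265 to a glyph index (assumed layout)."""
    offset = code - GLYPH_BASE
    if offset < 0 or offset % GLYPH_BYTES:
        raise ValueError(f"0x{code:04X} is not on an assumed glyph boundary")
    return offset // GLYPH_BYTES


def render_glyph(data: bytes) -> str:
    """Render one glyph's packed bits as lines of '#' and ' '."""
    if len(data) != GLYPH_BYTES:
        raise ValueError(f"expected exactly {GLYPH_BYTES} bytes")
    bits = int.from_bytes(data, "big")   # assumed: first pixel is the top bit
    total = GLYPH_BYTES * 8
    rows = []
    for y in range(GLYPH_SIZE):
        row = ""
        for x in range(GLYPH_SIZE):
            shift = total - 1 - (y * GLYPH_SIZE + x)
            row += "#" if (bits >> shift) & 1 else " "
        rows.append(row.rstrip())        # drop trailing blanks on each row
    return "\n".join(rows)
```

If the real format uses a different glyph size or bit order, the output will look scrambled, which is itself a quick way to test the assumptions.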
Great. Translation of the PSX version is possible as well as translation of DevilutionX.
If anyone is still reading this, I'm in need of a Japanese speaker for assistance. I have a file with all the unmapped Japanese glyphs printed out like above, but I have no idea what they mean and need them typed out so I can map the Shift-JIS code to the binary.
Edit: there are about a total of 100 glyphs left. So it shouldn't take too long.
I will see what I can do :) Show me your file and I'll take a look
That's great @uwodb ! Below is a file with all the missing characters, printed out to look like pixels. I've already mapped the rest out with automation. Once we have these last things mapped we can convert the lore text back into text format.
These results may not all be accurate because they were typed by hand, not automated :(
9333=競 9369=協 938D=境 93D5=況 93F9=狭 940B=胸 942F=響 9453=凝 94BF=僅 94D1=緊
953D=屈 9585=係 95BB=形 95CD=掲 9603=継 9627=警 97C5=固 97D7=弧 97FB=互 9831=誤
9843=交 98C1=坑 98D3=拘 98E5=控 9951=郊 9A29=困 9AA7=鎖 9ACB=挫 9B13=砕 9B25=際
9B6D=策 9B7F=索 9B91=錯 9BA3=擦 9BEB=惨 9C45=刺 9CD5=市 9CF9=志 9D65=至 9E07=七
9E2B=嫉 9E3D=室 9ECD=釈 9F03=惹 9FC9=宗 A023=讐 A047=醜 A07D=従? A0E9=瞬 A0FB=殉
A11F=巡 A179=諸 A1AF=序 A1C1=徐 A1F7=召 A251=尚 A275=晶 A2BD=証 A329=状 A35F=伸
A383=侵 A3B9=浸 A3DD=申 A46D=陣 A4B5=遂 A4D9=枢 A69B=節 A6E3=宣 A7BB=疎 A7DF=訴
A803=創 A815=双 A86F=巣 A8A5=窓 A8B7=総 A8C9=荘 A8FF=送 A96B=側 A97D=則 AA0D=孫
AA1F=尊 AA31=村 AA67=唾 AAD3=態 ABBD=担 AC05=弾 AC83=致 ACA7=秩 ACB9=着 AD7F=頂
ADA3=沈 AE33=廷 AE7B=徹 AE9F=展 AEB1=転 AEF9=堵 AF0B=塗 AF1D=妬 AFD1=統 B073=洞
B0A9=徳 B0DF=独 B235=波 B26B=廃 B2C5=薄 B2D7=迫 B30D=肌 B331=罰 B38B=繁 B3AF=卑
B3E5=比 B3F7=疲 B4AB=貧 B4E1=布 B53B=赴 B595=風 B5A7=副 B5DD=福 B649=奮 B67F=併
B757=奉 B7D5=飽 B7F9=妨 B853=謀 B865=貌 B877=貿 B8BF=摩 B9BB=務 B9F1=冥 BA39=盟
BA5D=鳴 BAA5=模 BAC9=猛 BB11=悶 BB7D=躍 BBD7=有 BBFB=裕 BC31=余 BC55=余 BCD3=踊
BCE5=遥 BD09=浴 BDAB=律 BE17=侶 BE3B=虜 BE71=糧 BEA7=臨 BEDD=令 BEEF=冷 BF7F=路
BFA3=弄 BFD9=論 C021=枠 C033=墟 C045=愕 C057=枷 C069=沐 C07B=狡 C08D=禍 C09F=瞞
C0B1=膠 C0C3=貪 C0D5=踪
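A list in this `HEX=character` form is easy to load into a lookup table. A small sketch, assuming the tokens are whitespace-separated and that a trailing `?` only flags an uncertain reading (as with A07D above); `parse_mapping` is a made-up name:

```python
def parse_mapping(text: str) -> dict[int, str]:
    """Parse whitespace-separated 'HEX=char' tokens into {code: char}."""
    mapping = {}
    for token in text.split():
        code, sep, char = token.partition("=")
        if not sep:
            continue                      # ordinary word, not a mapping entry
        try:
            key = int(code, 16)
        except ValueError:
            continue                      # left side is not a hex code
        char = char.rstrip("?")           # '?' marks an uncertain reading
        if char:
            mapping[key] = char
    return mapping
```

Each value can then be turned into its Shift-JIS bytes with `char.encode("shift_jis")`, which is the code that needs to be associated with the glyph.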
Wow, incredible work @uwodb! Very thankful you typed these out, as it would have taken me forever fiddling with OCR software and the like. As a result, all but two characters are mapped and everything seems to be translating back correctly!
When you get the chance, could you have a second look at BC55 and C08D? BC55 appears to be a duplicate of BC31, but it looks a bit different.
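Duplicates like this are easy to flag automatically, since each distinct glyph should normally map to a distinct character. A minimal sketch that inverts a `{code: character}` table and reports characters claimed by more than one code; `find_duplicates` is a hypothetical helper, not part of any existing tool:

```python
from collections import defaultdict


def find_duplicates(mapping: dict[int, str]) -> dict[str, list[int]]:
    """Return {character: [codes]} for characters mapped from 2+ codes."""
    by_char = defaultdict(list)
    for code, char in mapping.items():
        by_char[char].append(code)
    return {
        char: sorted(codes)
        for char, codes in by_char.items()
        if len(codes) > 1
    }
```

A hit does not prove a mistake, but it points at glyphs worth a second look.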
Please find below all of the game's text restored to Shift-JIS!!! Note that the lore section is missing those two characters and may have some slight errors; the other two should be perfect.
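For reference, the restoration step can be sketched roughly like this. It is an illustration under assumptions, not the converter that produced the files: I am assuming bytes below 0x80 are single-byte characters, anything else is a two-byte big-endian glyph code (0x8000+), and unmapped codes get a placeholder. `restore_text` is a made-up name.

```python
def restore_text(data: bytes, mapping: dict[int, str], unknown: str = "?") -> str:
    """Replace two-byte glyph codes with mapped characters (assumed layout)."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte < 0x80:
            out.append(chr(byte))            # assumed: plain single-byte char
            i += 1
        elif i + 1 < len(data):
            code = (byte << 8) | data[i + 1]  # assumed: big-endian code
            out.append(mapping.get(code, unknown))
            i += 2
        else:
            out.append(unknown)              # truncated code at end of data
            i += 1
    return "".join(out)
```

The returned string can be written out as standard Shift-JIS with `.encode("shift_jis")`. Counting how many placeholders remain is a quick measure of how many glyphs are still unmapped.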
Wow! Thanks for noticing. BC31=余 BC55=幼 C08D=猾 What's the next plan?
I guess once translation support is complete the text can be used. I'm working on my own game engine, but progress is a bit slow. I'd anticipate you should have translation support soon in DevilutionX, though fonts are missing for Asian languages. For now you can use the Diablo 2 versions.
text.zip
Do you know what encoding is used? I don't know. Can anyone help me with this? Thanks in advance :)