pret / pokecrystal

Disassembly of Pokémon Crystal
https://pret.github.io/pokecrystal/

Clarify and correct usage of breakable whitespace characters #1094

Closed SnorlaxMonster closed 8 months ago

SnorlaxMonster commented 8 months ago

In charmap.asm, $1f (¯) is described as a "soft linebreak", while $25 (%) is described as "soft linebreak in landmark names". I don't think these are particularly good descriptions of what these codepoints represent.

Firstly, it is $1f that is used in landmark names, not $25 ($25 appears to be entirely unused).

As for what these codepoints actually do: on the Town Map, both render as line breaks. Everywhere else, the two render differently: $1f renders as a space, whereas $25 renders as nothing. In light of this, I think a more accurate description is that $1f is a breakable space (as opposed to the standard space character $7f, which is effectively an unbreakable space) and $25 is a zero-width space.
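To make the described behaviour concrete, here is a rough sketch in Python. It is not the game's text engine, only a model of the rules stated above; the function name `render`, the `on_town_map` flag, and the sample label are all made up for illustration.

```python
# Code points as described in this issue.
BREAKABLE_SPACE = 0x1F   # "¯" in the current charmap
ZERO_WIDTH_SPACE = 0x25  # "%" in the current charmap
PLAIN_SPACE = 0x7F       # standard space, never turns into a line break


def render(codepoints, on_town_map):
    """Model how a string displays, per the rules described above.

    To keep the sketch short, ordinary letters are passed as one-character
    strings and only the three whitespace code points are passed as integers.
    """
    out = []
    for cp in codepoints:
        if cp in (BREAKABLE_SPACE, ZERO_WIDTH_SPACE):
            if on_town_map:
                out.append("\n")   # both become line breaks on the Town Map
            elif cp == BREAKABLE_SPACE:
                out.append(" ")    # elsewhere, $1f shows as a space
            # elsewhere, $25 shows as nothing
        elif cp == PLAIN_SPACE:
            out.append(" ")        # $7f is always a space
        else:
            out.append(cp)
    return "".join(out)


# Hypothetical label, not taken from the game data.
label = list("SOME") + [BREAKABLE_SPACE] + list("TOWN")
print(repr(render(label, on_town_map=True)))   # 'SOME\nTOWN'
print(repr(render(label, on_town_map=False)))  # 'SOME TOWN'
```

Swapping `BREAKABLE_SPACE` for `ZERO_WIDTH_SPACE` in the sample label gives the same Town Map result but `'SOMETOWN'` everywhere else, which is the distinction the two descriptions should capture.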

Also, I question the choice of symbols used to represent these two characters. I don't think a macron or percent sign intuitively represents an optional line break, especially given how rarely these two codepoints are actually used. I think just angle-bracketed terms (e.g. <BSP> for $1f and <ZWSP> for $25) like almost every other control character would be clearer.
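For illustration only, the proposed renaming could be applied to existing strings with something like the Python sketch below. The helper name `rename_legacy_symbols` and the sample string are hypothetical, and the sketch assumes every ¯ and % inside a string stands for one of these two codepoints.

```python
# Mapping from the current symbols to the angle-bracketed names proposed above.
LEGACY_TO_NAMED = {
    "¯": "<BSP>",   # $1f, breakable space
    "%": "<ZWSP>",  # $25, zero-width space
}


def rename_legacy_symbols(text):
    """Replace each legacy symbol with its proposed bracketed name."""
    for old, new in LEGACY_TO_NAMED.items():
        text = text.replace(old, new)
    return text


# Hypothetical string, not taken from the game data.
print(rename_legacy_symbols("SOME¯TOWN"))  # SOME<BSP>TOWN
```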

Other than landmark names, macrons are used within Japanese strings in comments and untranslated mobile data. Both of these usages appear to be in error. In untranslated mobile data, they are used in various strings, but these usages actually should be <WO> (the Japanese control character at $25) instead of the English control character at this code point.

Macrons are used in comments in moves/grammar.asm, in reference to move usage dialogue (particularly Japanese dialogue). However, in the Japanese version these appear to actually be "hard" linebreaks ($22) or "を " instead. (It's possible the latter are <WO> internally, but in this instance I don't think the internal representation is as important as the display, since they are used purely in comments.) Notably, in the Japanese version these strings use 2-tile linebreaks (<LINE>), whereas the two breakable whitespace characters become 1-tile linebreaks (<LF>) on the Town Map in the English version.

In light of these issues, I have made the following changes:

SnorlaxMonster commented 8 months ago

I've applied those suggested changes, updated the other locations that referenced <ZWSP> to reference <WBR> instead, and added % and ¯ to macros/legacy.asm.

Rangi42 commented 8 months ago

Thanks again! Just merge master into this, because charmap.asm was moved to constants/charmap.asm.

SnorlaxMonster commented 8 months ago

I've merged master into my feature branch, so there are no longer conflicts.