veekun / pokedex

more than you ever wanted to know about Pokémon
MIT License

Weird double nbsp in JP OR/AS dex entries #308

Closed: magical closed this issue 4 years ago

magical commented 4 years ago

Here's the ja-Hrkt dex entry for Rattata in Alpha Sapphire, with % standing in for U+202F NARROW NO-BREAK SPACE.

19,26,1,"けいかいしんが とても つよく ねている ときも % %を うごかし まわりの おとを きいている。 どこにでも す%つき す を つくる。"

The "% %を" at the start of the second line can't possibly be right. The kanji-ful version has "耳を" in the same position.

19,26,11,"警戒心が とても 強く 寝ている ときも 耳を 動かし 周りの 音を 聞いている。 どこにでも す%つき 巣 を つくる。"

There are several other examples in other Pokémon's dex entries.
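One rough way to list the affected rows is to scan the flavor-text CSV for U+202F. This is only a sketch: the helper name `find_nnbsp_rows` is made up for illustration, it assumes the text sits in the last column as in the rows quoted above, and the sample rows are abbreviated fragments of the Rattata entry rather than the real file.

```python
import csv
import io

NNBSP = "\u202f"  # U+202F NARROW NO-BREAK SPACE


def find_nnbsp_rows(csv_file):
    """Yield (row, count) for each CSV row whose last column contains U+202F.

    Hypothetical helper: assumes the flavor text is the final column.
    """
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        count = row[-1].count(NNBSP)
        if count:
            yield row, count


# Abbreviated stand-in data; the real input would be the flavor-text CSV.
sample = io.StringIO(
    '19,26,1,"ときも \u202f\u202fを うごかし"\n'
    '19,26,11,"ときも 耳を 動かし"\n'
)

hits = list(find_nnbsp_rows(sample))
for row, count in hits:
    # Print the ID columns and how many U+202F characters the text holds.
    print(row[:-1], count)
```

Pointing the same function at the real file (opened with `encoding="utf-8"` and `newline=""`) should give the full list of suspect entries.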

magical commented 4 years ago

Actually, that す%つき in the third line is suspicious too. I think it should be すみつき. And 耳 is みみ. So the problem is probably that all み were replaced with nbsp somehow.
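A small check of that guess, as a sketch only: substitute み back in for every U+202F in the kana fragments and see whether the expected words come out. The function name `restore_mi` is hypothetical, and a blanket replacement like this would be unsafe as a real fix if any entry legitimately contains U+202F.

```python
NNBSP = "\u202f"  # U+202F NARROW NO-BREAK SPACE


def restore_mi(text):
    """Replace every U+202F with み (hypothesis check, not a real repair)."""
    return text.replace(NNBSP, "み")


# Fragments of the Rattata entry quoted above, with U+202F written as an escape.
fragments = ["\u202f\u202fを", "す\u202fつき"]

restored = [restore_mi(fragment) for fragment in fragments]
print(restored)  # ['みみを', 'すみつき']
```

Both fragments come out as the expected words (みみ for 耳, and すみつき), which is consistent with the guess.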

magical commented 4 years ago

Ah, this was fixed in the ripper already (https://github.com/veekun/pokedex/commit/6cc2e4439c9430b050c278ee86be680e3b0e86a0), but the existing data was never corrected. We just need to do a re-rip.