polm / cutlet

Japanese to romaji converter in Python
https://polm.github.io/cutlet/
MIT License

KeyError: '゙' #16

Closed: ykim closed this issue 3 years ago

ykim commented 3 years ago

I ran into an odd issue with the latest version:

% cutlet
青い春よさらば!
Traceback (most recent call last):
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/bin/cutlet", line 8, in <module>
    sys.exit(main())
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cli.py", line 16, in main
    print(katsu.romaji(line.strip()))
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cutlet.py", line 130, in romaji
    roma = self.romaji_word(word)
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cutlet.py", line 195, in romaji_word
    return self.map_kana(kana)
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cutlet.py", line 235, in map_kana
    out += self.get_single_mapping(pk, char, nk)
  File "/Users/ykim/.local/share/virtualenvs/sandbox-nIHPi2Hu/lib/python3.8/site-packages/cutlet/cutlet.py", line 268, in get_single_mapping
    return self.table[kk]
KeyError: '゙'

I'm not sure what's going on here. :/

polm commented 3 years ago

Ah, that's a combining dakuten, which is a separate character from the kana it modifies. I hadn't thought of that; I'll need to do Unicode normalization. Should be an easy fix.
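For readers hitting the same error, here is a minimal sketch of what such normalization could look like using Python's standard `unicodedata` module. The helper name `normalize_kana` and the choice of NFC are assumptions for illustration; this is not necessarily the fix that was applied in cutlet.

```python
import unicodedata

# U+3099 is the combining voiced sound mark (dakuten).
COMBINING_DAKUTEN = "\u3099"


def normalize_kana(text: str) -> str:
    """Hypothetical helper: compose base kana and combining marks via NFC."""
    return unicodedata.normalize("NFC", text)


# "sa", "ra", "ha" followed by a separate combining dakuten, then "!".
# Visually this renders the same as the precomposed "ba" (U+3070).
decomposed = "\u3055\u3089\u306f" + COMBINING_DAKUTEN + "!"
composed = normalize_kana(decomposed)

# After NFC the dakuten is folded into the preceding kana, so a
# per-character lookup table no longer sees U+3099 on its own.
print(len(decomposed), len(composed))
print(COMBINING_DAKUTEN in composed)
```

Normalizing the input once, before any per-character table lookup, keeps the mapping table itself free of entries for standalone combining marks.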

polm commented 3 years ago

Should be fixed in master now.

polm commented 3 years ago

Released the fix in v0.1.13. I'm closing this but let me know if you still have issues.