polm / cutlet

Japanese to romaji converter in Python
https://polm.github.io/cutlet/
MIT License

KeyError: 'ヸ' #26

Closed: mhagiwara closed this issue 3 years ago

mhagiwara commented 3 years ago
katsu.romaji("秋の日のヸオロンのためいきの身にしみてひたぶるにうら悲し。")

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/tmp/ipykernel_2037/678387450.py in <module>
----> 1 katsu.romaji("秋の日のヸオロンのためいきの身にしみてひたぶるにうら悲し。")

~/miniconda3/envs/janlpbook/lib/python3.7/site-packages/cutlet/cutlet.py in romaji(self, text, capitalize, title)
    143 
    144             # resolve split verbs / adjectives
--> 145             roma = self.romaji_word(word)
    146             if roma and out and out[-1] == 'っ':
    147                 out = out[:-1] + roma[0]

~/miniconda3/envs/janlpbook/lib/python3.7/site-packages/cutlet/cutlet.py in romaji_word(self, word)
    212             if word.char_type == 6 or word.char_type == 7: # hiragana/katakana
    213                 kana = jaconv.kata2hira(word.surface)
--> 214                 return self.map_kana(kana)
    215 
    216             # At this point this is an unknown word and not kana. Could be

~/miniconda3/envs/janlpbook/lib/python3.7/site-packages/cutlet/cutlet.py in map_kana(self, kana)
    252             nk = kana[ki + 1] if ki < len(kana) - 1 else None
    253             pk = kana[ki - 1] if ki > 0 else None
--> 254             out += self.get_single_mapping(pk, char, nk)
    255         return out
    256 

~/miniconda3/envs/janlpbook/lib/python3.7/site-packages/cutlet/cutlet.py in get_single_mapping(self, pk, kk, nk)
    285             else: return 'n'
    286 
--> 287         return self.table[kk]
    288 

KeyError: 'ヸ'

I think this is an old variant of "ヴィ".

Source: https://tatoeba.org/en/sentences/show/2478013
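The guess about the character can be checked against the Unicode database with Python's standard `unicodedata` module. This is only an illustrative sketch, not something taken from cutlet: it shows that ヸ is its own code point (U+30F8, named "KATAKANA LETTER VI") and that it decomposes into ヰ (wi) plus a combining voiced sound mark, which is consistent with reading it as an older spelling of "ヴィ".

```python
import unicodedata

char = "ヸ"

# Code point and official Unicode name of the character from the traceback.
code_point = f"U+{ord(char):04X}"
name = unicodedata.name(char)

# Canonical decomposition: the base kana followed by the combining
# voiced sound mark (dakuten, U+3099).
decomposed = unicodedata.normalize("NFD", char)
base, mark = decomposed[0], decomposed[1]

print(code_point, name)
print(unicodedata.name(base), "+", unicodedata.name(mark))
```

Because the character has no hiragana counterpart, a katakana-to-hiragana pass would plausibly leave it unchanged, which matches the key shown in the `KeyError`.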

polm commented 3 years ago

Thanks! Hadn't seen that one before.

This is fixed in master; I'll do a release soon.
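For anyone stuck on a version without the fix, one possible workaround is to rewrite the rare voiced katakana into their modern spellings before calling the converter. The sketch below is an assumption on my part and is not how the fix in master is implemented; `normalize_rare_kana` and `RARE_VOICED_KATAKANA` are hypothetical names, and it assumes the converter already handles "ヴィ" and the other modern forms.

```python
# Hypothetical pre-processing step; not part of cutlet's API.
# Maps the four rare voiced katakana to the modern "ヴ + small vowel" spelling.
RARE_VOICED_KATAKANA = str.maketrans({
    "ヷ": "ヴァ",  # U+30F7, va
    "ヸ": "ヴィ",  # U+30F8, vi
    "ヹ": "ヴェ",  # U+30F9, ve
    "ヺ": "ヴォ",  # U+30FA, vo
})


def normalize_rare_kana(text: str) -> str:
    """Return text with rare voiced katakana replaced by modern spellings."""
    return text.translate(RARE_VOICED_KATAKANA)


text = "秋の日のヸオロンのためいきの身にしみてひたぶるにうら悲し。"
normalized = normalize_rare_kana(text)
print(normalized)

# The normalized string could then be passed to the converter, e.g.
# katsu.romaji(normalized), where katsu is the instance from the report.
```

All other characters pass through untouched, so the step is safe to apply to arbitrary input.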