Open saikotek opened 2 years ago
Hello. The kana conversion module doesn't do anything to the ー character; it only converts kana characters. センセー becomes せんせー after conversion, which I think is correct.
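To illustrate the behaviour described above, here is a minimal sketch of a code-point-offset converter. It is not the add-on's actual module: the function name and constants are hypothetical, and the only assumption is the standard Unicode layout, where each katakana letter sits 0x60 above its hiragana counterpart. Because ー (U+30FC) lies outside the shifted range, it passes through unchanged.

```python
# Hypothetical sketch; not the add-on's real converter.
KATAKANA_START = 0x30A1  # ァ
KATAKANA_END = 0x30F6    # ヶ
OFFSET = 0x60            # distance between the katakana and hiragana blocks


def katakana_to_hiragana(text: str) -> str:
    """Shift katakana letters into the hiragana block; leave everything else as-is."""
    return "".join(
        chr(ord(ch) - OFFSET) if KATAKANA_START <= ord(ch) <= KATAKANA_END else ch
        for ch in text
    )


print(katakana_to_hiragana("センセー"))  # せんせー (the long vowel mark is untouched)
```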
"the accent_dict data contains fields only in katakana"
It was originally this way. The pitch accent data used in the add-on was contributed by javdejong back in 2012.
If what you need is converting セー to せえ (and, I assume, other similar pairs), we could think about how to implement it, but that's not an issue with the kana converter module.
I see. I believe it could then be implemented by converting kanji to furigana instead of doing the to_katakana() conversion?
I don't think converting kanji to furigana is necessary. Having a simple dictionary that would map kana pairs would be the most obvious solution.
E.g.:
etc.
Yeah, but is "ー" always used to mark a long vowel in おう、えい and not in おお、いい、ええ?
Hard to tell. We need a set of examples to draw a conclusion on how to do the conversion.
Hello, I came here after using your great Anki plugin, PitchAccent. When converting the pitch pattern to hiragana, I noticed that it doesn't handle the long vowel mark ー properly. It turns out that converting katakana to hiragana isn't that easy, because there are two ways to lengthen a vowel. If we simply expanded the "ー" character based on the preceding vowel, we would get words like せんせえ rather than せんせい (if the original data is written as センセー).
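The naive expansion mentioned above can be sketched roughly as follows. This is only an illustration under my own assumptions: the vowel table and the function name are hypothetical and not part of the add-on. It replaces each ー with the vowel of the preceding kana, which is exactly what yields せんせえ.

```python
# Hypothetical sketch of the naive "repeat the preceding vowel" rule.
VOWEL_ROWS = {
    "あ": "あかさたなはまやらわがざだばぱぁゃ",
    "い": "いきしちにひみりぎじぢびぴぃ",
    "う": "うくすつぬふむゆるぐずづぶぷぅゅ",
    "え": "えけせてねへめれげぜでべぺぇ",
    "お": "おこそとのほもよろをごぞどぼぽぉょ",
}
# Map every hiragana letter to the vowel it ends in.
VOWEL_OF = {kana: vowel for vowel, row in VOWEL_ROWS.items() for kana in row}


def expand_long_vowel_naively(hiragana: str) -> str:
    """Replace each ー with the vowel of the kana before it, if known."""
    out = []
    for ch in hiragana:
        if ch == "ー" and out and out[-1] in VOWEL_OF:
            out.append(VOWEL_OF[out[-1]])
        else:
            out.append(ch)  # keep ー when there is no usable preceding kana
    return "".join(out)


print(expand_long_vowel_naively("せんせー"))  # せんせえ, not the usual spelling せんせい
print(expand_long_vowel_naively("こーこー"))  # こおこお, whereas 高校 is spelled こうこう
```

Both outputs differ from the standard spelling, which is why the preceding vowel alone doesn't seem to be enough information.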
It would be best to reverse the conversion workflow: store the accents in hiragana originally, and then converting to katakana would be deterministic, right? For that you would need the original data in hiragana, but from what I've seen the accent_dict data contains fields only in katakana. Perhaps you cut out the hiragana fields?
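As a rough sketch of why that direction is the easy one: going from hiragana to katakana needs no guessing, because every hiragana letter has exactly one katakana counterpart 0x60 code points higher. The function name below is hypothetical, and this is a plain transliteration that does not try to produce ー.

```python
# Hypothetical sketch; assumes only the standard Unicode block layout.
HIRAGANA_START = 0x3041  # ぁ
HIRAGANA_END = 0x3096    # ゖ
OFFSET = 0x60            # distance between the hiragana and katakana blocks


def hiragana_to_katakana(text: str) -> str:
    """Shift hiragana letters into the katakana block; leave everything else as-is."""
    return "".join(
        chr(ord(ch) + OFFSET) if HIRAGANA_START <= ord(ch) <= HIRAGANA_END else ch
        for ch in text
    )


print(hiragana_to_katakana("せんせい"))  # センセイ
```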
I prefer to use hiragana in the pitch pattern so I can simply use it instead of the vocab kana field in Anki. If it's too hard, don't worry about it. Thanks for your hard work. よろしくお願いいたします。