Open dahlia opened 8 years ago
Thanks for your report. I'm unable to investigate this issue at the moment, but I'll try to re-visit this sometime this week.
Sorry for the late response. It's almost been a year 😆
I looked into this briefly, and it looks like there is no easy way to deal with this issue other than making a huge rule table. Or maybe I'm missing something... If anyone could suggest a solution for this, it would be much appreciated.
A huge rule table would be the easiest solution. Since you are using a mapping file, if you have a list of phrases that do not use the most common reading, you can place those phrases at the top of your file.
For example, the hanja 金 has two readings, 금 and 김. 김 is everywhere, and you can't possibly list every name with the reading 김. If you have a list of all '金' words, e.g. 대금 (代金), 금고 (金庫), 금요일 (金曜日), put them at the top of your list. If no phrase matches, use the default pronunciation. The table will grow about 10 times bigger, but it should not affect run time much.
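A minimal sketch of the phrase-first lookup described above, in Python. It is not this project's actual code or mapping-file format: `PHRASES`, `DEFAULTS`, and `to_hangul` are hypothetical names, and the entries are only the examples from this comment. It tries the longest matching phrase at each position and falls back to the per-character default reading when nothing matches.

```python
# Hypothetical phrase table: words whose reading differs from the default.
# These would sit "at the top" of the mapping file and be consulted first.
PHRASES = {
    "代金": "대금",
    "金庫": "금고",
    "金曜日": "금요일",
}

# Hypothetical per-character defaults, used only when no phrase matches.
# 김 as the default for 金 follows the suggestion in this comment.
DEFAULTS = {"金": "김"}

MAX_PHRASE_LEN = max(len(p) for p in PHRASES)


def to_hangul(text: str) -> str:
    """Convert hanja to hangul, preferring the longest matching phrase."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so 金曜日 wins over any shorter entry.
        for n in range(min(MAX_PHRASE_LEN, len(text) - i), 0, -1):
            reading = PHRASES.get(text[i:i + n])
            if reading is not None:
                out.append(reading)
                i += n
                break
        else:
            # No phrase matched here: use the default reading, or keep the
            # character unchanged if it has no entry at all.
            ch = text[i]
            out.append(DEFAULTS.get(ch, ch))
            i += 1
    return "".join(out)
```

Because each position costs at most `MAX_PHRASE_LEN` dictionary lookups, a larger phrase table increases memory use rather than per-character work, which is consistent with the point that run time should not suffer much.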
Some hanja, such as 金/讀/畵, have more than one reading. The current behavior can produce incorrect results in some cases, e.g.:
See also the following table: