murata2makoto closed this issue 8 months ago
To my surprise, ChatGPT (V3.5) is already quite good at reading the above examples!
This issue does not contain proposed changes. Moreover, as demonstrated by ChatGPT, it is possible to handle many possible readings of 生 automatically. I will close this issue.
@aleventhal wrote as a comment to another issue in a different repository:
MM>Their implementations examine code points of base characters.
This web page by a Japanese ministry shows which readings each kanji can take. But for pedagogical reasons, it oversimplifies the mess of kanji. For example, consider 生, which is taught in the first grade of Japanese elementary schools. Only 12 readings of this kanji are listed.
But the reality is different. How hard a kanji is to read in a particular context does not necessarily correspond to how hard the kanji itself is.
There are more than 100 ways of reading this character: 生きる (ikiru), 生える (haeru), 生む (umu), 先生 (sensei), 生もの (namamono), 生糸 (kiito), 生い立ち (oitachi), 弥生 (yayoi), 生憎 (ainiku), 生さぬ仲 (nasanu naka), 苔の生すまで (koke no musu made), 生簀 (ikesu), 早生 (wase), 晩生 (okute), 芝生 (shibafu), 生業 (nariwai), 生粋 (kissui), and so forth. If we consider proper names such as 福生, 羽生, 生保内, and 壬生, things become even more difficult. I do not believe that the required heuristics will be written down and implemented in the near future.
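To make the point concrete, here is a minimal sketch of word-level lookup, assuming a hand-built table of readings. The `READINGS` table and the `annotate` function are hypothetical names for illustration only; the entries are taken from the examples above. It is not how any existing reader works, and it says nothing about the heuristics a real system would need.

```python
# Hypothetical word-level reading table, built from the examples in this comment.
READINGS = {
    "生きる": "ikiru",
    "生える": "haeru",
    "生む": "umu",
    "先生": "sensei",
    "生糸": "kiito",
    "芝生": "shibafu",
    "生業": "nariwai",
    "生粋": "kissui",
}


def annotate(text, table=READINGS):
    """Greedy longest-match lookup.

    Returns a list of (surface, reading) pairs; reading is None when
    no table entry starts at that position.
    """
    max_len = max(len(word) for word in table)
    result = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in table:
                result.append((chunk, table[chunk]))
                i += n
                break
        else:
            # No entry matched: emit the single character without a reading.
            result.append((text[i], None))
            i += 1
    return result
```

The reading is keyed on the whole word, never on 生 alone, so `annotate("生")` yields no reading at all. A table like this also cannot cover the proper names mentioned above, which is exactly the difficulty.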
This page shows which kanji are taught in which grade in K-12. A DAISY reader uses this list for hiding kanji characters below the specified grade. But if we really want to mimic printed textbooks, we will have to know which kanji are taught in which semester, and different textbooks teach kanji in different orders.
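A grade-based filter of this kind can be sketched as follows. This is only an illustration under assumptions: the `GRADE` table holds a few sample entries (a real one would be loaded from the official per-grade list), and `kanji_below_grade` is a hypothetical helper, not the API of any DAISY reader.

```python
# Sample entries only; a real table would come from the official per-grade list.
# 生 is a first-grade kanji, as noted above; the other entries are illustrative.
GRADE = {"生": 1, "先": 1, "学": 1, "校": 1, "業": 3}


def kanji_below_grade(text, grade, table=GRADE):
    """Return the characters in text whose assigned grade is below `grade`.

    Characters missing from the table (kana, punctuation, unlisted kanji)
    are skipped, because their grade is unknown.
    """
    return [ch for ch in text if ch in table and table[ch] < grade]
```

For example, `kanji_below_grade("先生の生業", 3)` selects 先 and both occurrences of 生 but not 業. Matching a specific printed textbook would need a table keyed by textbook and semester rather than by grade alone.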