Doublevil / JmdictFurigana

A Japanese dictionary resource that attaches furigana to individual words

Missing 今年 (ことし), 七夕 (たなばた), and others #12

Closed (BlueRaja closed this 4 years ago)

BlueRaja commented 4 years ago

The usual pronunciation of 今年 is ことし, but the only listing in JmdictFurigana is

今年|こんねん|0:こん;1:ねん
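
For readers unfamiliar with the line format, here is a minimal Python sketch of how such a line could be split into its parts. It is not the project's own parser. The names `FuriganaSegment` and `parse_furigana_line` are made up for illustration, and the handling of index ranges such as `0-1` (one reading covering several characters) is an assumption that goes beyond what the line above shows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuriganaSegment:
    start: int    # index of the first character covered, 0-based
    end: int      # index of the last character covered, inclusive
    reading: str  # kana attached to text[start:end + 1]


def parse_furigana_line(line: str):
    """Split 'text|reading|segments' into text, reading, and segment list."""
    text, reading, raw_segments = line.strip().split("|")
    segments = []
    for part in raw_segments.split(";"):
        span, _, kana = part.partition(":")
        # Assumption: a span is either a single index ("0") or a range ("0-1").
        first, _, last = span.partition("-")
        start = int(first)
        end = int(last) if last else start
        segments.append(FuriganaSegment(start, end, kana))
    return text, reading, segments


text, reading, segments = parse_furigana_line("今年|こんねん|0:こん;1:ねん")
for seg in segments:
    # Show which characters each piece of furigana sits on.
    print(text[seg.start:seg.end + 1], seg.reading)
```

With the listing above, each kanji gets its own segment (今 with こん, 年 with ねん). A reading like ことし cannot be split per character, which is why it needs separate treatment.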

Here is the listing within the latest JMDict_e:

<entry>
  <ent_seq>1579130</ent_seq>
  <k_ele>
    <keb>今年</keb>
    <ke_pri>ichi1</ke_pri>
    <ke_pri>news1</ke_pri>
    <ke_pri>nf01</ke_pri>
  </k_ele>
  <r_ele>
    <reb>ことし</reb>
    <re_pri>ichi1</re_pri>
    <re_pri>news1</re_pri>
    <re_pri>nf01</re_pri>
  </r_ele>
  <r_ele>
    <reb>こんねん</reb>
  </r_ele>
  <sense>
    <pos>&n-adv;</pos>
    <pos>&n-t;</pos>
    <gloss>this year</gloss>
  </sense>
</entry>

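Another example is 七夕 (たなばた), whose only listing in JmdictFurigana is

七夕|しちせき|0:しち;1:せき

while the JMDict_e entry gives たなばた as the priority reading:
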
<entry>
  <ent_seq>1579640</ent_seq>
  <k_ele>
    <keb>七夕</keb>
    <ke_pri>ichi1</ke_pri>
    <ke_pri>news1</ke_pri>
    <ke_pri>nf22</ke_pri>
  </k_ele>
  <k_ele>
    <keb>棚機</keb>
  </k_ele>
  <k_ele>
    <keb>織女</keb>
  </k_ele>
  <k_ele>
    <keb>棚幡</keb>
    <ke_inf>&oK;</ke_inf>
  </k_ele>
  <r_ele>
    <reb>たなばた</reb>
    <re_pri>ichi1</re_pri>
    <re_pri>news1</re_pri>
    <re_pri>nf22</re_pri>
  </r_ele>
  <r_ele>
    <reb>しちせき</reb>
    <re_restr>七夕</re_restr>
  </r_ele>
  <sense>
    <pos>&n;</pos>
    <xref>五節句</xref>
    <gloss>Festival of the Weaver (July 7th)</gloss>
    <gloss>Star Festival (one of the five annual festivals)</gloss>
  </sense>
</entry>
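
A gap like this can be checked mechanically. The sketch below is a rough illustration, not how JmdictFurigana itself works: it lists every kanji/reading pair a JMdict entry allows (honoring `re_restr`) and reports the pairs with no furigana line. The function names `expected_pairs` and `missing_pairs` are hypothetical. The embedded XML is a trimmed copy of the entry above, with `<sense>`, `<ke_inf>`, and the priority tags left out so that it parses without the JMdict DTD. Other restrictions, such as `re_nokanji`, are not handled.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the 七夕 entry: <sense> and <ke_inf> are omitted because
# their entities (&n;, &oK;) need the JMdict DTD to resolve.
ENTRY_XML = """
<entry>
  <k_ele><keb>七夕</keb></k_ele>
  <k_ele><keb>棚機</keb></k_ele>
  <k_ele><keb>織女</keb></k_ele>
  <k_ele><keb>棚幡</keb></k_ele>
  <r_ele><reb>たなばた</reb></r_ele>
  <r_ele><reb>しちせき</reb><re_restr>七夕</re_restr></r_ele>
</entry>
"""


def expected_pairs(entry):
    """Return every (keb, reb) pair the entry allows, honoring re_restr."""
    kebs = [k.findtext("keb") for k in entry.findall("k_ele")]
    pairs = []
    for r_ele in entry.findall("r_ele"):
        reb = r_ele.findtext("reb")
        # A reading with re_restr applies only to the listed kanji forms.
        restricted = [r.text for r in r_ele.findall("re_restr")]
        for keb in (restricted or kebs):
            pairs.append((keb, reb))
    return pairs


def missing_pairs(entry, furigana_lines):
    """List the pairs that have no 'keb|reb|...' line in the furigana data."""
    covered = {tuple(line.split("|")[:2]) for line in furigana_lines}
    return [pair for pair in expected_pairs(entry) if pair not in covered]


entry = ET.fromstring(ENTRY_XML.strip())
print(missing_pairs(entry, ["七夕|しちせき|0:しち;1:せき"]))
```

Given only the しちせき line, the たなばた pairs come back as missing, including 七夕/たなばた.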

Other missing words and expressions (I haven't manually verified these, so some may be incorrect): 広がった枝 天皇 経緯 三越 東芝 平壌 伊勢丹 蒔絵 正木 晨朝 筑子 水俣病 淀川 高麗 贅沢 浪漫 西藏 可哀相 散らかった紙屑 親父 桔梗 神奈川県 御飯 愛媛県 凛々しい 烏龍茶

FWIW, both 今年|ことし and 七夕|たなばた existed in JmdictFurigana 1.2, but not 1.3. Others, like 天皇|てんのう, have never existed.

Doublevil commented 4 years ago

It was just a bunch of special readings missing from the dedicated file. Some of these readings had been in a previous version of the special readings file but were removed when I implemented a better special readings list. They are now back. I also added some kanji readings to handle stuff like 親父 and 凛々しい.

Some of the expressions from your list are names or sentence fragments. Feel free to add more proper nouns to the special readings list, but JMnedict contains a great many unsolved names, and this project does not aim to provide a comprehensive list of names (most of the ones that can't be solved are not interesting to split into furigana parts anyway).

In any case, the main ones that were missing are now part of the 2.2 release.