Open fabd opened 7 years ago
Another potential example for 田 :
水田 (nf07) shows up before 田園 (nf11).
I'm not sure. jisho.org doesn't show 水田 when searching for 田. But JMdict's description of the nfxx field indicates that nf07 marks a higher frequency of use than nf11.
nfxx: this is an indicator of frequency-of-use ranking in the wordfreq file. "xx" is the number of the set of 500 words in which the entry can be found, with "01" assigned to the first 500, "02" to the second, and so on.
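To make the nfxx description concrete, here is a minimal sketch that converts a tag into the range of frequency ranks it covers. The function name `nf_rank_range` is hypothetical; only the 500-word bucket size comes from the description above.

```python
import re


def nf_rank_range(tag, bucket_size=500):
    """Return the (first, last) frequency ranks covered by an nfxx tag."""
    match = re.fullmatch(r"nf(\d{2})", tag)
    if match is None:
        raise ValueError("not an nfxx tag: %r" % tag)
    bucket = int(match.group(1))
    if bucket < 1:
        raise ValueError("nfxx buckets start at 01")
    first = (bucket - 1) * bucket_size + 1
    last = bucket * bucket_size
    return first, last


print(nf_rank_range("nf07"))  # (3001, 3500)
print(nf_rank_range("nf11"))  # (5001, 5500)
```

A lower bucket number means a more frequent word, so 水田 (nf07) ranks above 田園 (nf11) by this measure.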
Bug
Dict lookup for 明日 should display the あした reading (currently it shows みょうにち).
Background
The SQL query sorts dictionary lookup results by the "priority" field. Most of the time that works alright. In a few cases this causes problems.
For 明日, the third, less common reading みょうにち comes first because its priority has three bits set (ichi1, news1, nf05), which makes it numerically greater than the priority of the first reading あした, which has only "ichi1" set.
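The effect can be reproduced with a small sketch. The bit values below are assumptions chosen for illustration, since the real schema's bit assignments are not shown in this issue. The point holds for any assignment: a value with more flags set on top of ichi1 is always numerically larger than ichi1 alone.

```python
# Hypothetical bit assignments; the actual schema may use different values.
FLAGS = {"ichi1": 0b001, "news1": 0b010, "nf05": 0b100}


def pri_value(tags):
    """Combine priority tags into a single integer bitmask."""
    value = 0
    for tag in tags:
        value |= FLAGS[tag]
    return value


# "position" is the reading's order within the JMdict entry (1 = first).
readings = [
    {"reading": "あした", "position": 1, "tags": ["ichi1"]},
    {"reading": "みょうにち", "position": 3, "tags": ["ichi1", "news1", "nf05"]},
]

# Sorting by the bitmask alone puts the reading with more flags first.
by_pri = sorted(readings, key=lambda r: pri_value(r["tags"]), reverse=True)
print([r["reading"] for r in by_pri])  # ['みょうにち', 'あした']
```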
Solutions
Need to figure out whether the order of the readings is actually meaningful in the JMdict file. The doc seems to imply that the first reading is the main one.
Need to update the schema to include this information somehow, since the "pri" field is not enough, or look into whether the default order of the data in the table is sufficient to solve this. This is a good time to remember that some of this could be handled in PHP, which is more flexible: just get all the results and sort / filter them in PHP (plus this should be cached at some point).
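A rough sketch of the "sort in application code" idea, written in Python for brevity rather than PHP. It assumes each fetched row carries a hypothetical `position` column recording the reading's order in the JMdict entry, and that the first reading really is the main one, which still needs to be confirmed. The `pri` values are illustrative.

```python
def sort_readings(rows):
    """Order rows by JMdict reading position, using pri only as a tie-breaker."""
    return sorted(rows, key=lambda row: (row["position"], -row["pri"]))


# Rows as they might come back from the lookup query, sorted by pri descending.
rows = [
    {"reading": "みょうにち", "position": 3, "pri": 7},
    {"reading": "あした", "position": 1, "pri": 1},
]

print([row["reading"] for row in sort_readings(rows)])  # ['あした', 'みょうにち']
```

Keeping the ordering rule in one function like this makes it easy to adjust later, and the sorted result is what would be cached.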