pombreda / mozc

Automatically exported from code.google.com/p/mozc

Wrong hinsi entries #61

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The following entries may have the wrong 品詞 (hinshi, part of speech).

=============================
どーん   28  28  4254    Dawn
じーん   28  28  4245    Jean
ぎゅっ   28  28  4179    Gyuu
うんうん    16  16  4302    Unun
どれっど    2990    2249    5850    Dread
あおん   1364    1364    5767    AON
がば  32  32  6272    GABA
がんば   1310    1310    6552    GAMBA
がんば   1310    1310    6563    GANBA
くら  2594    2594    4486    KURA
じょしょ    613    613    768    JOJO
たたい   1364    1364    5128    tatai
つん  28  28  4776    TUN
どど  24  24  5042    dodo
どりんく    2990    2244    6627    DRINK
どるーぷ    2990    2249    7399    DROOP
どろー   2990    2249    7488    Draw
どろー   2990    2249    7495    DORO
なう  2589    2589    7886    NOW
ふーふー    28  28  4405    fufu
わかい   2676    2676    5482    wakai
=============================

Example:
どーん   28  28  4254    Dawn
28 副詞,助詞類接続,*,*,*,*,4

"Dawn" is not a 副詞 (adverb).
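
To make the example above concrete, here is a minimal sketch of how one of the listed lines can be split into fields and checked against a POS ID. The field names (reading, left_id, right_id, cost, surface) and the helper names are assumptions made for illustration, not Mozc's own code, and the lookup table contains only the single ID quoted in this report.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    # Field names are assumed for illustration; the report only shows raw columns.
    reading: str
    left_id: int
    right_id: int
    cost: int
    surface: str


# Only the one ID quoted in this report; a real table would have many more rows.
POS_BY_ID = {28: "副詞,助詞類接続,*,*,*,*,4"}


def parse_entry(line: str) -> Entry:
    """Split one whitespace-separated line into its five columns."""
    reading, left_id, right_id, cost, surface = line.split()
    return Entry(reading, int(left_id), int(right_id), int(cost), surface)


def describe_pos(entry: Entry) -> str:
    """Return the POS description for the entry's first ID column, if known."""
    return POS_BY_ID.get(entry.left_id, "unknown")


entry = parse_entry("どーん   28  28  4254    Dawn")
print(entry.surface, "->", describe_pos(entry))
```

This sketch assumes no column contains whitespace, which holds for every entry listed in this report.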

Original issue reported on code.google.com by heathros...@gmail.com on 25 Oct 2010 at 5:05

GoogleCodeExporter commented 9 years ago
What is the actual problem here?  Do these entries produce any real
misconversions?

Actually, the POS (part of speech) of "どーん/Dawn" is as expected.  It stems
from the Katakana entry "どーん/ドーン". The POS of a Katakana-to-English
transliteration is basically inherited from the corresponding Katakana entry
in order to control the ranking (position) of the English transliteration. In
other words, Mozc's POS does not always correspond to the grammatical/syntactic
POS defined in paper dictionaries.  What users really care about is the quality
of the conversion. Ordinary users are not aware of which POS is assigned to
each entry.  Mozc's POS tag set and POS assignments are designed so that
overall misconversions are minimized.
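
The inheritance described above can be sketched roughly as follows. This is only an illustration of the idea, not Mozc's actual dictionary-generation code: the `Entry` type, its field names, and `inherit_pos` are hypothetical, and the cost of the Katakana entry is a placeholder because it is not given in this thread.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    # Hypothetical field names, chosen for illustration only.
    reading: str
    left_id: int
    right_id: int
    cost: int
    surface: str


def inherit_pos(katakana: Entry, english: str, cost: int) -> Entry:
    """Build an English transliteration entry that reuses the Katakana
    entry's reading and POS IDs; only the surface form and cost differ."""
    return Entry(katakana.reading, katakana.left_id, katakana.right_id, cost, english)


# Cost 0 is a placeholder; the real cost of "ドーン" is not stated in this thread.
katakana = Entry("どーん", 28, 28, 0, "ドーン")
dawn = inherit_pos(katakana, "Dawn", 4254)
print(dawn)
```

Under this scheme "Dawn" carries ID 28 because "ドーン" does, not because "Dawn" is itself an adverb.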

Original comment by t...@google.com on 27 Oct 2010 at 4:14

GoogleCodeExporter commented 9 years ago

Original comment by yukawa@google.com on 1 Apr 2012 at 2:25

GoogleCodeExporter commented 9 years ago

Original comment by yukawa@google.com on 1 Apr 2012 at 2:29