Doublevil / JmdictFurigana

A Japanese dictionary resource that attaches furigana to individual words

Entries containing the word "協会" (an example out of many!) have the wrong furiganas. #6

Closed yayoo1971 closed 7 years ago

yayoo1971 commented 7 years ago

"協会" <-> "0-1:きょうかい" --> CORRECT
"協会" <-> "0:きょう;1:かい" --> INCORRECT

Doublevil commented 7 years ago

Hello again yayoo1971. This is actually not an error. I'm currently rewriting part of the readme to better explain the aim of the project. I'll commit my work ASAP.

yayoo1971 commented 7 years ago

Alright, I'll check it.

Basically... I've done two sketches in the spirit of your README.

[Sketch: CORRECT 2]

[Sketch: INCORRECT 1]

The difference is subtle but very important.

BlueRaja commented 7 years ago

Sorry, I am still a Japanese beginner, but.. why is the difference important? Why would it be beneficial to consider a word as a special reading when it doesn't have to be?

dgoedkoop commented 7 years ago

The "correct" way makes it impossible to do searches, for example, for all words containing 会 pronounced as かい.
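To illustrate this point, here is a minimal Python sketch. It assumes the index notation from the first comment (`0:きょう;1:かい` for per-kanji segments, `0-1:きょうかい` for a single whole-word reading). The helper names `parse_furigana` and `contains_kanji_read_as` are hypothetical and are not part of JmdictFurigana.

```python
def parse_furigana(spec: str) -> list[tuple[int, int, str]]:
    """Parse "0:きょう;1:かい" or "0-1:きょうかい" into (start, end, reading) spans."""
    spans = []
    for part in spec.split(";"):
        index_range, reading = part.split(":", 1)
        # "0-1" covers characters 0..1; a bare "0" covers only character 0.
        start, _, end = index_range.partition("-")
        spans.append((int(start), int(end or start), reading))
    return spans


def contains_kanji_read_as(word: str, spec: str, kanji: str, reading: str) -> bool:
    """True if some span covers exactly `kanji` and carries exactly `reading`."""
    return any(
        word[start:end + 1] == kanji and span_reading == reading
        for start, end, span_reading in parse_furigana(spec)
    )


# The two notations from the first comment in this thread.
per_kanji = ("協会", "0:きょう;1:かい")
whole_word = ("協会", "0-1:きょうかい")

print(contains_kanji_read_as(*per_kanji, "会", "かい"))   # True
print(contains_kanji_read_as(*whole_word, "会", "かい"))  # False
```

With the whole-word notation, the reading かい is never attached to 会 on its own, so this kind of lookup has nothing to match against.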

Doublevil commented 7 years ago

So basically, what you describe as "wrong" is the whole aim of the project. I think JmdictFurigana is not the right resource for what you are trying to do. I updated the readme, and I hope the update makes it clearer what this project is and what it is not. I'll keep the issue open for a while and close it when everyone agrees or when the discussion becomes unproductive.

yayoo1971 commented 7 years ago

--> The "correct" way makes it impossible to do searches, for example, for all words containing 会 pronounced as かい.

Well, I don't follow, since you can do such searches without JmdictFurigana in the first place.

会
kunyomi : あ.う
onyomi : カイ, エ

Searching for a kanji that is part of a word, based on its readings, is quite straightforward with a simple SQL table "words" that has only these columns:

kanji : 協会
kana : きょうかい
kanjis_list : 協, 会

You get the readings via a query, or simply by building a feature where clicking on the kanji in question shows its readings and starts a search for all the words containing it. The user could even be prompted to choose whether the search is based on the "onyomi" or the "kunyomi".

--> select * from words where kanjis_list LIKE '%your_kanji%' AND kana LIKE '%your_reading%' (or, without the kanjis_list column: select * from words where kanji LIKE '%your_kanji%' AND kana LIKE '%your_reading%')
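As a rough sketch of the query above, here is a self-contained Python example using the standard `sqlite3` module. The table layout follows the columns listed earlier. The sample rows and the `search` helper are assumptions made for illustration only.

```python
import sqlite3

# In-memory database with the "words" table described above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (kanji TEXT, kana TEXT, kanjis_list TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?)",
    [
        ("協会", "きょうかい", "協, 会"),    # sample row from this thread
        ("協力", "きょうりょく", "協, 力"),  # extra sample row (assumption)
    ],
)


def search(kanji: str, reading: str) -> list[str]:
    """Return words whose kanji list contains `kanji` and whose kana contains `reading`."""
    rows = conn.execute(
        "SELECT kanji FROM words WHERE kanjis_list LIKE ? AND kana LIKE ?",
        (f"%{kanji}%", f"%{reading}%"),  # bound parameters instead of string pasting
    )
    return [row[0] for row in rows]


print(search("会", "かい"))  # ['協会']
```

Both conditions are plain substring matches, so the query checks that the kanji and the reading each occur somewhere in the word, not that the reading belongs to that particular kanji.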

Anyway, Doublevil is right; this is a non-issue and it can be closed.