wareya / nazeka

Nazeka is a rikai replacement
https://addons.mozilla.org/en-US/firefox/addon/nazeka/

Enhancement - Future EPWING support and Kenkyusha #11

Closed: epistularum closed this issue 5 years ago

epistularum commented 5 years ago

Something I've always wanted from a pop-up dictionary is good support for the Kenkyusha waei (Japanese-English) dictionary. The best support at the moment is still Rikaisama, and a lot of people stick to Firefox 57, Waterfox, Pale Moon, etc. for the sole purpose of using Rikaisama and its regex deletion feature.

This is a good occasion to improve on Rikaisama's EPWING feature (especially for Kenkyusha, given its large number of examples and whatnot).

At the moment, Rikaisama's regex feature lets us clean up the definition field, BUT in the process we're forced to delete all the examples. I've been trying to find a fix using regex, but I'm not competent enough to make a breakthrough. Here's what I've figured out so far.

Example of an entry:

まにあう【間に合う】 ローマ(maniau)
1 〔時間に遅れない〕 be in time 《for…》.
▲7 時の列車に間に合う catch [make] the 7 o'clock train
・締め切りに間に合う meet the deadline
・開演に間に合う arrive before curtain time
▲9 時の札幌行きに間に合うように空港に着いた. I arrived in time for the nine o'clock flight to Sapporo.
・「間に合うかな」「走っても間に合いそうにないね」 "Will we be in time?"―"It doesn't look like we'll be in time even if we run."
2 〔役に立つ〕 answer [serve, suit, meet] the purpose; be useful; be serviceable; be of use [service]; be good enough; 〔十分である〕 be enough; 〔用意ができる〕 be ready; 〔必要をみたす〕 meet the requirements; serve the [one's] turn [need].
▲「費用はどのぐらいかな」「5 万もあれば間に合うよ」 "And what is the expense?"―"Fifty-thousand yen should cover it."
・これだけあれば丸 1 年は間に合う. This will last us [see us through] one whole year. | This will be enough for a whole year.

In this entry, every line starting with "▲" or "・" is an example, and definition lines are identified by the regexes below:

Regular expression that matches everything that is not a definition: \n[^″*〖〈《⇒=➡【〔(〜A-Za-z0-9].*

Regular expression that matches a definition plus the one line below it: \n[″*〖〈《⇒=➡【〔(〜A-Za-z0-9].*\n.*
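
To make the difference between the two expressions concrete, here is a minimal sketch using Python's `re` module on a shortened copy of the sample entry. It is only an illustration, not how Rikaisama or Nazeka work internally; the names `ENTRY`, `DEF_START`, `NOT_DEFINITION` and `DEFINITION_PLUS_ONE` are made up for this example. Used for deletion, the first expression strips every example line. The second one only helps if the tool can keep matches instead of deleting them.

```python
import re

# Shortened copy of the sample entry (definition 2 is abbreviated here).
ENTRY = "\n".join([
    "まにあう【間に合う】 ローマ(maniau)",
    "1 〔時間に遅れない〕 be in time 《for…》.",
    "▲7 時の列車に間に合う catch [make] the 7 o'clock train",
    "・締め切りに間に合う meet the deadline",
    "2 〔役に立つ〕 answer [serve, suit, meet] the purpose; be useful.",
    '▲「費用はどのぐらいかな」「5 万もあれば間に合うよ」 "Fifty-thousand yen should cover it."',
    "・これだけあれば丸 1 年は間に合う. This will be enough for a whole year.",
])

# Characters that may start a definition line, copied from the regexes above.
DEF_START = "″*〖〈《⇒=➡【〔(〜A-Za-z0-9"

# Regex 1: any line that does NOT start like a definition.
NOT_DEFINITION = re.compile(r"\n[^" + DEF_START + r"].*")
# Regex 2: a definition line plus the single line that follows it.
DEFINITION_PLUS_ONE = re.compile(r"\n[" + DEF_START + r"].*\n.*")

# Deletion-only filtering: every example line disappears.
deleted = NOT_DEFINITION.sub("", ENTRY)

# Extraction: keep the header line plus each "definition + next line" match.
header = ENTRY.split("\n", 1)[0]
kept = header + "".join(DEFINITION_PLUS_ONE.findall(ENTRY))

print(deleted)
print("-----")
print(kept)
```

One caveat of the second expression: if a definition has no example, the "one line below" is the next definition, and that next definition's own example is then skipped.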

The perfect result should look like this (keeping one example for each definition):

まにあう【間に合う】 ローマ(maniau)
1 〔時間に遅れない〕 be in time 《for…》.
▲7 時の列車に間に合う catch [make] the 7 o'clock train
2 〔役に立つ〕 answer [serve, suit, meet] the purpose; be useful; be serviceable; be of use [service]; be good enough; 〔十分である〕 be enough; 〔用意ができる〕 be ready; 〔必要をみたす〕 meet the requirements; serve the [one's] turn [need].
▲「費用はどのぐらいかな」「5 万もあれば間に合うよ」 "And what is the expense?"―"Fifty-thousand yen should cover it."
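
The "one example per definition" result above can be sketched without regexes by walking the entry line by line. This is a hypothetical illustration in Python, not code from Rikaisama, Nazeka or nazeka_epwing_converter; the function name `keep_first_examples` and the `max_examples` parameter are assumptions. It relies only on the rule that example lines start with "▲" or "・" and treats every other line as a header or definition.

```python
# Markers that start an example line in the sample entry.
EXAMPLE_MARKERS = ("▲", "・")


def keep_first_examples(entry: str, max_examples: int = 1) -> str:
    """Keep at most `max_examples` example lines after each non-example line."""
    kept = []
    examples_seen = 0
    for line in entry.split("\n"):
        if line.startswith(EXAMPLE_MARKERS):
            examples_seen += 1
            if examples_seen > max_examples:
                continue  # drop surplus examples for this definition
        else:
            examples_seen = 0  # a header or definition line resets the count
        kept.append(line)
    return "\n".join(kept)


# Shortened copy of the sample entry (definition 2 is abbreviated here).
sample = "\n".join([
    "まにあう【間に合う】 ローマ(maniau)",
    "1 〔時間に遅れない〕 be in time 《for…》.",
    "▲7 時の列車に間に合う catch [make] the 7 o'clock train",
    "・締め切りに間に合う meet the deadline",
    "・開演に間に合う arrive before curtain time",
    "2 〔役に立つ〕 answer [serve, suit, meet] the purpose; be useful.",
    '▲「費用はどのぐらいかな」「5 万もあれば間に合うよ」 "Fifty-thousand yen should cover it."',
    "・これだけあれば丸 1 年は間に合う. This will be enough for a whole year.",
])

print(keep_first_examples(sample))
```

Because the counter resets on every non-example line, a definition with no examples does not swallow the examples of the next definition.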

TL;DR: adding support for Kenkyusha while keeping only one example per definition would be a godsend for anyone learning Japanese, and would finally be a breakthrough in the pop-up dictionary app nonsense.

wareya commented 5 years ago

Yeah, Rikaisama's per-line regex filtering wasn't a very good design IMO.

Nazeka already supports auxiliary JSON dictionaries, but you have to know how to write a converter to use the feature. It's not ideal. I'll get around to writing EPWING importers eventually.

epistularum commented 5 years ago

It would be insanely helpful if you could achieve a result that keeps one example for each definition. Where can I buy you a coffee?

wareya commented 5 years ago

This will be nazeka_epwing_converter's job once dictionaries other than shinmeikai are supported in it. You can open a new issue on its repository if you want.

https://github.com/wareya/nazeka_epwing_converter