Is there a way to specify which dictionary from which definitions are drawn?

Arkased commented 7 years ago

Is it possible to specify which dictionaries definitions are generated from? I often spend a lot of time deleting name definitions that come from JMnedict, and really only want to use JMdict definitions. (I also posted on Anki.)

carina commented 6 years ago

I had the same issue and changed meanings.py in the following hacky way, which seems to be good enough. Add at line 43:

        #MODIFIED START
        final_meanings_without_duplicates = []
        #MODIFIED END

Find the following line (might be line 98):

        return expression_string, '<br>'.join(meaning_string)

Replace with:

        #MODIFIED START
        # name-type tags used by JMnedict entries, e.g. (s) surname, (p) place
        name_tags = ("(s,", "(f,", "(u,", "(g,", "(p,", "(m,",
                     "(s)", "(f)", "(u)", "(g)", "(p)", "(m)")
        for item in meaning_string:
            # remove unnecessary definitions (names that don't add any value)
            if any(tag in item for tag in name_tags):
                continue
            # remove duplicate entries
            if item not in final_meanings_without_duplicates:
                final_meanings_without_duplicates.append(item)
        return expression_string, '<br>'.join(final_meanings_without_duplicates)
        #MODIFIED END

hsperr commented 6 years ago

@Arkased @carina

I apologize for the year-long delay in replying; I only just became aware that people had posted on both Anki and GitHub. If you still want this changed, I am happy to work with you.

@Arkased is the change that @carina proposed what you originally wanted? I am happy to give it a look and put it in.

Arkased commented 6 years ago

First off, thank you for coming back to this project. This addon is extremely helpful to my studies and I am very grateful you have kindly made it available. Any help is greatly appreciated.

The adjustment @carina introduced effectively removes all JMnedict entries from consideration (based on limited experimentation; there might be some exceptions I haven't stumbled upon), which helps when I want to use only the main JMdict entries (which is 95% of the time). But there are times when I create notes for names (from JMnedict) rather than normal nouns. As is, the definition for a surname (山口 was the example I tested) is completely blank. With the adjustment the addon is better to use, but I think being able to specify which dictionary to use would make it even more helpful.

Another issue comes up with the adjustment. The current release prefixes each definition with its reading when more than one definition is generated. With the adjustment, an expression like 京都 initially has both its JMdict and JMnedict definitions included; the latter is then removed, leaving only the JMdict definition (which works great) but with the reading still attached, i.e., "きょうと - (n) Kyoto; (P)". The reading is already specified in its own field, so it doesn't need to appear in the definition as well.
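One possible way to handle this side effect is to strip the reading prefix whenever filtering leaves a single definition. The following is only a sketch: the function name `strip_lone_reading` is hypothetical, and the prefix format (kana followed by " - ") is assumed from the 京都 example above rather than taken from the add-on's code.

```python
import re

# Assumed prefix format: one or more kana characters, then " - ".
# The range U+3040..U+30FF covers hiragana and katakana.
READING_PREFIX = re.compile(r"^[\u3040-\u30ff]+ - ")


def strip_lone_reading(meanings):
    """Drop the reading prefix when only one definition is left."""
    if len(meanings) == 1:
        return [READING_PREFIX.sub("", meanings[0], count=1)]
    # With several definitions the readings still disambiguate them.
    return list(meanings)


print(strip_lone_reading(["きょうと - (n) Kyoto; (P)"]))
```

When two or more definitions survive, the list is returned unchanged, so the per-definition readings are kept where they are still useful.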

In summary, I think the best solution would be to add the option to specify exactly which dictionaries to use (I am imagining it via checkboxes) instead of defaulting to all dictionaries. The issue I described in the second paragraph only arose due to the adjustment made by @carina, which is a slightly annoying side effect of a very helpful improvement.
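To make the request concrete, here is a rough sketch of what dictionary selection could look like behind such checkboxes. Everything in it is an assumption: the add-on has no such setting in this thread, `ENABLED_DICTIONARIES` and `select_definitions` are hypothetical names, and it presumes each definition is stored together with the name of the dictionary it came from.

```python
# Hypothetical setting: the dictionaries the user has ticked.
ENABLED_DICTIONARIES = frozenset({"JMdict"})


def select_definitions(entries, enabled=ENABLED_DICTIONARIES):
    """Keep only definitions whose source dictionary is enabled.

    `entries` is assumed to be a list of (source, definition) pairs.
    """
    return [definition for source, definition in entries if source in enabled]


entries = [
    ("JMdict", "(n) Kyoto; (P)"),
    ("JMnedict", "(p) Kyoto"),
]
print(select_definitions(entries))
print(select_definitions(entries, enabled={"JMnedict"}))
```

Filtering by source rather than by tags inside the definition text would avoid the blank result for surnames, because the user could enable JMnedict for those notes.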

hsperr commented 6 years ago

@Arkased I see what you mean. Unfortunately, I don't have much time to get into how Qt works in order to make a preferences screen, and I currently don't remember whether there is an easy way of selecting the dictionary to search in.

As an intermediate solution that may already be 95% helpful, I could implement @carina's idea and just fall back to the normal version if the resulting hits would be empty.
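A minimal sketch of that fallback, written as a standalone function rather than as the add-on's actual code. The names `is_name_entry` and `filter_meanings` are hypothetical, and the tag list is the one from @carina's patch.

```python
# Name-type tags from the patch above, e.g. (s) surname, (p) place.
NAME_TAGS = ("s", "f", "u", "g", "p", "m")


def is_name_entry(definition):
    """True if the definition carries a name tag such as "(s)" or "(p,"."""
    return any(
        f"({tag})" in definition or f"({tag}," in definition
        for tag in NAME_TAGS
    )


def filter_meanings(meanings):
    """Drop name entries and duplicates; fall back if nothing is left."""
    kept = []
    for meaning in meanings:
        if is_name_entry(meaning) or meaning in kept:
            continue
        kept.append(meaning)
    if kept:
        return kept
    # Fallback: every hit was a name, so return them all (deduplicated).
    return list(dict.fromkeys(meanings))


print(filter_meanings(["(n) Kyoto; (P)", "(p) Kyoto"]))
print(filter_meanings(["(s) Yamaguchi", "(p) Yamaguchi"]))
```

The check is case-sensitive, so the "(P)" marker at the end of a JMdict definition is not mistaken for the "(p)" place tag.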

How does that sound?

Arkased commented 6 years ago

Sounds great.

hsperr commented 6 years ago

@Arkased

Can you give me a few more test examples and the desired output? I think PR #6 implements carina's change and only returns those "names" etc. if there is no other hit (e.g. Yamaguchi is not empty). I tried a few others and it looked OK.

Arkased commented 6 years ago

Sorry if I'm misinterpreting your request; wasn't exactly sure what you meant. Regardless, I picked a few JMnedict entries semi-randomly. The results are formatted as [Expression] (input) [Meaning] [Reading] any comments.

Expression: 山下中
Meaning: (p) Yamashitanaka
Reading: 山下中[やましたなか]

Expression: 上野
Meaning: うえの - (n) section of Tokyo; (P)
Reading: 上野[うえの]
Comment: This is a case where the JMdict entry overrides the JMnedict one, which is slightly inconsistent (because it's a proper noun), but I think it's fine as is.

Expression: 三里川
Meaning: (p) Sanrigawa
Reading: 三里[みさと] 川[がわ]
Comment: The expected reading is 三里川[さんりがわ].

Expression: 上中川
Meaning: (p) Kaminakagawa
Reading: 上中[かみなか] 川[がわ]
Comment: I think the three kanji should be grouped together here, though the reading happens to match.

Expression: 神奈川沖浪裏
Meaning: おきなみ - (n) offing wave; deep water wave うら - (n) bottom (or another side that is hidden from view); undersurface; opposite side; reverse side; rear; back; behind (the house); lining; inside; out of sight; behind the scenes; proof; opposite (of a prediction, common sense, etc.); inverse (of a hypothesis, etc.); bottom (of an inning); last half (of an inning); (P)
Reading: 神奈川[かながわ] 沖[おき] 浪[なみ] 裏[うら]
Comment: Expected is something like: (work) The Great Wave off Kanagawa (woodblock print by Hokusai); The Great Wave; The Wave, with the reading 神奈川沖浪裏[かながわおきなみうら].

There's probably a way to randomly select entries from the dictionary to test, which would be a more representative sampling method.
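As a rough illustration of that idea, the sketch below draws a reproducible random sample from a list of expressions. It assumes the headwords have already been loaded into a Python list (loading the dictionary file is not shown), and `sample_expressions` is a hypothetical helper, not part of the add-on.

```python
import random


def sample_expressions(expressions, k=20, seed=0):
    """Return up to k expressions chosen uniformly at random.

    A fixed seed makes the sample reproducible between test runs.
    """
    rng = random.Random(seed)
    pool = list(expressions)
    return rng.sample(pool, min(k, len(pool)))


headwords = ["山下中", "上野", "三里川", "上中川", "神奈川沖浪裏"]
print(sample_expressions(headwords, k=3))
```

Changing `seed` gives a different sample, which is a cheap way to widen coverage over repeated runs.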

hsperr / japanese_meanings