finalion / WordQuery

word fast-querying addon for anki
https://ankiweb.net/shared/info/775418273
GNU General Public License v3.0

Capitalized letter in a word is left with no query result #49

Open hosseinalipour opened 7 years ago

hosseinalipour commented 7 years ago

I just realized that any word containing a capital letter does not get queried. The same is probably happening with non-ASCII Unicode letters too: for example, ş and Ş are different characters, one lowercase and one capital, but I haven't tested that yet. Another example is قٌربان, a Persian word containing vowel marks (diacritics) that should be removed before querying.
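
The untested Unicode case can be checked quickly in Python, the language the addon is written in. This is only a minimal sketch; `has_uppercase` is a hypothetical helper and not part of WordQuery.

```python
def has_uppercase(word: str) -> bool:
    """Return True if lowercasing would change the word.

    Hypothetical helper for illustration; not a WordQuery function.
    """
    return word != word.lower()


# str.lower() uses Unicode case mappings, so it is not limited to ASCII.
print("Ş".lower() == "ş")
print(has_uppercase("Şeker"))
print(has_uppercase("şeker"))
```

Note that `str.lower()` is locale-independent, so it maps "I" to "i" rather than to the Turkish dotless "ı". Scripts without letter case, such as Persian, pass through unchanged.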

finalion commented 7 years ago

Hi, I know very little about your language, so I don't know how to distinguish capital letters in it (English is very simple: just use the .lower() function). Would you be interested in contributing some code?

hosseinalipour commented 7 years ago

I may be able to fix it, but I don't know where exactly in the code I have to edit.

finalion commented 7 years ago

You will have to review the code, especially the mdxservice.

hosseinalipour commented 7 years ago

Unfortunately, I couldn't find the relevant section. Could you please tell me the name of the property that contains the word (where you would use .lower())? I'll do the rest myself.

finalion commented 7 years ago

I mentioned .lower() only as an example to show that the English capitalization rule is simple. I don't actually use it, because capitalized and uncapitalized words sometimes have different meanings, such as “china” and “China”.

In your case, I suggest you look at “query.py” and “base.py” in the service package, and either improve the word-purification strategy or improve the mdx service. The function names, class names, and other identifiers should help you find your way around.

hosseinalipour commented 7 years ago

I'm not familiar with this code and it has no comments, so I need at least the line number and the property name to contribute. Many dictionaries contain only lowercased words in their database: for example, in the Merriam dictionary, china is available but China is not. In some other dictionaries, maybe only China is available. So it would be better to check whether the word contains a capital letter; if so, query the word as is, and if there is no result, query the lowercased word.
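
The fallback proposed above can be sketched as follows. This is a minimal illustration, not WordQuery's actual code: `query_with_case_fallback` and the `lookup` callable are hypothetical names, and the `entries` dict merely stands in for a dictionary service that stores only lowercase headwords.

```python
from typing import Callable, Optional


def query_with_case_fallback(
    word: str, lookup: Callable[[str], Optional[str]]
) -> Optional[str]:
    """Query the word as typed; if nothing is found, retry in lowercase.

    `lookup` is a hypothetical callable that returns a definition or None.
    """
    result = lookup(word)
    if result:
        return result
    lowered = word.lower()
    # Only retry when lowercasing actually changes the word.
    if lowered != word:
        return lookup(lowered)
    return None


# Stand-in for a dictionary that only has the lowercase headword.
entries = {"china": "porcelain tableware"}
print(query_with_case_fallback("China", entries.get))
```

Trying the exact spelling first keeps “China” and “china” distinct in dictionaries that list both, and only falls back to lowercase when the exact form is missing.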

hosseinalipour commented 7 years ago

Turkish works the same way as English. For Persian, the special vowel marks must always be removed before the word is looked up.
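
Removing the Persian vowel marks could look roughly like this. It is a sketch under assumptions: `strip_vowel_marks` is a hypothetical helper, and the set of marks to drop (U+064B to U+0652 plus U+0670) is a guess at what the dictionaries need, not something taken from WordQuery.

```python
import re

# Arabic-script short-vowel and related marks: U+064B..U+0652, plus the
# superscript alef U+0670. Which marks to drop is an assumption; extend
# the class if a particular dictionary needs more.
_VOWEL_MARKS = re.compile("[\u064B-\u0652\u0670]")


def strip_vowel_marks(word: str) -> str:
    """Return the word with Arabic-script vowel marks removed."""
    return _VOWEL_MARKS.sub("", word)


# qaf + dammatan + reh + beh + alef + noon, i.e. a marked spelling of "qorban"
marked = "\u0642\u064c\u0631\u0628\u0627\u0646"
print(strip_vowel_marks(marked))
```

The sketch deliberately avoids Unicode decomposition, so precomposed letters such as آ (U+0622) are left untouched and only the standalone marks are dropped.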

finalion commented 7 years ago

I suggest you focus on the "inspect_note" function in query.py. I think you should know what the function is doing once you review that.