Open hosseinalipour opened 7 years ago
Hi, I know very little about your language, so I don't know how to distinguish its capital letters (English is very simple: just use the .lower() function). Would you be interested in contributing some code?
I may be able to fix it, but I don't know exactly where in the code I have to edit.
You have to review the code, especially the mdxservice.
Unfortunately, I couldn't find the relevant section. Could you please tell me the name of the property that contains the word (the place where you would use .lower())?
I'll do the rest myself...
I used .lower() only as an example to say that the English capitalization rule is simple. I don't actually use it, because capitalized and uncapitalized words sometimes have different meanings, such as “china” and “China”.
For your case, I would suggest you look at “query.py” and “base.py” in the service package, and either improve the word purification strategy or improve the mdx service. The function names, class names, and other identifiers may help you with that.
I'm not familiar with this code and there are no comments, so I need at least the line number and the property name to contribute. Many dictionaries contain only a lowercased word database; for example, in the Merriam dictionary, china is available and China is not. In some other dictionaries, maybe only China is available. So it would be better to check whether the word has a capital letter in it: if so, query the word itself, and if there is no result, query the lowercased word...
Turkish works the same way as English. For Persian, the special syllables must always be removed from the word before it can be found.
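As a rough illustration of the Persian case, the sketch below removes the Arabic-script short-vowel signs before a lookup. It assumes the marks in question are the code points U+064B to U+0652 (fathatan through sukun); `strip_vowel_signs` is a hypothetical helper, not a function from the add-on.

```python
# Arabic-script vowel signs: fathatan, dammatan, kasratan, fatha,
# damma, kasra, shadda, sukun (assumed to be the marks to remove).
VOWEL_SIGNS = {chr(cp) for cp in range(0x064B, 0x0652 + 1)}


def strip_vowel_signs(word: str) -> str:
    """Return the word with all Arabic-script vowel signs removed."""
    return "".join(ch for ch in word if ch not in VOWEL_SIGNS)


# A Persian word written with a dammatan (U+064C) after its first letter.
marked = "\u0642\u064c\u0631\u0628\u0627\u0646"
plain = strip_vowel_signs(marked)

print(len(marked), len(plain))  # one code point shorter after stripping
```

Removing only this explicit range leaves other combining characters alone, so letters such as ş are not affected if they happen to be stored in decomposed form.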
I suggest you focus on the "inspect_note" function in query.py. I think you will understand what the function does once you review it.
I just realized that any word with a capital letter in it does not get queried. Perhaps this also happens with Unicode characters; for example, ş and Ş are different characters, one capital and one not, but I haven't tested it yet... Another example is قٌربان, a Persian word that contains special vowels that should be removed before querying.
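For what it's worth, Python's built-in string methods already handle non-ASCII capitals, so a check like the one below could be tried when testing this. It is only a sketch: `has_capital` is a hypothetical helper and says nothing about how the add-on currently treats such words.

```python
def has_capital(word: str) -> bool:
    """Return True if any character in the word is an uppercase letter."""
    return any(ch.isupper() for ch in word)


print("Ş".lower() == "ş")     # True: str.lower() is Unicode-aware
print(has_capital("Şeker"))   # True: Ş counts as a capital letter
print(has_capital("قربان"))   # False: Persian script has no letter case
```

Because Persian script has no case, lowercasing leaves such words unchanged, and only the vowel signs need separate handling.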