Open georgthegreat opened 11 years ago
Thank you very much, it was an important, though very funny, bug :D
Basically the first problem was that the name of this dictionary wasn't resolved correctly, mostly because the zip file containing it is named "fr_FR-1990_1-3-2" but the dictionary itself is named fr_FR-1990.
It also turned out that the items in this list box were sorted automatically, though they shouldn't be, and that sorting was case-insensitive while mine was case-sensitive, so the correspondence between dictionaries and check boxes was wrong.
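A minimal sketch of how the two sort orders can drift apart. The plugin's real code is not shown in this thread, so the variable names and every dictionary name except fr_FR-1990 are made up for illustration:

```python
# Hypothetical dictionary names; only fr_FR-1990 appears in this thread.
names = ["en_GB", "Russian", "fr_FR-1990", "de_DE"]

# Case-sensitive sort: uppercase letters come before lowercase ones.
plugin_order = sorted(names)

# Case-insensitive sort, as an auto-sorting list box might do it.
listbox_order = sorted(names, key=str.casefold)

# Positions where row i of the list box no longer matches entry i of the plugin's list.
mismatches = [i for i, (a, b) in enumerate(zip(plugin_order, listbox_order)) if a != b]

print(plugin_order)
print(listbox_order)
print(mismatches)
```

Sorting both sides with the same key (or looking items up by name instead of by index) would keep the check boxes and dictionaries aligned.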
Makes me wonder whether I should make my sorting case-insensitive too, though...
Are you going to close this? Or are you waiting for me to test it?
Works well (I've downloaded the new version from some issue above).
Not fully fixed. If I turn on all three dictionaries (via the multiple languages option), some words don't get any alternatives: ômbre, räie, maybe some more.
If only French is turned on, everything works fine (they are underlined and alternatives like ombré and raie are available).
Sorry, the word räie seems to work fine.
It was actually caused by a totally different matter: at the very beginning I wrote it so that Hunspell would have a 100% hit rate on the Russian/English dictionary combination. Hunspell copes much better than the language guessing done my way, so that code could be safely removed. You can check it out at the usual link: http://goo.gl/OYqRO
Still isn't working for me.
The word "developpé", which is of French origin, gets only English alternatives, though accents aren't used in English words.
Why do you need this language definition at all?
Well, the current way of guessing the language is to choose the one which has the most suggestions. For this word we get 2 suggestions for English and 2 for French, and since English comes first, it gets selected as the current one.
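The guessing rule described above can be sketched roughly as follows. This is an illustration, not the plugin's code: `guess_language` and the placeholder suggestion strings are assumptions, since the thread does not list the actual suggestions.

```python
def guess_language(suggestions_by_language):
    """Pick the language with the most suggestions; on a tie the earlier one wins.

    `suggestions_by_language` is an ordered list of (language, suggestions) pairs.
    """
    best_language = None
    best_count = -1
    for language, suggestions in suggestions_by_language:
        # Strict '>' means a later language never replaces an earlier one on a tie.
        if len(suggestions) > best_count:
            best_language = language
            best_count = len(suggestions)
    return best_language


# Placeholder suggestions: two per language, as in the comment above.
tied = [("English", ["en-1", "en-2"]), ("French", ["fr-1", "fr-2"])]
print(guess_language(tied))
```

With a 2:2 tie the first language in the list is returned, which matches the behaviour reported for "developpé".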
A good way to solve this might be, when multiple languages are selected, to show another menu item where you can select the language for this word, so all suggestions and adding to the dictionary would be for that language. It's probably better to do so for the current session only, because saving a lot of stuff like that is a pain; at least it will make it possible to add such words to a dictionary and forget about them for the time being.
Isn't it possible to simply join the suggestions in one list?
It's possible, but when there are a lot of suggestions there is the problem of which order to feed them into the list. That of course has a solution: just take one from the first language, one from the second, and so on (if they have any at all) until the maximum is reached.
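The round-robin merge described above could look something like this. It is only a sketch: the function name `interleave` and the sample lists are hypothetical, and skipping duplicates is an added assumption.

```python
from itertools import zip_longest


def interleave(suggestion_lists, maximum):
    """Take one suggestion from each language in turn until `maximum` is reached."""
    merged = []
    if maximum <= 0:
        return merged
    # zip_longest pads exhausted lists with None, so shorter lists are simply skipped.
    for row in zip_longest(*suggestion_lists):
        for word in row:
            if word is not None and word not in merged:
                merged.append(word)
                if len(merged) == maximum:
                    return merged
    return merged


# Placeholder suggestions for three languages of different lengths.
print(interleave([["a1", "a2", "a3"], ["b1"], ["c1", "c2"]], 5))
```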
But there's still the problem of determining which language's user dictionary I should put the word into. Maybe it could be solved by making "Add to Dictionary..." a submenu with the selected languages as items, probably showing how many suggestions there are from each language (in parentheses).
Ok that seems like a good idea, most likely I'll do it))
Hunspell doesn't have any kind of difference between words?
Are you sure that a non-unified user dictionary is required? LibreOffice/OpenOffice don't have such a feature, do they?
What do you mean by difference? If you mean something like a distance function between words, well, it's definitely not public, but I could try to look for it.
I don't know if it's 100% required, but it seems logical, since there are users who switch between languages to check the text rather than use multiple languages.
Yep, the distance function is what I was talking about.
Here is one more example of bad usability:
When both English and French dictionaries are turned on, the word reunis suggests English reunion, but not French réunis, which should be much closer to the original.
This also might be caused by wrong UTF-8 handling (réunis is something like r'eunis in UTF-8).
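For reference, a small sketch of what the accented word looks like at the code point and byte level, using Python's standard `unicodedata` module. In UTF-8 the precomposed é is a single code point encoded as two bytes; the "e plus accent" shape only appears in the decomposed (NFD) form. Whether the plugin mixes the two forms is not known from this thread, so this only shows how such a mismatch could arise:

```python
import unicodedata

composed = "r\u00e9unis"                              # é as one code point (NFC)
decomposed = unicodedata.normalize("NFD", composed)   # 'e' followed by U+0301

print(len(composed), len(decomposed))   # code point counts differ
print(composed.encode("utf-8"))         # é becomes the two bytes C3 A9
print(composed == decomposed)           # raw comparison fails
# Normalizing both sides to the same form makes them comparable again.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

If words from different sources can arrive in different normalization forms, normalizing both before computing a distance avoids counting the accent as an extra character.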
Btw, if it doesn't bother you, you can check this preview of the next major version: http://goo.gl/OYqRO I used the Damerau–Levenshtein distance for the words (case-insensitive). It isn't perfect but seems to be quite OK, though maybe I'll change some things later. All your example problems from this thread seem to be resolved, at least)
Different dictionaries for different languages are preserved for now, but the default is now different dictionaries in single-dictionary mode and one big dictionary in multiple-dictionary mode (I haven't tested it thoroughly yet, though).
Also, this version adds not checking the word that is currently being typed, like in Firefox (as an option, but turned on by default).
No problem. I'll look at it, but not right now. I think I'll post an answer in a couple of days.
Seems that this update isn't working at all.
I entered the French word entree (the correct one is entrée).
List suggests:
It's working, but sadly all these words have an equal distance from entree, which is 1.
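To see why the candidates tie, here is a sketch of the restricted Damerau–Levenshtein distance (the "optimal string alignment" variant) with every operation costing 1. It is not the plugin's implementation, and the candidate words are the ones named elsewhere in this thread, since the original suggestion list is not shown:

```python
def osa_distance(a, b):
    """Optimal string alignment distance: insert, delete, substitute and
    adjacent transposition all cost 1. Comparison is case-insensitive."""
    a, b = a.casefold(), b.casefold()
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


for candidate in ["entr\u00e9e", "entrer", "entres", "entrez"]:
    print(candidate, osa_distance("entree", candidate))
```

Each candidate differs from entree by a single substitution, so a uniform-cost metric cannot tell them apart.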
Hm... Then this metric (Damerau–Levenshtein) doesn't fit, does it? As far as I can see, substituting, inserting, or deleting a single letter all have the same weight. That seems incorrect. Is it your implementation or some library function? Is it possible to edit the weights?
Most likely it's possible, but I need to look deeper; for now I just copy-pasted some algorithm for my needs))
Well, sadly, even if I change the cost of operations to make substitution cheapest, there are 3 more words with the same distance (entrer, entres, entrez), and since I sort them alphabetically, entrée ends up last among them, while Hunspell manages to successfully place it first.
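A tiny sketch of the tie-break effect described above, assuming the four words share one distance value (the `tied` value is a placeholder) and that ties are broken by a plain code point sort:

```python
tied = 1  # placeholder: the thread only says the distances are equal
distances = {"entr\u00e9e": tied, "entrer": tied, "entres": tied, "entrez": tied}

# Sort by distance first, then alphabetically by code point.
ranked = sorted(distances, key=lambda word: (distances[word], word))
print(ranked)
```

Because é (U+00E9) has a higher code point than any ASCII letter, entrée sorts after entrer, entres and entrez whenever the distances are equal.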
Well, I think it would be better to have proper weights for each letter (e.g. make substituting similar letters, or letters that are close on the keyboard, the cheapest operation), but I'm not sure this is very easy to do.
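One possible way to weight "similar" letters, sketched under loud assumptions: letters that differ only by an accent get a cheaper substitution cost (0.5 here, an arbitrary value), everything else costs 1. This is a plain weighted Levenshtein without transpositions, not Hunspell's ranking and not the plugin's code; keyboard proximity is left out.

```python
import unicodedata


def base_letter(ch):
    """Strip combining marks, e.g. 'é' -> 'e'."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def substitution_cost(x, y):
    if x == y:
        return 0.0
    if base_letter(x) == base_letter(y):
        return 0.5  # assumed discount for accent-only differences
    return 1.0


def weighted_distance(a, b, indel=1.0):
    """Levenshtein distance with accent-aware substitution costs."""
    a = unicodedata.normalize("NFC", a).casefold()
    b = unicodedata.normalize("NFC", b).casefold()
    prev = [j * indel for j in range(len(b) + 1)]
    for i, x in enumerate(a, 1):
        cur = [i * indel]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + indel,                       # deletion
                           cur[j - 1] + indel,                    # insertion
                           prev[j - 1] + substitution_cost(x, y)))
        prev = cur
    return prev[-1]


candidates = ["entrer", "entres", "entrez", "entr\u00e9e"]
ranked = sorted(candidates, key=lambda w: (weighted_distance("entree", w), w))
print(ranked)
```

With this weighting entrée ranks first for entree, though other words would still need real misspelling statistics to tune the costs.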
Actually I've had some ideas about slightly modifying the Hunspell source to allow me to merge its suggestion lists; maybe I'll try that too.
Not sure how to test it all, though. I only have some tests of common misspellings from the Aspell site, but they are not 100% reliable))
I think there is no "correct" method; any algorithm would have exceptions.
Yeah, you're right, but with good statistics about common misspellings all this stuff could be optimized further and further, to nearly perfect)) Well, at least it all deserves a little more attention from my side. Thanks for an example where it all goes wrong)
I have the french_1990 and standard English (Great Britain) dictionaries installed. I type the single word l'été: http://slovari.yandex.ru/l%27%C3%A9t%C3%A9/%D0%BF%D0%B5%D1%80%D0%B5%D0%B2%D0%BE%D0%B4/
Here is what I see. The English dictionary fails to validate it (that's OK). The French dictionary validates it (that's OK). The joined English-French dictionary fails to validate it (that's NOT OK).