GrayedFox closed this issue 4 years ago
Note: I extend the native array prototype with a custom remove() method (if Array.prototype.remove is undefined) just in case you spot that and rightly point out there is normally no such method.
Hi there Che! Thanks for the kind words!
I think ownWords should be several lists: one per language, because a word I add to Dutch shouldn’t be a word in English.
If you’re not dealing with true “personal” dictionaries, I’d say use .add (and .remove) directly! Could that work for your case?
My pleasure :)
Hmmm, you make a fair point that a custom word should perhaps keep its language as context. From a user perspective rather than a linguistic one: I imagine it won't necessarily bother someone to see that, after they have added "shizzle" to their custom word list, shizzle is now marked correct in Dutch, English, and any other language they use, if the point of adding the word is just to make sure it never gets marked as incorrect. Admittedly I'm assuming a lot on the user's behalf here :shrimp:
At the moment, using .add and .remove (which I was doing before) is a bit tricky since it results in the same bug, and I was hoping .personal might resolve the issue somehow.
Let's say someone is using English and Dutch spellers, the bug reproduction steps would be the following:
I was under the impression that the user's personal dictionary was there to let users populate a list of custom words without touching the core language dictionaries. By using nspell.personal, I can nearly get the desired effect: the code detects the content language and matches that with an nspell instance, then we call nspell.correct, which also checks any words added via nspell.personal. Problematically, there doesn't seem to be any way to remove words from the personal list once they are added, unless I'm missing something!
I think I can work around this behaviour by changing the custom word list to be an object of arrays that looks like this (and, when adding a word to the user's list of custom words, checking which in-use languages the word is already correct in):
{
  shizzle: ['de-de', 'en-au', 'nl-nl'],
  geluk: ['de-de', 'en-au']
}
Happy to do this, as it's not too much trouble, but then I'm not sure what the purpose of .personal is, to be honest :)
Right, I think I understand.
Unfortunately, this is not something that exists in Hunspell, or projects based on it. There is no revision history built in that can be undone in steps.
Hunspell understands one language at a time, such as English, based on a generated dictionary (see wooorm/dictionaries). Users do have some extra control: they can have an extra file, in the “personal” format, that adds or prohibits some words. On a Mac, for AppleSpell, that’s located in ~/Library/Spelling/en_US in this case. There is also an extra file that works in any language: ~/Library/Spelling/LocalDictionary.
But when any information changes, in the language dictionary, or in one of the personal dictionaries, AppleSpell must be restarted! (related: https://github.com/wooorm/osx-learn).
For a solution, maybe re-initialize the instance when someone closes the dialog?
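A minimal sketch of that re-initialization idea, under stated assumptions: `rebuildSpeller` and `createSpeller` are hypothetical names, and `createSpeller` is assumed to return a fresh speller built from the untouched language dictionary (with nspell, something like `() => nspell(aff, dic)`). Instead of undoing earlier edits, the old instance is thrown away and the current custom words are re-applied to a new one.

```javascript
// Hypothetical helper: rebuild a speller from scratch rather than undoing edits.
// `createSpeller` is an assumed factory returning a fresh instance,
// e.g. () => nspell(aff, dic) when using nspell.
function rebuildSpeller (createSpeller, ownWords) {
  const speller = createSpeller()
  // re-apply only the custom words that are still in the user's list
  ownWords.forEach(word => speller.add(word))
  return speller
}
```

Calling this when the user closes the dialog means a removed word simply never gets re-added, so words that belong to the core dictionary are unaffected.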
Ah okay, thanks that has helped me understand. Personal seems like it is there so that a list of user words can be kept somewhere on disk and ported around, but it's only ever read (or set) once.
I worked around the issue and now use the following code (which has helped me find a new bug!). I will close this issue and leave the workaround code here, just in case, for future eyeballs:
// add word to user.ownWords and update existing spell checker instances
addWord (word, languages) {
  const misspeltLangs = []
  languages.forEach(language => {
    const speller = this.spellers[language]
    // skip languages without a speller, or where the word is already correct
    if (speller && !speller.correct(word)) {
      speller.add(word)
      misspeltLangs.push(language)
    }
  })
  if (misspeltLangs.length > 0) this.ownWords[word] = misspeltLangs
}
// remove word from user.ownWords and update existing spell checker instances
removeWord (word) {
  // only touch the languages the word was originally misspelt in
  const languages = this.ownWords[word] || []
  languages.forEach(language => {
    if (this.spellers[language]) this.spellers[language].remove(word)
  })
  delete this.ownWords[word]
}
The above code is just a sample of one way to manage adding and removing words from an nspell instance (if you are using multiple, as I am) in a class. Here, the class instantiates nspell instances for multiple languages and keeps track of which languages the custom word is actually misspelt in, so that calling user.removeWord() only removes the custom word from languages it was never originally correct in.
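To see the workaround behave end to end without loading real dictionaries, here is a self-contained sketch. It is not the plugin's actual class: `User` and `makeStubSpeller` are hypothetical, and the stub only imitates the three speller methods used above (`correct`, `add`, `remove`).

```javascript
// Stand-in for an nspell instance (assumption: only correct/add/remove matter here).
function makeStubSpeller (words) {
  const known = new Set(words)
  return {
    correct: word => known.has(word),
    add: word => { known.add(word) },
    remove: word => { known.delete(word) }
  }
}

// Hypothetical wrapper class mirroring the addWord/removeWord methods above.
class User {
  constructor (spellers) {
    this.spellers = spellers // language code -> speller instance
    this.ownWords = {} // word -> languages the word was added to
  }

  addWord (word, languages) {
    const misspeltLangs = languages.filter(language => {
      const speller = this.spellers[language]
      if (!speller || speller.correct(word)) return false
      speller.add(word)
      return true
    })
    if (misspeltLangs.length > 0) this.ownWords[word] = misspeltLangs
  }

  removeWord (word) {
    const languages = this.ownWords[word] || []
    languages.forEach(language => {
      if (this.spellers[language]) this.spellers[language].remove(word)
    })
    delete this.ownWords[word]
  }
}

const user = new User({
  'en-gb': makeStubSpeller(['dark']),
  'de-de': makeStubSpeller(['dunkel'])
})
user.addWord('dark', ['en-gb', 'de-de']) // only recorded for 'de-de'
user.removeWord('dark') // 'en-gb' is never touched
```

Because "dark" was already correct in the British English stub, it is recorded and later removed only for German, so the British English speller keeps accepting it.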
Hello there Titus Wormer, first of all a big thanks! Thanks to your native JavaScript Hunspell wrapper I have been able to develop a smart language detecting spell checker plugin without doing any of the NLP heavy lifting. Awesome, thank you, keep up the great work!
Now for the fun part :smile_cat:: I'm using nspell.personal() to add a list of user-added words from browser storage. Here are the offending lines from my user class:
ownWords is just an array of strings with no affix rules or slashes for modelling - just words users think are words (strings cleaned of numeric and nonword characters). I would like to know how to overwrite the existing personal dictionary, or alternatively, how to remove a word from a personal dictionary once it has been added.
nspell.personal.remove() is undefined (as expected and per the docs). From what I understand of the personal file, is the nspell.personal() method actually just a wrapper around nspell.add()? Referring to this: https://github.com/wooorm/nspell/blob/c92902162193084027a5a8b7259702af90804528/lib/personal.js#L41
To provide the final bit of context: I was originally just adding all words from user.ownWords to each nspell instance with nspell.add(). The problem is that if a user adds the word "dark" and has British English, American English, and German spellers enabled, that word is added to all of those dictionaries. When the user removes the word, it is also removed from the British and American English dictionaries, and now "dark" is marked as incorrect in those languages. Enter nspell.personal(). Since I don't persist changes to core dictionaries, the bug thankfully goes away after restarting the plugin/browser, but I'm digging deep on this one!
One of the features of the extension is that users can add a word in one language to have it marked as correct in all the languages they use, and I'd love to keep it that way.
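The "dark" scenario can be reproduced in a few lines with stand-in spellers. This is only an illustration: `makeStubSpeller` is a hypothetical stub that imitates the `correct`, `add`, and `remove` methods with a plain set of words, not real nspell instances.

```javascript
// Stand-in for an nspell instance (assumption: a plain set of known words).
function makeStubSpeller (words) {
  const known = new Set(words)
  return {
    correct: word => known.has(word),
    add: word => { known.add(word) },
    remove: word => { known.delete(word) }
  }
}

const spellers = {
  'en-gb': makeStubSpeller(['dark']),
  'en-us': makeStubSpeller(['dark']),
  'de-de': makeStubSpeller(['dunkel'])
}

// Naive approach: add the custom word everywhere, later remove it everywhere.
Object.values(spellers).forEach(speller => speller.add('dark'))
Object.values(spellers).forEach(speller => speller.remove('dark'))

// The core English word has now been removed along with the custom one.
const darkStillCorrect = spellers['en-gb'].correct('dark') // false
```

Removing the word from every speller cannot tell a user-added word apart from one the language dictionary already contained, which is why the word needs to be tracked per language.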