Predelnik / DSpellCheck

Notepad++ Spell-checking Plug-in
GNU General Public License v2.0

Fill autocomplete lists from selected dictionaries #305

Open Swatirohatgi opened 2 years ago

Swatirohatgi commented 2 years ago

I am looking for a way to reuse the words already captured in the dictionary. Currently the spell checker validates words that have already been typed, but it does not offer those dictionary words as auto-complete suggestions when you start typing them in a new document.

Thanks

Predelnik commented 2 years ago

If you mean Notepad++'s own auto-complete, I believe the biggest obstacle is that plugins cannot modify Notepad++'s auto-completion. The second problem is figuring out whether spell-checking dictionaries are suitable for this at all: auto-completing from the spell-checker's suggestion list, for example, might not produce the desired outcome, while adding the whole dictionary to the list of auto-completions might cause a performance problem.

molsonkiko commented 1 year ago

The key here would definitely be storing the dictionary in a trie data structure and pre-selecting all possible auto-completions using the trie before passing the list into Scintilla's AutoCShow method. Tries are well suited to auto-completion and can handle this sort of problem quickly even in Python.
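To make the trie idea concrete, here is a minimal sketch of prefix lookup. It is not the script discussed in this thread; the names `TrieNode`, `Trie`, and `completions`, and the `limit` parameter, are made up for illustration.

```python
class TrieNode:
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}    # maps one character to the next TrieNode
        self.is_word = False  # True if the path from the root spells a stored word


class Trie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def completions(self, prefix, limit=20):
        """Return up to `limit` stored words that start with `prefix`, in sorted order."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []  # no stored word starts with this prefix
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            # Push children in reverse order so the smallest character is popped first.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results
```

Walking down the prefix costs one dictionary lookup per typed character, so the work afterwards depends on how many completions are collected rather than on the size of the whole word list.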

For a case-insensitive trie (needed to allow case-insensitive lookup in a dictionary), the implementation is a bit more complicated but not too bad.
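One possible way to get case-insensitive lookup, sketched under my own assumptions rather than taken from any existing implementation: key the trie on case-folded characters and keep the original spellings at the end of each path. The function names and the use of the empty string as the "words end here" key are illustrative choices.

```python
def build_folded_trie(words):
    """Build a nested-dict trie keyed on case-folded characters.

    The empty-string key (never a real character) holds the original
    spellings of the words that end at that node.
    """
    root = {}
    for word in words:
        node = root
        for ch in word.casefold():
            node = node.setdefault(ch, {})
        node.setdefault("", []).append(word)
    return root


def complete_ignoring_case(root, prefix):
    """Return every stored word whose case-folded form starts with the folded prefix."""
    node = root
    for ch in prefix.casefold():
        node = node.get(ch)
        if node is None:
            return []
    found = []
    stack = [node]
    while stack:
        current = stack.pop()
        for key, value in current.items():
            if key == "":
                found.extend(value)  # original spellings stored at this node
            else:
                stack.append(value)  # descend into the child node
    return sorted(found, key=str.casefold)
```

With this layout, typing `pAr` would match `Paris` and `PARity` alike, and the suggestions keep the capitalization that was in the dictionary.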

I'm working on implementing this basic functionality in PythonScript and maybe later I can circle back and see about trying to implement it here in C++.

Here's a screenshot of the dictionary autocompletion script suggesting words from the 99369-word Merriam-Webster English Dictionary:

[image]

It's quite responsive even when implemented in Python. The main disadvantage of tries is that they eat A LOT of memory (this 99369-word trie consumes about 100MB).

pryrt commented 1 year ago

@Predelnik,

FYI: @molsonkiko was prompted to post this follow-up by this N++ Community discussion, which includes another user "from the wild" who is interested in this feature.

My further comment would be: to save memory (in @molsonkiko's suggestion), and to improve performance in general, you could avoid using (or even populating) your autocomplete dictionary unless the user has enabled that option in your plugin.
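As a rough sketch of that "only populate when enabled" idea: the class below loads nothing until the option is switched on and a completion is actually requested, and it drops the data again when the option is switched off. `LazyCompletionIndex` and `load_words` are hypothetical names, and a plain sorted list stands in for whatever structure the plugin would really use.

```python
from bisect import bisect_left


class LazyCompletionIndex:
    def __init__(self, load_words):
        self._load_words = load_words  # hypothetical callable returning an iterable of words
        self._words = None             # nothing is loaded until it is needed
        self._enabled = False

    @property
    def loaded(self):
        return self._words is not None

    def set_enabled(self, enabled):
        self._enabled = enabled
        if not enabled:
            self._words = None  # release the memory when the option is turned off

    def complete(self, prefix, limit=20):
        if not self._enabled or not prefix:
            return []
        if self._words is None:
            # First request after enabling: load and sort once.
            self._words = sorted(set(self._load_words()))
        start = bisect_left(self._words, prefix)
        matches = []
        for word in self._words[start:start + limit]:
            if not word.startswith(prefix):
                break
            matches.append(word)
        return matches
```

Users who never turn the option on pay neither the load time nor the memory cost.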

And if you cannot get access to N++'s shortcut to trigger auto-completion (that is, if you cannot append to Notepad++'s list when N++ tries to autocomplete), you could have your own menu entry (which the user could define a shortcut for using Shortcut Mapper), so that if the user wanted, they could do their own dictionary-auto-complete on-demand with their own favorite shortcut.

vinsworldcom commented 1 year ago

I have not been able to append to Notepad++ auto-complete lists, but you can use Scintilla's SCI_AUTOCSHOW and push your own list. I imagine using the entire dictionary would just about always find some possible word, so you'd in a sense be overriding almost all of Notepad++'s autocompletion.

By contrast, I have a few plugins (QuickText and TagLEET) that do autocomplete and I have some PythonScripts injecting autocomplete too (for a Python IDE-like feature) but their lists are very small so after typing a few characters and exhausting the possibilities in my "custom" lists, the Notepad++ "native" autocompletion for the given language (or word complete from current document) is triggered.
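For illustration, a minimal sketch of preparing the two arguments SCI_AUTOCSHOW expects: the number of characters already typed and a single string of candidates joined by the separator character (a space by default), in sorted order. The helper name `build_autoc_request` is made up, and returning `None` to mean "show nothing and leave it to the native autocompletion" is my assumption about how a script would use it.

```python
def build_autoc_request(typed, custom_words, separator=" "):
    """Return (length_entered, item_list) for SCI_AUTOCSHOW, or None if nothing matches."""
    if not typed:
        return None
    matches = sorted(
        word
        for word in set(custom_words)
        # Skip the word already fully typed and anything containing the separator.
        if word.startswith(typed) and word != typed and separator not in word
    )
    if not matches:
        return None  # custom list exhausted: show nothing of our own
    return len(typed), separator.join(matches)


# Assumed usage inside PythonScript, where `editor` is the Scintilla wrapper object:
#     request = build_autoc_request(typed, my_words)
#     if request is not None:
#         editor.autoCShow(*request)
```

With a small custom list the helper quickly returns `None` as more characters are typed, which matches the behaviour described above where the custom suggestions run out and Notepad++'s own list takes over.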

Cheers.

molsonkiko commented 1 year ago

My autocompletion thing with PythonScript is implemented here if people are interested. This can provide some of the desired functionality discussed above.

At present it cannot properly parse a Hunspell word list. I may add that feature later.

Predelnik commented 11 months ago

Well, as it turns out, generating a list of all possible word forms from .aff/.dic files is apparently a bit of a mess right now in Hunspell (see https://github.com/hunspell/hunspell/issues/404). So even before any additional data structures come into play, just parsing the existing dictionaries into a list of words looks like quite a task to overcome first.
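To show where the difficulty starts, here is a minimal sketch that reads only the stems from a .dic file, assuming the usual layout: an entry count on the first line, then one `word/FLAGS` entry per line, optionally followed by tab-separated morphological fields. It deliberately does not apply any .aff rules, so affixed forms are not generated, and it ignores details such as escaped slashes and the encoding declared in the .aff file. `read_dic_stems` is a made-up name.

```python
def read_dic_stems(lines):
    """Return the stem of each entry from the decoded lines of a Hunspell-style .dic file."""
    stems = []
    for index, line in enumerate(lines):
        entry = line.strip()
        if not entry:
            continue
        if index == 0 and entry.isdigit():
            continue  # first line is the entry count, not a word
        # Drop the affix flags after "/" and any morphological fields after a tab.
        stem = entry.split("/", 1)[0].split("\t", 1)[0].strip()
        if stem:
            stems.append(stem)
    return stems
```

For an entry such as `walk/DGS` this yields only `walk`; the forms that the flags stand for would still have to be derived from the .aff file, which is the hard part.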