Closed Thomqa closed 6 years ago
I like the idea of supporting unicode, and I see the following challenges:
unic:box
and you would get all Unicode symbols with "box" in their description.) Any thoughts?
I found the official files in a CSV-like structure. There is also a convenient, dictionary-like index file which might be really useful. I'll look into it.
I created a branch with initial unicode support. It works with a glossary-like text file from unicode.org. These are not all unicode symbols, but it seems to be a reasonable collection of symbols with meaningful short descriptions.
The plugin loads the entire glossary into memory when launched (< 1 MB). It can then search for the entered term (non-fuzzy, case-insensitive, also matching substrings) and return the symbols found. The priority of each result still needs to be implemented (a floating-point number between 0.0 and 1.0 that should be higher the more likely it is that this is the result you were looking for). Also, the symbol definitions in krunner-symbolsrc need to be trimmed accordingly so that there are no duplicates.
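To make the search behaviour described above concrete, here is a minimal Python sketch. It is not the plugin's actual code: the tab-separated "symbol, description" line format and the function names `load_glossary` and `search` are assumptions made purely for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def load_glossary(lines: Iterable[str]) -> Dict[str, str]:
    """Load 'symbol<TAB>description' lines into memory.

    The line format is hypothetical; the real glossary file may differ.
    """
    glossary: Dict[str, str] = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        symbol, _, description = line.partition("\t")
        if description:
            glossary[symbol] = description
    return glossary


def search(glossary: Dict[str, str], term: str) -> List[Tuple[str, str]]:
    """Non-fuzzy, case-insensitive substring search over the descriptions."""
    needle = term.casefold()
    if not needle:
        return []  # an empty term would otherwise match every entry
    return [
        (symbol, description)
        for symbol, description in glossary.items()
        if needle in description.casefold()
    ]
```

With this shape, a query such as `box` matches any description that contains "box" anywhere, including in the middle of a word. Results come back in file order because no priority is computed yet.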
Nice! I will install it this weekend to test it.
As of now, the plugin (inside the unicode branch) actually supports the entire Unicode database (i.e. it knows all definitions inside the official UnicodeData.txt file). The performance seems okay to me. I also implemented an advanced heuristic to sort the results from most to least relevant (though it might need some additional tweaking).
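For readers who want to experiment with this, here is a rough Python sketch of reading UnicodeData.txt-style lines and ranking matches. UnicodeData.txt is semicolon-separated, with the code point (hex) in the first field and the character name in the second. The scoring function is only an illustrative assumption, not the heuristic used in the plugin, and the names `parse_unicode_data`, `relevance` and `ranked_search` are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_unicode_data(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (character, name) pairs from UnicodeData.txt-style lines."""
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 2:
            continue
        code_point, name = fields[0], fields[1]
        if name.startswith("<"):
            continue  # skip placeholders such as <control> and range markers
        yield chr(int(code_point, 16)), name


def relevance(name: str, term: str) -> float:
    """Illustrative score in [0.0, 1.0]; NOT the plugin's heuristic.

    Exact matches score 1.0, whole-word matches score above 0.5, and
    other substring matches score below 0.5. Within each group, a term
    that covers more of the name scores higher.
    """
    name_l, term_l = name.casefold(), term.casefold()
    if not term_l or term_l not in name_l:
        return 0.0
    if name_l == term_l:
        return 1.0
    coverage = len(term_l) / len(name_l)
    base = 0.5 if term_l in name_l.split() else 0.0
    return base + 0.5 * coverage


def ranked_search(
    entries: Iterable[Tuple[str, str]], term: str
) -> List[Tuple[float, str, str]]:
    """Return (score, character, name) tuples, most relevant first."""
    scored = [(relevance(name, term), char, name) for char, name in entries]
    hits = [item for item in scored if item[0] > 0.0]
    hits.sort(key=lambda item: -item[0])  # stable sort keeps file order on ties
    return hits
```

The weights (0.5 for a whole-word hit, the rest from coverage) are arbitrary starting points and would certainly need tweaking against real queries.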
The features have been merged into the master branch. Unicode support is disabled by default for now, but it can be enabled with a config setting (see the updated README). While I was at it, I also implemented a proper "cascading" configuration, where local definitions / settings override global ones.
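The "cascading" idea can be sketched in a few lines of Python. This is only an illustration of the override rule, not how the plugin reads its config: the function name `cascade` and any section and key names you pass in are hypothetical, and the README remains the reference for the real setting names.

```python
from typing import Dict

Config = Dict[str, Dict[str, str]]  # section name -> {key: value}


def cascade(global_cfg: Config, local_cfg: Config) -> Config:
    """Merge two configs so that local entries override global ones.

    Keys that exist only globally are kept, keys that exist only locally
    are added, and keys present in both take the local value.
    """
    # Copy each section so the global config is not modified in place.
    merged: Config = {section: dict(values) for section, values in global_cfg.items()}
    for section, values in local_cfg.items():
        merged.setdefault(section, {}).update(values)
    return merged
```

The same function extends to more than two layers by applying it repeatedly, from the most general config to the most specific one.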
I'm not happy with the heuristic of relevance for the unicode symbols yet; I hope I can improve this soon.
with aliases and search patterns in the middle of words should also match.