1ec5 / avim

Vietnamese input method extension (IME) for Firefox, Thunderbird, SeaMonkey, Komodo, etc.
http://avim.1ec5.org/
MIT License

Diacritic folding for find bars #28

Open 1ec5 opened 10 years ago

1ec5 commented 10 years ago

Firefox introduced event-based extension hooks to the find bar so that pdf.js can search PDFs. It would be really neat if AVIM could customize in-page find to ignore diacritics until diacritics are added to the search terms. The find engine would probably involve querying for text nodes that match a certain regular expression.

1ec5 commented 10 years ago

The extension hooks assume you’ll handle everything (such as highlighting) yourself, so the code would be rather involved. I need to look at whether XUL/Migemo does something simpler.

1ec5 commented 10 years ago

piroor/xulmigemo isn’t exactly simple, but perhaps I can distill it to just the functionality needed for Vietnamese.

1ec5 commented 8 years ago

https://github.com/mozilla/pdf.js/blob/master/extensions/firefox/content/PdfjsChromeUtils.jsm
https://github.com/mozilla/pdf.js/blob/master/extensions/firefox/content/PdfStreamConverter.jsm
https://dxr.mozilla.org/mozilla-central/source/browser/extensions/pdfjs/content/PdfjsChromeUtils.jsm
https://dxr.mozilla.org/mozilla-central/source/toolkit/modules/Finder.jsm#662
https://dxr.mozilla.org/mozilla-central/source/toolkit/modules/Finder.jsm#632
https://bugzilla.mozilla.org/show_bug.cgi?id=1226963

1ec5 commented 8 years ago

I’m getting closer to a working implementation using a NodeIterator and a function that replaces each vowel with a character class that represents all its precomposed variants. Naturally, NodeIterator isn’t quite as fast as the native nsIFind implementation, but the performance hit is much less noticeable in e10s windows. Once this is done, I’d like to contribute something along these lines to FindBar Tweak to fix Quicksaver/FindBar-Tweak#56.
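For readers following along, here is a minimal sketch of the approach described above. It is not the code on the branch. The names `TONES`, `FAMILIES`, `precomposedVariants`, `foldedSource`, and `findFolded` are hypothetical. The sketch assumes the page text is already in precomposed (NFC) form and that a match never spans two text nodes.

```javascript
// Combining tone marks: none, grave, acute, hook above, tilde, dot below.
const TONES = ["", "\u0300", "\u0301", "\u0309", "\u0303", "\u0323"];

// Plain vowel -> the Vietnamese base letters it can stand for (illustrative table).
const FAMILIES = new Map([
  ["a", ["a", "ă", "â"]],
  ["e", ["e", "ê"]],
  ["i", ["i"]],
  ["o", ["o", "ô", "ơ"]],
  ["u", ["u", "ư"]],
  ["y", ["y"]],
]);

// All precomposed forms of one plain vowel, built by NFC-composing base + tone.
function precomposedVariants(plainVowel) {
  let variants = "";
  for (const base of FAMILIES.get(plainVowel)) {
    for (const tone of TONES) {
      variants += (base + tone).normalize("NFC");
    }
  }
  return variants;
}

// Regular expression source in which each plain vowel becomes a character class.
// Characters that already carry diacritics are kept literal, so they match exactly.
function foldedSource(query) {
  let source = "";
  for (const ch of query.normalize("NFC")) {
    const lower = ch.toLowerCase();
    source += FAMILIES.has(lower)
      ? `[${precomposedVariants(lower)}]`
      : ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  }
  return source;
}

// Walk the text nodes under `root` and collect folded matches.
// Assumption: matches do not cross text node boundaries.
function findFolded(doc, root, query) {
  if (!query) return [];
  const pattern = new RegExp(foldedSource(query), "gi");
  const iterator = doc.createNodeIterator(root, 4); // 4 === NodeFilter.SHOW_TEXT
  const hits = [];
  for (let node = iterator.nextNode(); node; node = iterator.nextNode()) {
    for (const match of node.nodeValue.matchAll(pattern)) {
      hits.push({ node, start: match.index, end: match.index + match[0].length });
    }
  }
  return hits;
}
```

In a page, something like `findFolded(document, document.body, "viet")` would return one `{ node, start, end }` record per hit, which a caller could turn into `Range` objects for highlighting.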

1ec5 commented 8 years ago

Work is continuing on the find-fold-28 branch. Highlighting is now diacritic-folded, but find and find previous/next are still unimplemented. There will also need to be UI to disable this feature, in case the user doesn’t want diacritic folding or is using a potentially incompatible extension like Migemo or FindBar Tweak.

1ec5 commented 8 years ago

Current status:

And of course, I just realized that there’s been movement on bug 202251 within the past few months. If Firefox gains built-in diacritic folding, all this work could be moot, and I can move on to #56 and maybe reuse the finder script for a regular expression find extension. :joy:

But it all depends on whether the built-in folding handles the case where ă in the search term should match ă but not a. WebKit and Chromium get this wrong because they strip all diacritics from both the source and query strings before comparing them, as the current patch in bug 202251 does. This comment indicates that Mozilla is at least aware of the need for more nuanced folding.
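To make the difference concrete, here is a rough sketch of the two strategies. It is not taken from any browser's code. `stripAll` mimics the strip-everything comparison, and `nuancedEquals` is one possible reading of "more nuanced folding". That reading is an assumption: a query character matches a page character when the base letters agree and every mark on the query character is also present on the page character. All function names are hypothetical.

```javascript
// Strategy 1: remove every combining mark from a string before comparing.
function stripAll(s) {
  return s.normalize("NFD").replace(/\p{M}/gu, "").toLowerCase();
}

// Split one character into its base letter and its combining marks.
function splitMarks(ch) {
  const [base, ...marks] = Array.from(ch.normalize("NFD").toLowerCase());
  return { base, marks };
}

// Strategy 2 (assumed rule): the page character may carry extra marks,
// but it must carry every mark that the query character has.
function queryCharMatches(queryCh, pageCh) {
  const q = splitMarks(queryCh);
  const p = splitMarks(pageCh);
  return q.base === p.base && q.marks.every((m) => p.marks.includes(m));
}

// Compare two whole words character by character under strategy 2.
function nuancedEquals(query, text) {
  const q = Array.from(query.normalize("NFC"));
  const t = Array.from(text.normalize("NFC"));
  return q.length === t.length && q.every((ch, i) => queryCharMatches(ch, t[i]));
}
```

Under strategy 1, `stripAll("ăn") === stripAll("an")` holds, so a search for "ăn" also finds "an". Under strategy 2, `nuancedEquals("ăn", "an")` is false, while `nuancedEquals("an", "ăn")` and `nuancedEquals("ăn", "ắn")` are both true.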

1ec5 commented 7 years ago

Bug 1353790 would give a WebExtensions-based add-on a formal way to supply synonyms for searches instead of having to reinvent the find bar wheel.