Open 1ec5 opened 10 years ago
The extension hooks assume you’ll handle everything (such as highlighting) yourself, so the code would be rather involved. I need to look at whether XUL/Migemo does something simpler.
piroor/xulmigemo isn’t exactly simple, but perhaps I can distill it to just the functionality needed for Vietnamese.
https://github.com/mozilla/pdf.js/blob/master/extensions/firefox/content/PdfjsChromeUtils.jsm
https://github.com/mozilla/pdf.js/blob/master/extensions/firefox/content/PdfStreamConverter.jsm
https://dxr.mozilla.org/mozilla-central/source/browser/extensions/pdfjs/content/PdfjsChromeUtils.jsm
https://dxr.mozilla.org/mozilla-central/source/toolkit/modules/Finder.jsm#662
https://dxr.mozilla.org/mozilla-central/source/toolkit/modules/Finder.jsm#632
https://bugzilla.mozilla.org/show_bug.cgi?id=1226963
I’m getting closer to a working implementation using a `NodeIterator` and a function that replaces each vowel with a character class that represents all its precomposed variants. Naturally, `NodeIterator` isn’t quite as fast as the native `nsIFind` implementation, but the performance hit is much less noticeable in e10s windows. Once this is done, I’d like to contribute something along these lines to FindBar Tweak to fix Quicksaver/FindBar-Tweak#56.
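To make the vowel-to-character-class idea concrete, here is a minimal sketch. It is not the code on the branch: the `QUALITY` and `TONES` tables and the names `charClassFor` and `foldedPattern` are hypothetical, and it assumes the page text is already in precomposed (NFC) form. It covers only the vowels, so `d` is not folded to `đ`.

```javascript
// Hypothetical tables: each base vowel with its quality marks, plus the tone marks.
const QUALITY = {
  a: ["", "\u0306", "\u0302"], // a, a with breve, a with circumflex
  e: ["", "\u0302"],
  i: [""],
  o: ["", "\u0302", "\u031B"], // o, o with circumflex, o with horn
  u: ["", "\u031B"],
  y: [""],
};
// No tone, grave, acute, hook above, tilde, dot below.
const TONES = ["", "\u0300", "\u0301", "\u0309", "\u0303", "\u0323"];

// Every precomposed lowercase vowel, built by composing base + quality + tone.
const VARIANTS = [];
for (const base of Object.keys(QUALITY)) {
  for (const quality of QUALITY[base]) {
    for (const tone of TONES) {
      VARIANTS.push((base + quality + tone).normalize("NFC"));
    }
  }
}

// Split a character into its base letter and its combining marks.
function decompose(ch) {
  const [base, ...marks] = ch.normalize("NFD");
  return { base, marks };
}

// Return a regex fragment matching every variant that carries at least
// the marks already present on `ch`; other characters are matched literally.
function charClassFor(ch) {
  const literal = ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const { base, marks } = decompose(ch.toLowerCase());
  if (!Object.prototype.hasOwnProperty.call(QUALITY, base)) {
    return literal;
  }
  const allowed = VARIANTS.filter((variant) => {
    const candidate = decompose(variant);
    return candidate.base === base &&
      marks.every((mark) => candidate.marks.includes(mark));
  });
  return allowed.length > 0 ? "[" + allowed.join("") + "]" : literal;
}

// Build a case-insensitive pattern for a whole query string.
function foldedPattern(query) {
  const source = Array.from(query.normalize("NFC"), (ch) => charClassFor(ch)).join("");
  return new RegExp(source, "iu");
}
```

With this sketch, `foldedPattern("xoa")` matches both xóa and xoá, while a query that already contains ă only matches variants that keep the breve.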
Work is continuing on the find-fold-28 branch. Highlighting is now diacritic-folded, but find and find previous/next are still unimplemented. There will also need to be UI to disable this feature, in case the user doesn’t want diacritic folding or is using a potentially incompatible extension like Migemo or FindBar Tweak.
Current status:
- […] (`<b>` tag)
- […] (xóa versus xoá)
- […] (ễ versus ễ)

And of course, I just realized that there’s been movement on bug 202251 within the past few months. If Firefox gains built-in diacritic folding, all this work could be moot, and I can move on to #56 and maybe reuse the finder script for a regular expression find extension. :joy:
But it all depends on whether it handles the case where ă should match ắ but not a. WebKit and Chromium get this wrong, because they strip all diacritics from both source and query strings for comparison purposes, as the current patch in 202251 does. This comment indicates that Mozilla is at least aware of the need for more nuanced folding.
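The difference between the two folding strategies can be shown in a few lines. This is only an illustrative sketch: `matchesStripAll` models the strip-everything comparison described above, `matchesAsymmetric` models the more nuanced behavior, and both names are hypothetical. It assumes each letter normalizes to a single precomposed code point, which holds for Vietnamese vowels.

```javascript
// Remove every combining mark, as a strip-everything comparison would.
function stripAll(s) {
  return s.normalize("NFD").replace(/\p{M}/gu, "").toLowerCase();
}

// Symmetric folding: diacritics are dropped from both query and text.
function matchesStripAll(query, text) {
  return stripAll(text).includes(stripAll(query));
}

// A query character matches a text character when the base letters agree
// and every mark typed in the query is also present in the text.
function charMatches(queryChar, textChar) {
  const [qBase, ...qMarks] = queryChar.toLowerCase().normalize("NFD");
  const [tBase, ...tMarks] = textChar.toLowerCase().normalize("NFD");
  return qBase === tBase && qMarks.every((mark) => tMarks.includes(mark));
}

// Asymmetric folding: only diacritics missing from the query are ignored.
function matchesAsymmetric(query, text) {
  const q = Array.from(query.normalize("NFC"));
  const t = Array.from(text.normalize("NFC"));
  for (let i = 0; i + q.length <= t.length; i++) {
    if (q.every((ch, j) => charMatches(ch, t[i + j]))) {
      return true;
    }
  }
  return false;
}
```

Under the symmetric approach a query of ă finds a plain a, which is the behavior criticized above; under the asymmetric approach it finds ắ but not a.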
Bug 1353790 would provide a formal way for a WebExtensions-based addon to provide synonyms for searches instead of having to reinvent the find bar wheel.
Firefox introduced event-based extension hooks to the find bar so that pdf.js can search PDFs. It would be really neat if AVIM could customize in-page find to ignore diacritics until diacritics are added to the search terms. The find engine would probably involve querying for text nodes that match a certain regular expression.
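As a rough sketch of what "querying for text nodes that match a certain regular expression" might look like, the function below walks text nodes with the standard `document.createNodeIterator` API. The function name `findMatchingTextNodes` is hypothetical, the pattern is assumed to be built elsewhere, and highlighting and the find bar hooks are left out entirely.

```javascript
// Numeric value of NodeFilter.SHOW_TEXT, spelled out so that the sketch
// does not depend on the NodeFilter global being in scope.
const SHOW_TEXT = 4;

// Collect the text nodes under `root` whose content matches `pattern`.
// `doc` is the owning document; `pattern` is a RegExp.
function findMatchingTextNodes(doc, root, pattern) {
  const iterator = doc.createNodeIterator(root, SHOW_TEXT);
  const matches = [];
  let node;
  while ((node = iterator.nextNode())) {
    // Reset lastIndex so a global or sticky pattern starts from the beginning.
    pattern.lastIndex = 0;
    if (pattern.test(node.nodeValue)) {
      matches.push(node);
    }
  }
  return matches;
}
```

In a page this would be called as `findMatchingTextNodes(document, document.body, pattern)`. A match that spans several text nodes, for example across an inline element boundary, would not be found by this per-node approach.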