marcoagpinto / aoo-mozilla-en-dict

English Dictionaries Project (AOO+Mozilla+others)

Is there truly no way to restrict the Firefox en-GB dictionary to checking only words composed of Latin characters? #49

Closed billyswong closed 1 year ago

billyswong commented 2 years ago

From https://proofingtoolgui.org/faq.html

5) Firefox tries to spellcheck non-English words, e.g., Chinese, when using en-GB

When you activate a dictionary, it will compare all words being written with the ones in its wordlist. The spellchecker doesn't know the real language being used by humans. In this case, disable spellchecking while writing in Chinese (right-click in an editable text area and uncheck “Check Spelling”) or activate the Chinese dictionary if available.

The problem is that Chinese writing isn't "spelled", so there will never be a Chinese spellchecker. At first, I nearly thought it was a Firefox bug. But then I found that if I enable only the default "English (United States)" dictionary, the red underline on Chinese paragraphs is gone. Could there be a flag that a dictionary can set so that Firefox skips applying it to text composed of non-Latin characters? Firefox currently skips red-lining Chinese paragraphs automatically if they are prefixed by Latin letters, so there is definitely some interesting mechanism in action.

marcoagpinto commented 2 years ago

Hello!

I have no clue on how to fix that.

The dictionary is basically two text files:

  1. .aff
  2. .dic

Could you open a ticket in Firefox's Bugzilla? https://bugzilla.mozilla.org

We will see what they reply.

billyswong commented 2 years ago

I found that this bug was filed 5 years ago and has been gathering dust ever since: https://bugzilla.mozilla.org/show_bug.cgi?id=1418169

I added my own take to that bug report, but my hopes are low.

marcoagpinto commented 2 years ago

😭

I, too, filed a bug years ago about PDF rendering, if I remember correctly, and only now are they fixing it.

billyswong commented 2 years ago

A trick is described in https://bugzilla.mozilla.org/show_bug.cgi?id=1162823 (7 years ago!). They marked the dictionary SET ISO8859-1 to circumvent the issue for en-US. Some people tried to fix it properly in https://bugzilla.mozilla.org/show_bug.cgi?id=1164263, but that didn't work out.
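To illustrate why that trick might help, here is a rough Python sketch. It assumes the spellchecker skips any word that cannot be represented in the encoding the dictionary declares; that is my reading of the bug report, not something confirmed in this thread. The function name `representable` is made up for the illustration.

```python
def representable(word: str, encoding: str = "iso-8859-1") -> bool:
    """Return True if every character of `word` exists in `encoding`."""
    try:
        word.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Under the assumption above, only words for which this returns True
# would ever be compared against the dictionary's wordlist.
samples = ["colour", "café", "naïve", "你好", "λ"]
for word in samples:
    print(word, representable(word))
```

Accented Western European letters such as "é" and "ï" are part of ISO-8859-1, while Chinese characters and Greek letters are not, so only the last two samples fail the check.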

billyswong commented 2 years ago

That SET ISO8859-1 approach looks doable for en-GB. We could check whether any words fall outside that range (I saw the single-Greek-letter entries at the end of the .dic file, but we could remove them, as they would not be needed after SET ISO8859-1). If no actual words require characters outside ISO-8859-1, then we could convert the files and give it a try.
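The check could be done with a few lines of Python. This is only a sketch: it assumes the files are UTF-8 text, and the helper names `find_unencodable` and `scan_file` are hypothetical. It flags whole lines rather than parsing out the word, since every character in the file has to fit into the target encoding anyway.

```python
def find_unencodable(lines, encoding="iso-8859-1"):
    """Yield (line_number, text) for lines containing characters outside `encoding`."""
    for number, line in enumerate(lines, start=1):
        try:
            line.encode(encoding)
        except UnicodeEncodeError:
            yield number, line.rstrip("\r\n")


def scan_file(path, encoding="iso-8859-1"):
    """Scan a UTF-8 .dic or .aff file and list the lines that would not convert."""
    with open(path, encoding="utf-8") as handle:
        return list(find_unencodable(handle, encoding))


# Small in-memory demo: a count line, two Latin entries, one Greek letter.
sample = ["3\n", "colour/M\n", "café/S\n", "α\n"]
print(list(find_unencodable(sample)))  # [(4, 'α')]
```

Running `scan_file` on the real .dic and .aff would list every line that needs to be removed or rewritten before a conversion.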

marcoagpinto commented 2 years ago

The tool I created to edit dictionaries only works with UTF-8.

And Firefox only accepts signed add-ons, so I would have to submit the new dictionary to the platform.

That kind of downgrade is also risky.

Nowadays, everything should use UTF-8.

But you could try downloading the files from my GitHub and changing the encoding in the .aff; with some luck, Firefox won't complain about the dictionary's signing.
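A rough sketch of that experiment in Python, assuming the downloaded files are UTF-8 and that the .aff declares its encoding on a `SET` line. The function names and the file names in the usage comment are hypothetical, and whether Firefox accepts the result is exactly what would need testing.

```python
import re


def retarget_aff(aff_text: str) -> str:
    """Rewrite the first SET line of an .aff file to declare ISO8859-1."""
    new_text, count = re.subn(
        r"^SET[ \t]+\S+", "SET ISO8859-1", aff_text, count=1, flags=re.MULTILINE
    )
    if count == 0:
        raise ValueError("no SET line found in the .aff text")
    return new_text


def convert_file(src, dst, transform=None):
    """Re-encode a UTF-8 file as ISO-8859-1, optionally transforming the text.

    Raises UnicodeEncodeError if any character does not exist in ISO-8859-1.
    """
    # "utf-8-sig" also strips a leading byte order mark if one is present;
    # newline="" keeps the original line endings untouched.
    with open(src, encoding="utf-8-sig", newline="") as handle:
        text = handle.read()
    if transform is not None:
        text = transform(text)
    with open(dst, "w", encoding="iso-8859-1", newline="") as handle:
        handle.write(text)


# Hypothetical usage; adjust the file names to the ones in the repository:
# convert_file("en-GB.aff", "en-GB-latin1.aff", retarget_aff)
# convert_file("en-GB.dic", "en-GB-latin1.dic")
```

The conversion fails loudly instead of silently dropping characters, so any entry outside ISO-8859-1 (a typographic apostrophe, U+2019, would be one such character) has to be dealt with by hand first.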

It is a matter of trying.

I am already releasing legacy versions of the dictionary as well, and I don't want to have to make WebExtension, legacy, and ISO versions of them; it is too much work.

billyswong commented 2 years ago

I posted an "Idea" to Mozilla Connect, in the hope of reaching more people and raising the chance of it being seen by Firefox developers: https://connect.mozilla.org/t5/ideas/let-dictionary-specify-character-range-to-spellcheck-and-ignore/idi-p/9765