silnrsi / oxttools

Tools for creating language support oxt extensions for LibreOffice
MIT License

Unable to test the extension #16

Closed: sinaahmadi closed this issue 3 years ago

sinaahmadi commented 3 years ago

Hi, thanks for this nice tool. I am trying to create the extension for the Kurdish Hunspell Project. I am not able to test the extension in OpenOffice or LibreOffice, as both programs raise an error and then close automatically.

Would you please let me know if there is another way to create the same extension for other applications, like Mozilla? Is it normal that the .oxt output file is 3kB while the initial .dic and .aff together are more than 1MB?

Thanks :-)

DavidLRowe commented 3 years ago

Thanks for your input. It turns out that the existing script (and v0.3 of makeoxt) had some issues that prevented building an extension based on Hunspell dictionary input. Hopefully that is fixed in v0.4 (see https://github.com/silnrsi/oxttools/releases/download/v0.4/makeoxt.zip).

The fact that the .oxt file was so small is an indication that there was a failure building it.
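
Since an .oxt file is a zip archive, one way to confirm such a failure is to list what the archive actually contains. The following is a minimal Python sketch, not part of oxttools: the helper names `summarize_oxt` and `missing_dictionary_parts` are hypothetical, and the sketch assumes only that a working Hunspell extension should contain a non-empty .aff member and a non-empty .dic member somewhere in the archive.

```python
import io
import zipfile


def summarize_oxt(oxt_file):
    """Return {member name: uncompressed size in bytes} for an .oxt archive.

    oxt_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(oxt_file) as archive:
        return {info.filename: info.file_size for info in archive.infolist()}


def missing_dictionary_parts(sizes):
    """Return the Hunspell extensions (.aff, .dic) that have no non-empty member."""
    return [
        ext
        for ext in (".aff", ".dic")
        if not any(name.endswith(ext) and size > 0 for name, size in sizes.items())
    ]


# Example usage (the file name is taken from the thread):
#   sizes = summarize_oxt("ckbdic.oxt")
#   for name, size in sorted(sizes.items()):
#       print(f"{size:>10}  {name}")
#   print("missing:", missing_dictionary_parts(sizes))
```

If the dictionary source files total more than 1MB and the archive lists no large .aff or .dic member, the build most likely did not pick them up.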

Before adding the extension (following the instructions in USAGE.md), open a blank LibreOffice Writer document and select the Format menu, Character menu item, Font tab. Under "CTL Font", observe that the language name ("Kurdish, Central (Iraq)") is in the list.

Now build the extension (using the corrected makeoxt.exe file from the v0.4 release):

makeoxt.exe -t rtl -l "Kurdish, Central" -d ckb-Arab.aff --dicttype hunspell ckb-IQ ckbdic.oxt

In a LibreOffice Writer document, use the Tools menu, Extension Manager menu item, and then load the ckbdic.oxt that was just created. Now repeat the process above.

Note that "Kurdish, Central (Iraq)" now has the "ab✔" icon, indicating that spell checking is available.

Note: Building the extension using "ckb" rather than "ckb-IQ" gives a fifth "Kurdish" entry ("Kurdish, Central" with no country indication) that has the "ab✔" icon.

Let us know if this works for you for LibreOffice. Other programs may be able to use the Hunspell (.dic and .aff) files directly if they are placed where that program can find them. The purpose of this extension is to add the language name information along with the dictionary files so that LibreOffice can apply the language tag to text, then spell check it.

sinaahmadi commented 3 years ago

Dear @DavidLRowe, thanks for this amazing explanation. I am thrilled to use it, as I spent many days figuring out how to create the plugin and submit it to LibreOffice! Your work is invaluable. Thanks!

A silly issue that I have is that I am a macOS user and therefore not able to run the makeoxt.exe file. Would you have any suggestions before I install a virtual machine? When do you think the current version of the repository will be fixed?

DavidLRowe commented 3 years ago

The updates to makeoxt have been included in the oxttools repository (and the v0.4 release).

Installing all the dependencies to run the Python script can be daunting. That's why we've packaged makeoxt.exe for use on Windows. I'm afraid I don't have any Mac experience to guide you, but it seems that installation should be similar to what is needed for Linux (see the USAGE.md file https://github.com/silnrsi/oxttools/blob/master/docs/USAGE.md). You'll need to be able to run Python 3 at the command line and install all the dependencies.

Here's the ckbdic.oxt file that I created using the command line listed above (with ckb-IQ as the language tag), inside a .zip file. ckbdic.zip

sinaahmadi commented 3 years ago

Thanks a million, @DavidLRowe. I re-installed your repository today but got the same issue. I guess the makeoxt.exe file is the only updated version and not the repository.

I'll try to find a solution to run the code on Windows then.

Thanks again!

DavidLRowe commented 3 years ago

@sinaahmadi Can you post the results you get when you try to use the makeoxt script on your Mac?

sinaahmadi commented 3 years ago

I ran this:

makeoxt -d ckb-Arab.dic -a ckb-Arab.aff -l "Central Kurdish (Sorani)" -t rtl --publisher "Sina Ahmadi (ahmadi.sina@outlook.com)" --puburl "https://github.com/sinaahmadi/KurdishHunspell" ckb ckb.oxt

which gives me the same .oxt file that I raised the issue about. Here is the file: ckb.oxt.zip

DavidLRowe commented 3 years ago

Try changing -d ckb-Arab.dic -a ckb-Arab.aff to -d ckb-Arab.aff --dicttype hunspell. When given the .aff file, makeoxt will look for the .dic file as well.
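
For anyone scripting the build, the suggested change can be sketched in Python as an argument list. This is only an illustration: `build_makeoxt_args` is a hypothetical helper, it uses only the options that appear in this thread, and it assumes that `makeoxt` is on the PATH and that the .dic file sits next to the .aff file.

```python
import shlex


def build_makeoxt_args(aff_path, lang_name, lang_tag, output, t_option="rtl"):
    """Build a makeoxt command line that passes the .aff file to -d.

    Only options quoted in the thread are used; t_option is the value
    given to -t (the thread uses "rtl").
    """
    return [
        "makeoxt",
        "-d", aff_path,            # the .aff file, not the .dic file
        "--dicttype", "hunspell",
        "-l", lang_name,
        "-t", t_option,
        lang_tag,
        output,
    ]


args = build_makeoxt_args("ckb-Arab.aff", "Central Kurdish (Sorani)", "ckb", "ckb.oxt")
print(shlex.join(args))  # shell-quoted form of the command

# To actually run it (assumes makeoxt is installed):
#   import subprocess
#   subprocess.run(args, check=True)
```

The optional `--publisher` and `--puburl` values from the earlier command could be appended to the list in the same way.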

sinaahmadi commented 3 years ago

Now that looks much better, as the extension seems to be working in LibreOffice. However, the morphological rules in the .aff file are not applied to the entries in the .dic file, so all the words in my test text are underlined in red as incorrect! I checked the dictionary file and noticed that all the entries there are detected correctly, while any inflected forms of those lemmata (with a suffix or prefix) are detected as incorrect!

Sorry that the issue got too long...

DavidLRowe commented 3 years ago

Glad that you got the files loaded. Hope you can successfully track down the issue with the .aff file.

sinaahmadi commented 3 years ago

Thanks again very much, David. I just unzipped the .oxt file that was created by makeoxt. It turns out that, bizarrely, a blank line is added after each line of the .aff and .dic files, so each file has twice as many lines as the original.

This is a part of the .aff file, for example:

SFX B 0 تان .

SFX B 0 یان .

SFX B 0 ەوە .

SFX B 0 دا .

SFX B 0 ڕا .

SFX B 0 ش [ۆەا]

SFX B 0 شم [ۆاە]

SFX B 0 شت [ۆاە]

Do you think this is something due to the encoding or the fact that it's an RTL script?
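
The symptom described above can be checked mechanically. The sketch below is a hypothetical diagnostic in Python, not the fix that was applied to makeoxt, and it makes no claim about the cause. It assumes the file has already been extracted from the .oxt and decoded to text, and it treats a file as affected when every second line is empty.

```python
def looks_double_spaced(text):
    """Return True if a blank line follows every content line."""
    lines = text.splitlines()
    if len(lines) < 2:
        return False
    content = lines[0::2]   # lines 1, 3, 5, ... of the file
    blanks = lines[1::2]    # lines 2, 4, 6, ... of the file
    return all(not line.strip() for line in blanks) and any(
        line.strip() for line in content
    )


def remove_inserted_blank_lines(text):
    """Drop every second line, but only if the text looks double spaced."""
    if not looks_double_spaced(text):
        return text
    return "\n".join(text.splitlines()[0::2]) + "\n"


# Example usage with two of the lines quoted above:
sample = "SFX B 0 تان .\n\nSFX B 0 یان .\n\n"
print(looks_double_spaced(sample))
print(remove_inserted_blank_lines(sample), end="")
```

Because the check only looks at line positions, it behaves the same for RTL and LTR text, which makes it a quick way to compare the original files with the ones packed into the extension.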

DavidLRowe commented 3 years ago

I'll try to take a look tomorrow. It may be a bug in how the .oxt file is being built.

DavidLRowe commented 3 years ago

Yep, it was a bug in makeoxt. See https://github.com/silnrsi/oxttools/commit/712c968759f8e05a348691a85f19d9f4d058d2e8 for change.

sinaahmadi commented 3 years ago

Thanks. I'm afraid the problem persists.

sinaahmadi commented 3 years ago

Hi @DavidLRowe, I submitted a pull request that will solve this problem. I tested it and it solved the problem on my files.

Thanks again.