Open Richard57 opened 5 years ago
I concur with this assertion. The target language may use NFC or NFD characters, and makeoxt should be agnostic and hands-off about this. In my case the NFC normalization breaks LO's ability to correctly identify words that use NFD characters because my AFF file does use ICONV, as @Richard57 suggests, and my DIC file uses NFD characters.
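To illustrate the mismatch described above, here is a minimal Python sketch using only the standard `unicodedata` module. The sample word and the `dic_words` set are made up for illustration and are not taken from any real DIC file, and the set lookup is only a stand-in for dictionary matching, not a model of how Hunspell works internally.

```python
import unicodedata

# Hypothetical dictionary entry stored in NFD: base letters followed by
# combining acute accents (U+0301).
nfd_word = "e\u0301te\u0301"
dic_words = {nfd_word}

# What a blanket NFC pass over the dictionary text would produce.
nfc_dic_words = {unicodedata.normalize("NFC", w) for w in dic_words}

# Text that reaches the lookup in NFD (for example after input conversion).
typed = "e\u0301te\u0301"

found_in_original = typed in dic_words
found_after_nfc = typed in nfc_dic_words

print(len(nfd_word), len(unicodedata.normalize("NFC", nfd_word)))
print(found_in_original, found_after_nfc)
```

The two strings look identical on screen but consist of different code points, so an exact comparison against the NFC-normalised copy fails.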
@n8marti I'm currently looking at this. I'd love to have some simple test data, say your AFF file and a DIC file with six words that include NFD characters. I can then make a test file using the words from the DIC file.
I had to rename the files b/c github doesn't like the non-txt extensions. I've made some other changes to these files since I last built my OXT extension, but I think they will still exhibit the problem if you build it with makeoxt.
Commit 461379c attempts to address this issue. A `-n None` parameter is added to bypass the default NFC normalization. In addition, some changes were made to the documentation. (I hope it's okay to have included sg-CF in an example. Thanks, @n8marti, for the sample files.)
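For readers who want a feel for what such a bypass amounts to, here is a rough Python sketch. It is not the code from the commit: the function name `normalise_text` and its `form` argument are hypothetical, and the only assumption carried over from this thread is that a value of `None` means "leave the text alone".

```python
import unicodedata


def normalise_text(text, form="NFC"):
    """Return text in the requested normalisation form.

    Hypothetical helper: a form of None (or the string "None", as it
    might arrive from a command line) returns the text untouched.
    """
    if form is None or form == "None":
        return text
    return unicodedata.normalize(form, text)


decomposed = "e\u0301"  # 'e' followed by a combining acute accent
untouched = normalise_text(decomposed, "None")
composed = normalise_text(decomposed)  # default NFC
```

With the bypass selected, NFD dictionary entries keep their original code points; with the default, they are composed into NFC.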
I have not yet built the Windows executable, but this should work on Linux. Any feedback welcomed.
makeoxt.exe available in zip file at https://github.com/silnrsi/oxttools/releases/tag/v0.6
Great. This (linux version) works for me now, thanks.
Normalising the text of the Hunspell dictionary and affix files is inappropriate.
This normalisation is performed by function `zipnfcfile()` in script `makeoxt`.