Closed cdchapman closed 5 years ago
A similar situation and a workaround came up in an early ruby-ffi
discussion. I'll add another commit to use this approach.
When FFI attempts to bind a function that doesn't exist, it raises an FFI::NotFoundError
exception. Could you catch that when attaching the additional functions, and fail silently?
@cdchapman, thank you, this is exactly the functionality I need! @postmodern, what still needs to be done in this PR to get it merged?
One more idea for ease of use: make add_dic search for dictionaries in Hunspell.directories, as the FFI::Hunspell::Dict.open method does, to make it easier to add dictionaries for other languages.
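A rough sketch of what such a lookup could look like. The helper name resolve_dic_path is hypothetical and not part of ffi-hunspell, and the list of directories is passed in explicitly (in practice it might come from Hunspell.directories):

```ruby
# Hypothetical helper, not part of ffi-hunspell.
# Accepts either an explicit path to a .dic file or a bare dictionary
# name such as "en_US", and returns the path to use (or nil if not found).
def resolve_dic_path(name, directories)
  # An explicit path to an existing file is used as-is.
  return name if File.file?(name)

  # Otherwise look for "<name>.dic" in each directory, in order.
  directories.each do |dir|
    candidate = File.join(dir, "#{name}.dic")
    return candidate if File.file?(candidate)
  end

  nil
end
```

add_dic could then accept a bare name like en_US and fall back to the path as given when nothing is found in the search directories.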
My use case: I want to check words against Russian, English, and my custom dictionary (with my domain words).
For now, with the code from this pull request, I can do it like this:
require 'ffi/hunspell'
dict = FFI::Hunspell.dict('ru_RU')
dict.add_dic('/usr/share/hunspell/en_US.dic') # ← I want it to be just en_US, I don't want to stick to any one path (which differs between distros, etc.)
dict.add_dic('/full/path/to/custom.dic') # dxg/MS is here
# Usage:
dict.check?('собака'.encode(dict.encoding)) # => true
dict.check?('dog'.encode(dict.encoding)) # => true
dict.check?('dxg'.encode(dict.encoding)) # => true
I can continue work on this PR with your permission. WDYT?
Sorry for the delay. Only one minor issue, but this looks ready to be merged.
Hi @Envek. Sorry for the delay. It would not make sense to use the Russian affix file, for example, with the en_US dictionary, because affixes mean different things in different languages. Extra dictionaries reuse the affix file of the main dictionary. See hunspell/hunspell#348.
A better approach would be the following:
require 'ffi/hunspell'
dict = FFI::Hunspell.dict('ru_RU')
dict.add_dic('/full/path/to/extra_russian_words.dic') # medical terms are here
english_dict = FFI::Hunspell.dict('en_US')
english_dict.add_dic('/full/path/to/custom.dic') # dxg/MS is here
# Usage:
dict.check?('собака'.encode(dict.encoding)) # => true
dict.check?('полиглактином'.encode(dict.encoding)) # => true
english_dict.check?('dog'.encode(english_dict.encoding)) # => true
english_dict.check?('dxg'.encode(english_dict.encoding)) # => true
In any case, it is useful to know which language is being checked. An application could guess the language using some sort of heuristic, but it is good to be explicit about the guessing because it may choose the wrong dictionary.
Yes, I figured that out and ended up doing exactly this: two main dictionaries (one per language) and two additional dictionaries (one for each language).
But anyway this PR is still needed.
Been busy at work. Finally got around to working on ffi-hunspell. Merged and will be in 0.6.0.
Is it possible by chance to release a new version to offer this new feature?
FYI, this feature shipped in 0.6.0, released last November. https://github.com/postmodern/ffi-hunspell/blob/master/ChangeLog.md#060--2020-11-28
Requires at least hunspell version 1.3.4; see the commit where this interface was introduced. I don't know whether there is a way to attach the function only for specific versions of hunspell.