Open tituomin opened 5 years ago
I have patched our version of the plugin (based on v0.3.0) and added a configuration parameter, expandCompounds, that optionally expands compound words (yhdyssanat) into separate tokens.
https://github.com/City-of-Helsinki/elasticsearch-analysis-voikko/commit/9a6bd8165e4de3a7d4f8bbe0993ebaec17197f94
I would like to get this feature into master and upstream, if you find it desirable. I can port it to master myself, but currently we are using 0.3.0.
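For illustration, here is a rough sketch of how index settings with the option enabled might look, built as a Python dict and printed as JSON. Only the parameter name expandCompounds comes from the patch described above. The filter type "voikko", the filter name "voikko_expand", the analyzer name, and the choice of the standard tokenizer are assumptions made for the example, so check them against the plugin's README before use.

```python
import json


def build_index_settings(expand_compounds=True):
    """Return hypothetical index analysis settings as a plain dict.

    Assumptions (not confirmed by this thread): the token filter type is
    "voikko"; "voikko_expand" and "finnish_expanded" are made-up names.
    """
    return {
        "analysis": {
            "filter": {
                "voikko_expand": {
                    "type": "voikko",  # assumed filter type
                    # Parameter added by the patch discussed in this issue.
                    "expandCompounds": expand_compounds,
                }
            },
            "analyzer": {
                "finnish_expanded": {
                    "type": "custom",
                    "tokenizer": "standard",  # assumed tokenizer choice
                    "filter": ["lowercase", "voikko_expand"],
                }
            },
        }
    }


if __name__ == "__main__":
    # Print the settings so they can be pasted into an index creation request.
    print(json.dumps(build_index_settings(), indent=2))
```

Since the option is described as optional, passing expand_compounds=False in this sketch simply writes the flag as false.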
We have found that extracting the parts of compound words is highly desirable in the index analysis stage, for several reasons:
Sounds great. If you open a PR, I look forward to merging it.
@komu here is my attempt at a PR.