Closed Finnstax closed 2 years ago
Hi @Finnstax, thanks for opening the issue, I am able to reproduce it locally.
In my tests, if I keep IoT, Internet Of Things as a synonym set and change the post title to simply IoT, I still get the same error. Removing the spaces in Internet of Things makes it work.
That said, it seems you found an edge case. We are currently using both word_delimiter_graph and synonym_graph during index time, and that is not fully supported. We could avoid using synonym_graph during sync time by setting a default_search analyzer as discussed in the past. We might get back to that approach.
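For readers following along, here is a rough sketch of what the default_search idea could look like as a mapping filter. It is not ElasticPress's implementation and has not been tested against a live cluster. It assumes the index analyzer lives under settings.analysis.analyzer.default and that the synonym token filter is registered as ep_synonyms_filter; both names are assumptions, as is the helper name sketch_move_synonyms_to_search. Elasticsearch itself does treat an analyzer named default_search as the index-wide search-time default.

```php
<?php
/**
 * Sketch: keep synonyms out of the index-time analyzer and apply them
 * only at search time through a "default_search" analyzer.
 *
 * Assumptions (not confirmed by this thread): the index analyzer is
 * stored under settings.analysis.analyzer.default and the synonym
 * filter is named "ep_synonyms_filter".
 */
function sketch_move_synonyms_to_search( array $mapping, string $synonym_filter = 'ep_synonyms_filter' ): array {
    $default = $mapping['settings']['analysis']['analyzer']['default'] ?? null;

    // Leave the mapping untouched when the expected structure is missing.
    if ( ! is_array( $default )
        || ! isset( $default['filter'] )
        || ! is_array( $default['filter'] )
        || ! in_array( $synonym_filter, $default['filter'], true )
    ) {
        return $mapping;
    }

    // The search-time analyzer keeps the full chain, synonyms included.
    $mapping['settings']['analysis']['analyzer']['default_search'] = $default;

    // The index-time analyzer drops the synonym filter only.
    $mapping['settings']['analysis']['analyzer']['default']['filter'] = array_values(
        array_filter(
            $default['filter'],
            static function ( $name ) use ( $synonym_filter ) {
                return $name !== $synonym_filter;
            }
        )
    );

    return $mapping;
}

// Register only inside WordPress. The late priority (100) is a guess at
// running after the synonym filter has been added to the mapping.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'ep_post_mapping', 'sketch_move_synonyms_to_search', 100 );
}
```

As with any mapping change, the index would have to be deleted and recreated before this takes effect.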
While we discuss it, do you mind adding the following snippet to your codebase and checking if that fixes the problem?
add_filter(
    'ep_post_mapping',
    function( $mapping ) {
        try {
            // Stop the word delimiter filter from splitting tokens where the letter case changes.
            $mapping['settings']['analysis']['filter']['ewp_word_delimiter']['split_on_case_change'] = false;
        } catch ( \Throwable $th ) {
            /* noop */
        }
        return $mapping;
    }
);
Thanks!
Hey @felipeelia - thank you so much for your reply.
Unfortunately, adding the snippet does nothing. I still get the same results as described above.
Is there anything else I could try? Or should we just avoid using synonyms with spaces for now?
Sorry @Finnstax, I should have mentioned that you'll need to send the mapping again. Can you please run wp elasticpress index --setup --yes --show-errors? This will delete your index, send the correct mapping, and sync everything again. Thanks!
Hey @felipeelia, thank you. That worked flawlessly.
The synonyms with spaces are now also getting indexed and are no longer throwing errors.
I noticed that the synonyms (unlike the other search results) are case sensitive - is there a way to fix this too?
Hey @Finnstax, glad to read it worked! Have you already tried setting the synonyms in lowercase? I think ElasticPress already indexes all the content in lowercase, so that should work. If that does not work, you can also try lowercasing the search term sent by users.
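A minimal sketch of the second suggestion, lowercasing the user's search term before the query runs. It uses WordPress's pre_get_posts action; the helper name sketch_lowercase_search_term is made up for illustration, and whether this is the best hook for an ElasticPress-backed search is an assumption.

```php
<?php
/**
 * Sketch: lowercase a search term, using the multibyte-aware function
 * when the mbstring extension is available.
 */
function sketch_lowercase_search_term( string $term ): string {
    return function_exists( 'mb_strtolower' )
        ? mb_strtolower( $term, 'UTF-8' )
        : strtolower( $term );
}

// Register only inside WordPress, and only touch front-end search queries.
if ( function_exists( 'add_action' ) ) {
    add_action(
        'pre_get_posts',
        static function ( $query ) {
            if ( is_admin() || ! $query->is_search() ) {
                return;
            }
            $term = $query->get( 's' );
            if ( is_string( $term ) && '' !== $term ) {
                $query->set( 's', sketch_lowercase_search_term( $term ) );
            }
        }
    );
}
```

One side effect to keep in mind: because the query's s variable is overwritten, templates that echo the search term back to the visitor will show the lowercased version.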
I'm going ahead and closing this one out in favor of #2877.
Describe the bug
If I add a synonym set containing a space, no post containing that set will be indexed.
The post with id 845 has the post_title 'Internet of Things (IoT)' which I will use to demonstrate this effect.
=> No synonym set is added via the dashboard ➡️ the post is getting indexed
=> If I add the following synonym set via the dashboard
➡️ the post (and every post containing the synonym, so each blog post with IoT or Internet of Things) will throw an error on indexing and cannot be found
=> If I add the following synonym set via the dashboard
➡️ the post is getting indexed
Environment information
Where do you host your Elasticsearch server? elastic.co
We're also using the WPML Integration for ElasticPress - does it have anything to do with the tokenizer? Any guidance in the right direction is highly appreciated!