Open mhkhung opened 5 months ago
For what it's worth, here's the patch I ended up with for v4.2.12. I'm no Elasticsearch expert; I just copied everything from the current docs, and it got my server (kind of) working.
BTW, I'm using it in my Mastodon devops setup, link for anyone who's interested.
diff --git a/app/chewy/accounts_index.rb b/app/chewy/accounts_index.rb
--- a/app/chewy/accounts_index.rb
+++ b/app/chewy/accounts_index.rb
@@ -23,7 +23,7 @@ class AccountsIndex < Chewy::Index
     analyzer: {
       natural: {
-        tokenizer: 'standard',
+        tokenizer: 'ik_max_word',
         filter: %w(
           lowercase
           asciifolding
@@ -36,7 +36,7 @@ class AccountsIndex < Chewy::Index
       },
       verbatim: {
-        tokenizer: 'standard',
+        tokenizer: 'ik_max_word',
         filter: %w(lowercase asciifolding cjk_width),
       },
diff --git a/app/chewy/statuses_index.rb b/app/chewy/statuses_index.rb
--- a/app/chewy/statuses_index.rb
+++ b/app/chewy/statuses_index.rb
@@ -21,14 +21,23 @@ class StatusesIndex < Chewy::Index
       },
     },
+    char_filter: {
+      tsconvert: {
+        type: 'stconvert',
+        keep_both: false,
+        delimiter: '#',
+        convert_type: 't2s',
+      },
+    },
+
     analyzer: {
       verbatim: {
-        tokenizer: 'uax_url_email',
+        tokenizer: 'ik_max_word',
         filter: %w(lowercase),
       },
       content: {
-        tokenizer: 'standard',
+        tokenizer: 'ik_max_word',
         filter: %w(
           lowercase
           asciifolding
@@ -38,6 +47,7 @@ class StatusesIndex < Chewy::Index
           english_stop
           english_stemmer
         ),
+        char_filter: %w(tsconvert),
       },
       hashtag: {
diff --git a/app/chewy/tags_index.rb b/app/chewy/tags_index.rb
--- a/app/chewy/tags_index.rb
+++ b/app/chewy/tags_index.rb
@@ -4,15 +4,25 @@ class TagsIndex < Chewy::Index
   include DatetimeClampingConcern
   settings index: index_preset(refresh_interval: '30s'), analysis: {
+    char_filter: {
+      tsconvert: {
+        type: 'stconvert',
+        keep_both: false,
+        delimiter: '#',
+        convert_type: 't2s',
+      },
+    },
+
     analyzer: {
       content: {
-        tokenizer: 'keyword',
+        tokenizer: 'ik_max_word',
         filter: %w(
           word_delimiter_graph
           lowercase
           asciifolding
           cjk_width
         ),
+        char_filter: %w(tsconvert),
       },
       edge_ngram: {
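If you adapt a patch like this by hand, it is easy to reference a char filter in an analyzer without defining it. Here is a minimal sketch of a sanity check, assuming the analysis settings are copied into a plain Ruby Hash. `ANALYSIS` mirrors only the tags_index.rb hunk above, and `undefined_char_filters` is a hypothetical helper, not part of Mastodon or Chewy.

```ruby
# Sketch only: ANALYSIS mirrors the tags_index.rb hunk as a plain Hash.
ANALYSIS = {
  char_filter: {
    tsconvert: {
      type: 'stconvert',
      keep_both: false,
      delimiter: '#',
      convert_type: 't2s',
    },
  },
  analyzer: {
    content: {
      tokenizer: 'ik_max_word',
      filter: %w(word_delimiter_graph lowercase asciifolding cjk_width),
      char_filter: %w(tsconvert),
    },
  },
}.freeze

# Char filters Elasticsearch accepts by name without a custom definition.
BUILT_IN_CHAR_FILTERS = %w(html_strip).freeze

# Hypothetical helper: returns { analyzer_name => [undefined char filter names] }.
def undefined_char_filters(analysis)
  defined = (analysis[:char_filter] || {}).keys.map(&:to_s)
  (analysis[:analyzer] || {}).each_with_object({}) do |(name, config), missing|
    unknown = Array(config[:char_filter]).map(&:to_s) - defined - BUILT_IN_CHAR_FILTERS
    missing[name] = unknown unless unknown.empty?
  end
end

problems = undefined_char_filters(ANALYSIS)
puts problems.empty? ? 'all char filters are defined' : "undefined: #{problems.inspect}"
```

This only checks that the names line up. It does not verify that the plugins providing `ik_max_word` and `stconvert` are actually installed on the Elasticsearch node.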
Steps to reproduce the problem
Try to follow the setup here: https://docs.joinmastodon.org/admin/elasticsearch/#search-optimization-for-other-languages
The current code does not match the diff.
Expected behaviour
Docs can be followed
Actual behaviour
Diff no longer valid
Detailed description
The diff is no longer current. It's unclear how to fix it, or how to fix existing indexes. Also, a code-level patch is very undesirable for administrators: does the patch need to stay applied all the time, or only when the index is created? I do not want to maintain a fork of the code given all the recent security issues. Can't this be handled through code or config instead of a patch?
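On the existing-indexes question, one thing that can be checked without touching Mastodon's code is which tokenizer each analyzer of a live index currently uses, via Elasticsearch's `GET /<index>/_settings` endpoint. The sketch below makes several assumptions: the cluster is reachable without authentication, the base URL and the index name `statuses` in the usage comment are only examples, and `tokenizers_by_index` and `fetch_index_settings` are hypothetical helper names. It does not settle whether the patch has to stay applied.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: takes the parsed JSON body of GET /<index>/_settings and
# returns { concrete_index_name => { analyzer_name => tokenizer } }.
def tokenizers_by_index(settings_response)
  settings_response.each_with_object({}) do |(index_name, body), result|
    analyzers = body.dig('settings', 'index', 'analysis', 'analyzer') || {}
    result[index_name] = analyzers.transform_values { |config| config['tokenizer'] }
  end
end

# Assumption: Elasticsearch answers at base_url without authentication.
def fetch_index_settings(base_url, index)
  uri = URI.join(base_url, "/#{index}/_settings")
  JSON.parse(Net::HTTP.get(uri))
end

# Usage (needs a live cluster, so it is left commented out):
#   pp tokenizers_by_index(fetch_index_settings('http://localhost:9200', 'statuses'))
```

If the `content` analyzer still reports `standard` after patching, the index was most likely created before the patch and has not been recreated since.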
Mastodon instance
No response
Mastodon version
main-latest