Closed pat closed 3 years ago
Hello, as a matter of upgrading. I applied the real_time_tidy
option to an existing server. It properly generated the rt_attr_uint = sphinx_updated_at
lines in the resulting sphinx.conf
file.
But then trying to reindex the system, it fails with:
$ rake ts:index
Removing :environment prerequisite from db:create
Generating index files for account_core
rake aborted!
ThinkingSphinx::SphinxError: unknown column: 'sphinx_updated_at' - REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (36, 'Provider Name', '2', 'admin 3scaleadmin', ' 3scale Admin', 'admin@provider.example.com 3scaleadmin+provider-admin.3scale.localhost@3scale.net', '570a3e1fbaab0446bab40a36abf8bd33', 'a1c74f5e', 'Provider Name\'s App', '2 3', 2, 'Account', 0, 1681142319, 1, 2, 'approved'), (54, 'Developer', '3', 'john', 'John Doe', 'admin+test@provider.example.com', '1735cf5c6d2bca70d2c719fbc132ce75', 'bb1df627', 'Developer\'s App', '4', 3, 'Account', 0, 1681142319, 2, 2, 'approved')
Caused by:
Innertube::Pool::BadResource: Innertube::Pool::BadResource
Caused by:
ThinkingSphinx::QueryExecutionError: unknown column: 'sphinx_updated_at' - REPLACE INTO account_core (id, `name`, `account_id`, `username`, `user_full_name`, `email`, `user_key`, `app_id`, `app_name`, `user_id`, `sphinx_internal_id`, `sphinx_internal_class`, `sphinx_deleted`, `sphinx_updated_at`, `provider_account_id`, `tenant_id`, `state`) VALUES (36, 'Provider Name', '2', 'admin 3scaleadmin', ' 3scale Admin', 'admin@provider.example.com 3scaleadmin+provider-admin.3scale.localhost@3scale.net', '570a3e1fbaab0446bab40a36abf8bd33', 'a1c74f5e', 'Provider Name\'s App', '2 3', 2, 'Account', 0, 1681142319, 1, 2, 'approved'), (54, 'Developer', '3', 'john', 'John Doe', 'admin+test@provider.example.com', '1735cf5c6d2bca70d2c719fbc132ce75', 'bb1df627', 'Developer\'s App', '4', 3, 'Account', 0, 1681142319, 2, 2, 'approved')
Caused by:
Mysql2::Error: unknown column: 'sphinx_updated_at'
Tasks: TOP => ts:index => ts:rt:index
(See full trace by running task with --trace)
What is the suggested upgrade path for this option? Is it possible to upgrade without clearing the indices first?
This is the index file:
index account_core
{
type = rt
path = /home/avalon/ws/repos/porta/db/sphinx/development/account_core
charset_table = 0..9, A..Z->a..z, a..z, U+23, U+25, U+27, U+2A..U+2C, U+2E, U+3A, U+3B, U+3F, U+5F, U+60, U+7B, U+7D
min_infix_len = 3
html_strip = 1
html_index_attrs = img=alt,title; a=title
html_remove_elements = style, script
rt_field = name
rt_field = account_id
rt_field = username
rt_field = user_full_name
rt_field = email
rt_field = user_key
rt_field = app_id
rt_field = app_name
rt_field = user_id
rt_attr_uint = sphinx_deleted
rt_attr_uint = sphinx_updated_at
rt_attr_bigint = sphinx_internal_id
rt_attr_bigint = provider_account_id
rt_attr_bigint = tenant_id
rt_attr_string = sphinx_internal_class
rt_attr_string = state
}
It seems like the index needs to be wiped :( https://sphinxsearch.com/forum/view.html?id=11061
If you know any other way, please let me know.
This PR adds a new internal attribute -
sphinx_updated_at
- to real-time indices, and sets that with the current timestamp whenever that record is inserted or updated. And then, when a reindex happens, any record that has not changed since the reindexing began is removed - as it's not in the known dataset. Likely, it was deleted but the callbacks failed for some reason.The advantage of this is that running
ts:index
will remove any records that are no longer indexed - thus keeping index sizes to what they should be, avoiding any bloat that may crop up from situations where records haven't been removed (bulk deletions, for example) - and this is done without the need for deleting all records to begin with (the behaviour ofts:rebuild
).This is currently only enabled with
real_time_tidy
is set to true for the appropriate environment inconfig/thinking_sphinx.yml
. I see it defaulting to true in the near future, but that might not be until v6.0.