pat / thinking-sphinx

Sphinx/Manticore plugin for ActiveRecord/Rails
http://freelancing-gods.com/thinking-sphinx
MIT License

Restarting sphinx after deploy with changes to index #1168

Closed: atomical closed this issue 3 years ago

atomical commented 4 years ago

Hi,

When we deploy changes to indexes we run these tasks with Capistrano:

rake ts:index
rake ts:restart

However, the searchd process crashes and we have to remove the binlog and start sphinx again. Is there something obviously wrong with running these two rake tasks?

pat commented 4 years ago

Hrm… is this with real-time indices, or SQL-backed indices? I guess I'd expect it to work with SQL-backed indices… but maybe the binlog gets in the way of that with the changes to index structure. Certainly, I'm less sure about it working with real-time indices. 🤔

If you can let me know which versions of Rails, Sphinx, and Thinking Sphinx you're using, I can look at trying to reproduce the issue… or, of course, you're welcome to create a sample app too! :)

atomical commented 4 years ago

SQL-backed indexes. I'll do some more digging and logging. Thanks.

atomical commented 4 years ago

Going over the logs, this is what we see when we run ts:restart after ts:index:

[Wed Nov 14 09:46:39.317 2018] [11243] watchdog: main process 11244 forked ok
[Wed Nov 14 09:46:39.318 2018] [11244] listening on 127.0.0.1:9306
[Wed Nov 14 09:46:39.583 2018] [11244] binlog: replaying log /var/www/shared/config/binlog/binlog.001
[Wed Nov 14 09:46:39.583 2018] [11244] FATAL: binlog: log open error: failed to open /var/www/shared/config/binlog/binlog.001: No such file or directory
[Wed Nov 14 09:46:39.603 2018] [11243] watchdog: main process 11244 exited cleanly (exit code 1), shutting down

Sphinx 2.3.2-id64-beta (4409612)

pat commented 4 years ago

I'm wondering if the best option here is to manually delete the binlog files? Because it seems they're partly being removed anyway… I'm thinking along the lines of:

bundle exec rake ts:index
bundle exec rake ts:stop
rm -rf /var/www/shared/config/binlog/*
bundle exec rake ts:start

Removing the files will be fast, but it does mean there'll be a slight pause between Sphinx stopping and starting again, due to the Rails app booting up to invoke rake. So maybe it's better to replace the stop/start rake calls with direct invocations of the searchd binary instead (adjust with the true path to your Sphinx config file)…

bundle exec rake ts:index
searchd --stopwait --config /var/www/shared/config/production.sphinx.conf
rm -rf /var/www/shared/config/binlog/*
searchd --pidfile --config /var/www/shared/config/production.sphinx.conf
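
For a deploy hook, the stop / clear / start steps above could be wrapped in a small script so that a bad path can't turn the rm into something destructive. This is only a rough sketch, not something Thinking Sphinx ships: the function names clear_binlog and restart_sphinx and the SEARCHD_BIN variable are made up for illustration, and it assumes the binlog files all match binlog.* (anything else in the directory is left alone, unlike the rm -rf above).

```shell
#!/bin/sh
# Sketch only: clear_binlog, restart_sphinx and SEARCHD_BIN are hypothetical
# names, not part of Sphinx or Thinking Sphinx.

# Delete binlog.* files from the given directory.
# Refuses to run when the argument is empty or is not a directory.
clear_binlog() {
  dir="$1"
  if [ -z "$dir" ] || [ ! -d "$dir" ]; then
    echo "clear_binlog: not a directory: '$dir'" >&2
    return 1
  fi
  # Assumption: binlog files are named binlog.*; other files are kept.
  rm -f -- "$dir"/binlog.*
}

# Stop searchd, clear the binlog, then start searchd again.
# $1 = path to the Sphinx config file, $2 = binlog directory.
restart_sphinx() {
  conf="$1"
  binlog_dir="$2"
  searchd_bin="${SEARCHD_BIN:-searchd}"

  # --stopwait fails when searchd is not running (for example after a
  # crash), so warn and carry on rather than aborting the deploy.
  "$searchd_bin" --stopwait --config "$conf" ||
    echo "restart_sphinx: stop failed, searchd may not have been running" >&2

  clear_binlog "$binlog_dir" || return 1
  "$searchd_bin" --pidfile --config "$conf"
}
```

After bundle exec rake ts:index, this would be called as restart_sphinx /var/www/shared/config/production.sphinx.conf /var/www/shared/config/binlog (again, adjust both paths to match your setup).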

pat commented 3 years ago

Closing this issue as it's been dormant for a few months - but certainly, if the problem is still occurring, please do comment and we can re-open and continue to investigate.