pat / thinking-sphinx

Sphinx/Manticore plugin for ActiveRecord/Rails
http://freelancing-gods.com/thinking-sphinx
MIT License
1.63k stars 468 forks source link

5.2.1 started failing with "Record IDs found by Sphinx but not by ActiveRecord : 18, 19, 20, 21, 22" #1202

Closed thebravoman closed 2 years ago

thebravoman commented 3 years ago

On August 11 we've migrated from 5.2.0 to 5.2.1

diff --git a/Gemfile.lock b/Gemfile.lock
index 51d8b489f..a817fd589 100644
--- a/Gemfile.lock
+++ b/Gemfile.lock
@@ -1143,7 +1143,7 @@ GEM
       unicode-display_width (>= 1.1.1, < 3)
     terrapin (0.6.0)
       climate_control (>= 0.0.3, < 1.0)
-    thinking-sphinx (5.2.0)
+    thinking-sphinx (5.2.1)
       activerecord (>= 4.2.0)
       builder (>= 2.1.2)
       innertube (>= 1.0.2)

Specs started failing with "Record IDs found by Sphinx but not by ActiveRecord : 18, 19, 20, 21, 22"

The spec is basically

scenario "...explanation..." do
    record.update(title: "Record title")

    visit "/record/#{record.position}"

    expect(page).to have_text "Record title"
  end

They don't fail locally on my machine but only on a central machine that builds the project.

thebravoman commented 3 years ago

Manage to resolve it with

$ history
 2010  bin/rspec gems/fc_content/spec/system/episodes_spec.rb
 2011  rake sphinx
 2012  bin/rake -tasks
 2013  bin/rake --tasks
 2014  bin/rake ts:rebuild
 2015  RAILS_ENV=test bin/rake ts:rebuild
 2016  less /var/lib/project/workspace/log/test.searchd.log
 2017  cat /var/lib/project/workspace/log/test.searchd.log 
# The cat showed. I thought probably a service is still running
[Tue Aug 17 12:57:47.533 2021] [3946043] watchdog: main process 3946044 forked ok
[Tue Aug 17 12:57:47.541 2021] [3946044] listening on 127.0.0.1:9307
[Tue Aug 17 12:57:47.541 2021] [3946044] bind() failed on 127.0.0.1, retrying...
[Tue Aug 17 12:57:50.545 2021] [3946044] bind() failed on 127.0.0.1, retrying...
[Tue Aug 17 12:57:53.547 2021] [3946044] bind() failed on 127.0.0.1, retrying...
[Tue Aug 17 12:57:56.549 2021] [3946044] bind() failed on 127.0.0.1, retrying...
[Tue Aug 17 12:57:59.551 2021] [3946044] bind() failed on 127.0.0.1, retrying...
[Tue Aug 17 12:58:02.552 2021] [3946044] bind() failed on 127.0.0.1, retrying...
[Tue Aug 17 12:58:05.555 2021] [3946044] bind() failed on 127.0.0.1, retrying...
[Tue Aug 17 12:58:08.558 2021] [3946044] bind() failed on 127.0.0.1, retrying...
[Tue Aug 17 12:58:11.559 2021] [3946044] bind() failed on 127.0.0.1, retrying...
[Tue Aug 17 12:58:14.562 2021] [3946044] bind() failed on 127.0.0.1, retrying...
[Tue Aug 17 12:58:17.565 2021] [3946044] bind() failed on 127.0.0.1, retrying...
[Tue Aug 17 12:58:20.567 2021] [3946044] bind() failed on 127.0.0.1, retrying...
[Tue Aug 17 12:58:23.568 2021] [3946044] FATAL: bind() failed on 127.0.0.1: Address already in use
[Tue Aug 17 12:58:23.571 2021] [3946043] watchdog: main process 3946044 exited cleanly (exit code 1), shutting down
 2018  netstat -apt | grep 9307
 2019  kill -9 2917568
 2020  RAILS_ENV=test bin/rake ts:rebuild
 2021  bin/rspec gems/fc_content/spec/system/episodes_spec.rb

I am not sure how much of an issue this is, but the message "Records IDs found by Sphinx but not by ActiveRecord" had to do with the fact that searchd was still running, probably from a previous run and probably over a different set of records. Probably the message could be further improved.

Rebuilding fixed the problem for now.

pat commented 3 years ago

Thanks for reporting this :) And good to know you found a resolution!

I've just added a section to the docs, and also changed the exception message to link there as well: https://freelancing-gods.com/thinking-sphinx/v5/common_issues.html#record-ids

thebravoman commented 3 years ago

I did find a solution but it is happening every time. I haven't had a successful spec run for a few days. I don't whether my solution is a solution.

Update

/usr/local/rvm/gems/ruby-2.6.5/gems/spring-2.1.1/lib/spring/sid.rb:39:in `getpgid': No such process (Errno::ESRCH)
    from /usr/local/rvm/gems/ruby-2.6.5/gems/spring-2.1.1/lib/spring/sid.rb:39:in `pgid'
    from /usr/local/rvm/gems/ruby-2.6.5/gems/spring-2.1.1/lib/spring/server.rb:84:in `set_pgid'
    from /usr/local/rvm/gems/ruby-2.6.5/gems/spring-2.1.1/lib/spring/server.rb:39:in `boot'
    from /usr/local/rvm/gems/ruby-2.6.5/gems/spring-2.1.1/lib/spring/server.rb:14:in `boot'
    from /usr/local/rvm/gems/ruby-2.6.5/gems/spring-2.1.1/lib/spring/client/server.rb:10:in `call'
    from /usr/local/rvm/gems/ruby-2.6.5/gems/spring-2.1.1/lib/spring/client/command.rb:7:in `call'
    from /usr/local/rvm/gems/ruby-2.6.5/gems/spring-2.1.1/lib/spring/client.rb:30:in `run'
    from /usr/local/rvm/gems/ruby-2.6.5/gems/spring-2.1.1/bin/spring:49:in `<main>'
Build step 'Execute shell' marked build as failure
[lockable-resources] released lock on [postgresql_fllcasts_test]
thebravoman commented 3 years ago

The spec that is causing this is

  before do
    login_as user

    FactoryBot.create(:content_picture_ref, content: program_with_pic, content_picture: content_picture,
                                                   content_picture_type: :thumbnail)

    SphinxHelpers.start
    SphinxHelpers.index
  end

After it is ran a couple of times, the above error starts to occur

pat commented 3 years ago

🤔 Given this seems to have started with 5.2.1 - have you tried 5.3.0 to see if that resolves the issue?

And if it doesn't: which versions of Rails and Ruby are you using, and is it with real-time indices or SQL-backed indices? Also: I can see you're starting and indexing Sphinx in your test, but are you stopping it anywhere?

thebravoman commented 3 years ago

I migrated to 5.3.0. Waiting for a couple of days to see how it works. For now it fails

Record IDs found by Sphinx but not by ActiveRecord : 3480, 3481, 3482, 3483, 3484

Rails version - 6.1.4 Ruby version - 2.6.5

I am using ThinkingSphinx::Test.start_with_autostop for stopping. Helpers are

module SphinxHelpers
  def self.start
    # Ensure sphinx directories exist for the test environment
    ThinkingSphinx::Test.init

    # Configure and start Sphinx, and automatically
    # stop Sphinx at the end of the test suite.
    ThinkingSphinx::Test.start_with_autostop
  end

  def self.index
    ThinkingSphinx::Test.index

    # Wait for Sphinx to finish loading in the new index files.
    sleep 0.25 until SphinxHelpers::index_finished?
  end

  def self.index_finished?
    Dir[Rails.root.join(ThinkingSphinx::Test.config.indices_location, '*.{new,tmp}*')].empty?
  end
end

I am not sure if I am using real-time indices or SQL-backed indices. How could I check?

thebravoman commented 3 years ago

I migrated to 5.3.0.

For the last 10 builds non of them failed with this error.

pat commented 2 years ago

@thebravoman hoping this error hasn't shown up again since the upgrade of TS versions? :)

thebravoman commented 2 years ago

Thanks for checking out. No, it has not.

But some specs using sphinx are still failing on a random bases after .index but I haven't got the time to debug them.

On Tue, Oct 5, 2021, 03:05 Pat Allan @.***> wrote:

@thebravoman https://github.com/thebravoman hoping this error hasn't shown up again since the upgrade of TS versions? :)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/pat/thinking-sphinx/issues/1202#issuecomment-933948639, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAKAGBM47LT6CZ4VUY6V6RTUFI6N7ANCNFSM5CJP6M6A . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

pat commented 2 years ago

Ah, sorry to hear there's still some problems! Sounds like that might be something different, so I'll close this issue - but when you do have time to debug, feel free to open a new one, and we'll see if we can figure out the fix!