pat / thinking-sphinx

Sphinx/Manticore plugin for ActiveRecord/Rails
http://freelancing-gods.com/thinking-sphinx
MIT License

How to remove records on update with real-time indexes #1253

Open JasonBarnabe opened 10 months ago

JasonBarnabe commented 10 months ago

With a real-time index, I can provide a scope to filter out records I don't want indexed:

This allows eager loading of associations, or even filtering out specific values. However, keep in mind the default callbacks don’t use this scope, so a record that does not get included in this scope but is then altered will be added to your Sphinx data.

and

For real-time indices you can define a custom scope to preload associations or apply custom conditions:

scope { Article.includes(:comments) }

This scope only comes into play when populating all records at once, not when single records are created or updated.

So I need separate logic on update to filter out things I don't want indexed. Closest I can find is this info in callbacks...

If you wish to have your callbacks update Sphinx only in certain conditions, you can either define your own callback and then invoke TS if/when needed:

after_save :populate_to_sphinx

# ...

def populate_to_sphinx
  return unless indexing?

  ThinkingSphinx::RealTime::Callbacks::RealTimeCallbacks.new(
    :article
  ).after_save self
end

Or supply a block to the callback instantiation which returns an array of instances to process:

# if your model is app/models/article.rb:
ThinkingSphinx::Callbacks.append(self, :behaviours => [:real_time]) do |instance|
  instance.indexing? ? [instance] : []
end

However, this changes which records are updated in Sphinx, not which records are in Sphinx. Specifically, if indexing? previously returned true for a record but returns false after an update, the logic provided only means that no update is sent to Sphinx. The record remains in Sphinx; it is not removed.
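
To make the gap concrete, here is a minimal plain-Ruby sketch of the behaviour described above. It does not use Thinking Sphinx at all: FakeIndex, the Article struct and filtered_callback are hypothetical stand-ins invented for illustration, and only the indexing? filter mirrors the block form quoted earlier.

```ruby
# Hypothetical in-memory stand-in for a real-time index; not Thinking Sphinx code.
class FakeIndex
  attr_reader :documents

  def initialize
    @documents = {}
  end

  # Insert or replace the document for a record.
  def upsert(record)
    @documents[record.id] = record.title
  end
end

# Hypothetical record with the indexing? predicate from the question.
Article = Struct.new(:id, :title, :published) do
  def indexing?
    published
  end
end

# Mirrors the block form: only records that pass indexing? are sent on.
def filtered_callback(index, record)
  targets = record.indexing? ? [record] : []
  targets.each { |target| index.upsert(target) }
end

index = FakeIndex.new
article = Article.new(1, "Hello", true)
filtered_callback(index, article)   # indexed while indexing? is true

article.published = false
filtered_callback(index, article)   # nothing is sent, so nothing is removed

stale = index.documents.key?(1)     # the old document is still present
puts "still indexed after unpublish: #{stale}"
```

The filter only decides whether an update is sent; nothing in this path ever issues a delete, which is why the old document survives.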

How do I remove a record from Sphinx on an update with a callback? Or am I thinking about this wrong?

akostadinov commented 6 months ago

This seems related to #1216. You can also check the discussion under #1215.

Please check #1258 and see if it resolves the issue for you.

pat commented 1 month ago

As per @akostadinov's great work, this should now be covered by ThinkingSphinx::Processor.new(instance: self).sync - it'll update records if they're still within the scope, or otherwise delete them.
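
The update-or-delete behaviour described here can be sketched in plain Ruby. This is only an illustration of the semantics, not the library's implementation: FakeIndex, FakeProcessor and the Article struct are hypothetical names, and the block passed to FakeIndex stands in for an index scope.

```ruby
# Hypothetical in-memory stand-in for a real-time index; not Thinking Sphinx code.
class FakeIndex
  attr_reader :documents

  # The block decides whether a record belongs in the index
  # (an assumption standing in for the index's scope).
  def initialize(&scope)
    @scope = scope
    @documents = {}
  end

  def in_scope?(record)
    @scope.call(record)
  end

  def upsert(record)
    @documents[record.id] = record.title
  end

  def delete(record)
    @documents.delete(record.id)
  end
end

# Hypothetical stand-in for the sync step described above.
class FakeProcessor
  def initialize(index:, instance:)
    @index = index
    @instance = instance
  end

  # Update the record if it is still within the scope, otherwise delete it.
  def sync
    if @index.in_scope?(@instance)
      @index.upsert(@instance)
    else
      @index.delete(@instance)
    end
  end
end

Article = Struct.new(:id, :title, :published)

index = FakeIndex.new { |record| record.published }
article = Article.new(1, "Hello", true)
FakeProcessor.new(index: index, instance: article).sync
puts index.documents.inspect   # document 1 is present

article.published = false
FakeProcessor.new(index: index, instance: article).sync
puts index.documents.inspect   # document 1 has been removed
```

In a real model, the ThinkingSphinx::Processor.new(instance: self).sync call quoted above would presumably be invoked from a custom after_save callback; that wiring is an assumption here, so check the v5.6.0 release notes for the recommended setup.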

pat commented 1 month ago

… and this is part of v5.6.0 which has just been released!