pat / thinking-sphinx

Sphinx/Manticore plugin for ActiveRecord/Rails
http://freelancing-gods.com/thinking-sphinx
MIT License
1.63k stars 468 forks source link

Is the delete delta job supposed to fire on an update? #1184

Closed JasonBarnabe closed 3 years ago

JasonBarnabe commented 3 years ago

When I do an update in production, I see 3 ThinkingSphinx::Deltas::SidekiqDelta::FlagAsDeletedJobs happening. (There are no database deletes involved in the update action.) Is this normal?

Nov 24 02:51:47 workers sidekiq[17063]: 2020-11-24T02:51:47.276Z pid=17063 tid=gmy56l91n class=ThinkingSphinx::Deltas::SidekiqDelta::DeltaJob jid=4ea4d87fa3c6041b1d25f78a INFO: start
Nov 24 02:51:47 workers sidekiq[17063]: 2020-11-24T02:51:47.276Z pid=17063 tid=gmy56l8sj class=ThinkingSphinx::Deltas::SidekiqDelta::FlagAsDeletedJob jid=aafac368b13da2612fc52e3f INFO: start
Nov 24 02:51:47 workers sidekiq[17063]: 2020-11-24T02:51:47.278Z pid=17063 tid=gmy56l91n class=ThinkingSphinx::Deltas::SidekiqDelta::DeltaJob jid=4ea4d87fa3c6041b1d25f78a elapsed=0.002 INFO: done
Nov 24 02:51:47 workers sidekiq[17063]: 2020-11-24T02:51:47.281Z pid=17063 tid=gmy56l8sj class=ThinkingSphinx::Deltas::SidekiqDelta::FlagAsDeletedJob jid=aafac368b13da2612fc52e3f elapsed=0.005 INFO: done
Nov 24 02:51:47 workers sidekiq[17063]: 2020-11-24T02:51:47.296Z pid=17063 tid=gmy56l91n class=ThinkingSphinx::Deltas::SidekiqDelta::DeltaJob jid=0e3f7e36c9052967c6fa9071 INFO: start
Nov 24 02:51:47 workers sidekiq[17063]: 2020-11-24T02:51:47.298Z pid=17063 tid=gmy56l91n class=ThinkingSphinx::Deltas::SidekiqDelta::DeltaJob jid=0e3f7e36c9052967c6fa9071 elapsed=0.002 INFO: done
Nov 24 02:51:47 workers sidekiq[17063]: 2020-11-24T02:51:47.300Z pid=17063 tid=gmy56l8sj class=ThinkingSphinx::Deltas::SidekiqDelta::FlagAsDeletedJob jid=85cd445fa75bb937a7f1f09c INFO: start
Nov 24 02:51:47 workers sidekiq[17063]: 2020-11-24T02:51:47.303Z pid=17063 tid=gmy56l8sj class=ThinkingSphinx::Deltas::SidekiqDelta::FlagAsDeletedJob jid=85cd445fa75bb937a7f1f09c elapsed=0.003 INFO: done
Nov 24 02:51:47 workers sidekiq[17063]: 2020-11-24T02:51:47.321Z pid=17063 tid=gmy56l91n class=ThinkingSphinx::Deltas::SidekiqDelta::DeltaJob jid=c7656bd3682b6b237d95cb0e INFO: start
Nov 24 02:51:47 workers sidekiq[17063]: 2020-11-24T02:51:47.323Z pid=17063 tid=gmy56l91n class=ThinkingSphinx::Deltas::SidekiqDelta::DeltaJob jid=c7656bd3682b6b237d95cb0e elapsed=0.002 INFO: done
Nov 24 02:51:47 workers sidekiq[17063]: 2020-11-24T02:51:47.324Z pid=17063 tid=gmy56l8sj class=ThinkingSphinx::Deltas::SidekiqDelta::FlagAsDeletedJob jid=8247d27ef7b3814bfdf74db9 INFO: start
Nov 24 02:51:47 workers sidekiq[17063]: 2020-11-24T02:51:47.327Z pid=17063 tid=gmy56l8sj class=ThinkingSphinx::Deltas::SidekiqDelta::FlagAsDeletedJob jid=8247d27ef7b3814bfdf74db9 elapsed=0.003 INFO: done

I've created a test that demonstrates it - https://github.com/JasonBarnabe/greasyfork/pull/821/files

pat commented 3 years ago

Hi Jason - I'm not quite sure why there's three - this could possibly be because of the association callbacks? - but there should certainly be at least one, marking the updated record as 'deleted' in the core index only, to ensure only data from the delta index is considered for search requests from that point onwards (until the next full re-index, which resets the deleted flags).

JasonBarnabe commented 3 years ago

I'm not quite sure why there's three - this could possibly be because of the association callbacks?

From the test, looks like it's indeed coming from the associated objects. Is there a possibility of something going wrong (other than just wasting time) if things are fired multiple times? Some sort of race condition?

there should certainly be at least one

Good to know - I was having a problem in production and I thought this might be the cause. But I later found a delta.tmp file hanging around, and once I deleted that, things started working again.

pat commented 3 years ago

Is there a possibility of something going wrong (other than just wasting time) if things are fired multiple times? Some sort of race condition?

I wouldn't think so. They're idempotent actions - they just mark a single record as deleted (from the perspective of Thinking Sphinx - the data remains in Sphinx, it's just filtered out in search requests). If it's already flagged as deleted, it'll stay that way. The only way that value changes back is when the index is reprocessed via ts:index.

JasonBarnabe commented 3 years ago

Thanks!