ankane / searchkick

Intelligent search made easy

Attempting a full reindex can intermittently return an 'index_not_found_exception' #1680

CarterBland closed 1 month ago

CarterBland commented 1 month ago

Describe the bug

A full reindex of a model can intermittently raise index_not_found_exception errors from OpenSearch. The strange part is that I can fetch the index and its status manually via the OpenSearch REST API and see that it is configured correctly.

For example,

def reindex_model(klass, retries: 0)
  puts "reindexing class: #{klass}"
  # Pass resume: true on every retry (retries > 0)
  klass.reindex(resume: retries > 0)
rescue => e
  raise "Retries exceeded" if retries > 5
  puts e
  sleep 5
  # Ask OpenSearch directly for the index status and its configuration
  puts OpensearchClient.get("/_cat/indices/#{klass.search_index.name}")
  puts OpensearchClient.get("/#{klass.search_index.name}")
  klass.search_index.refresh
  sleep 5
  reindex_model(klass, retries: retries + 1)
end

reindex_model(Feature)

This produced very odd logs: it looped multiple times with the exception

{"type"=>"index_not_found_exception", "reason"=>"no such index [features_staging_20240524171908943]", "index"=>"features_staging_20240524171908943", "index_uuid"=>"jNa6wzZ3QbmmCrs0q31yiA"} on item with id '685140'

even though OpenSearch reported the status of the index as

green open features_staging_20240524171908943 jNa6wzZ3QbmmCrs0q31yiA 1 1 0 0 416b 208b

It eventually goes through and the status becomes

green open features_staging_20240524171908943 jNa6wzZ3QbmmCrs0q31yiA 1 1 336 0 4.5mb 2.2mb

The only apparent difference between the two is that the document count and sizes have increased.

[Screenshot: Screen Shot 2024-05-24 at 12 39 36 PM]
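To make that comparison less error-prone than eyeballing two lines, here is a minimal sketch that parses both status lines and prints only the columns that changed. It assumes the default `_cat/indices` column order (no custom `h=` parameter). The helper names `parse_cat_line` and `diff_cat_lines` are hypothetical and not part of Searchkick.

```ruby
# Column names of the default `_cat/indices` output, in order.
# Assumption: the request did not customize columns with `h=`.
CAT_COLUMNS = %i[
  health status index uuid pri rep
  docs_count docs_deleted store_size pri_store_size
].freeze

# Parse one whitespace-separated `_cat/indices` line into a hash.
def parse_cat_line(line)
  CAT_COLUMNS.zip(line.split).to_h
end

# Return only the columns whose values differ, as { column => [before, after] }.
def diff_cat_lines(before, after)
  a = parse_cat_line(before)
  b = parse_cat_line(after)
  CAT_COLUMNS.each_with_object({}) do |col, changes|
    changes[col] = [a[col], b[col]] unless a[col] == b[col]
  end
end

before = "green open features_staging_20240524171908943 jNa6wzZ3QbmmCrs0q31yiA 1 1 0 0 416b 208b"
after  = "green open features_staging_20240524171908943 jNa6wzZ3QbmmCrs0q31yiA 1 1 336 0 4.5mb 2.2mb"

diff_cat_lines(before, after).each do |col, (was, now)|
  puts "#{col}: #{was} -> #{now}"
end
```

For the two lines above, only `docs_count`, `store_size`, and `pri_store_size` differ. The health, status, UUID, and shard counts are identical.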

Additional context

CarterBland commented 1 month ago

As I continue to debug this, I have a feeling this is actually an OpenSearch issue. I confirmed the index exists by adding a request, just before the Indexer class sends its first bulk request, that asserts the index exists. The check comes back good, and the bulk request still fails.
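An existence check of this kind could look roughly like the sketch below. It is not the code used in this issue: `index_exists?` is a hypothetical helper, and it relies on the index-exists API (a `HEAD /<index>` request that returns 200 when the index exists and 404 when it does not). The `http` argument is any object that responds to `head(path)` the way a started `Net::HTTP` session does.

```ruby
require "net/http"

# Hypothetical helper, not part of Searchkick.
# Returns true when HEAD /<index> answers 200, false when it answers 404,
# and raises on any other status so unexpected responses are not hidden.
def index_exists?(http, index_name)
  response = http.head("/#{index_name}")
  case response.code.to_i
  when 200 then true
  when 404 then false
  else raise "unexpected status #{response.code} for index #{index_name}"
  end
end

# Example wiring. Host and port are assumptions for a local cluster:
#
#   Net::HTTP.start("localhost", 9200) do |http|
#     puts index_exists?(http, "features_staging_20240524171908943")
#   end
```

As described above, a passing check like this did not prevent the bulk request from failing, so it is useful for ruling out a missing index rather than as a fix.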

[Screenshot: Screen Shot 2024-05-24 at 2 46 22 PM]

CarterBland commented 3 weeks ago

For posterity: this was a specific issue with using neural search on the version of OpenSearch we were running (the latest at the time). They have cut a new release that solves this issue. See this thread for context.