EOL / tramea

A lightweight server for denormalized EOL data

[HOLD] Missing Media #344

Open JRice opened 7 years ago

JRice commented 7 years ago

There are examples of data objects which are not indexed in Solr.

I suggest we write a script something like:

solr = SolrCore::DataObjects.new
missing_ids = []
# Walk the published data objects in batches and check each batch against Solr.
DataObject.select(:id).published.find_in_batches do |batch|
  ids = batch.map(&:id)
  response = solr.query("data_object_id:#{ids.join(" OR data_object_id:")}")
  found_ids = response["response"]["docs"].map { |d| d["data_object_id"] }
  missing_ids += ids - found_ids
end
# Reindex everything Solr did not return.
DataObject::Indexer.by_data_object_ids(missing_ids)

(with logging and error checking), and run that to ensure that all data objects are in Solr.
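
A rough sketch of what the logging and error checking might look like, written as plain Ruby so it runs without the app loaded. Everything here is hypothetical: `find_missing_ids` is an illustrative name, `lookup` stands in for the Solr query, and the `to_i` conversion assumes Solr might hand the ids back as strings. A failed batch is logged and set aside rather than aborting the whole run.

```ruby
require "logger"

# Hypothetical helper: returns the ids that `lookup` does not report as found.
# `lookup` is any callable that takes an array of ids and returns the ids it
# found; in the real script it would wrap the Solr query.
def find_missing_ids(all_ids, lookup, batch_size: 1000, logger: Logger.new($stderr))
  missing = []
  failed = []
  all_ids.each_slice(batch_size) do |ids|
    begin
      # Assumption: ids may come back as strings, so normalize before comparing.
      found = lookup.call(ids).map(&:to_i)
      batch_missing = ids - found
      logger.info("checked #{ids.size} ids, #{batch_missing.size} missing")
      missing.concat(batch_missing)
    rescue StandardError => e
      # Record the batch and keep going, so one bad query does not stop the run.
      logger.error("lookup failed for ids #{ids.first}..#{ids.last}: #{e.message}")
      failed.concat(ids)
    end
  end
  { missing: missing, failed: failed }
end

# Stand-in for Solr: pretends only even ids are indexed and fails on one batch.
fake_index = lambda do |ids|
  raise "timeout" if ids.include?(7)
  ids.select(&:even?).map(&:to_s)
end

result = find_missing_ids((1..10).to_a, fake_index, batch_size: 3, logger: Logger.new(nil))
puts "missing: #{result[:missing].inspect}, failed: #{result[:failed].inspect}"
```

Keeping the failed ids separate from the missing ones means a Solr outage would not cause every data object in that batch to be reindexed by mistake; the failed batches can simply be retried later.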

...But we can't do this until we've fixed the problem with TaxonConceptsFlattened... which I'm working on in a separate ticket. :S