pulibrary / dpul-collections

An inspiring environment for global communities to engage with diverse digital collections

Track down missing EphemeraFolders items in dpul-c #268

Open eliotjordan opened 1 week ago

eliotjordan commented 1 week ago

The number of EphemeraFolders in Solr does not match the number of open and complete EphemeraFolders in Figgy.

Items in Figgy (from UI): 56,164
Items in Figgy (from query): 56,174
HydrationCache: 56,105
TransformationCache: 56,105
Items in Solr: 55,651
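To see where records drop out, the counts above can be subtracted stage by stage. This is a minimal Python sketch using only the numbers posted in this comment; the stage labels and the `stage_gaps` helper are illustrative names, not anything from dpul-c or Figgy.

```python
# Counts copied from the comment above; the keys are just labels.
counts = {
    "figgy_query": 56_174,
    "hydration_cache": 56_105,
    "transformation_cache": 56_105,
    "solr": 55_651,
}


def stage_gaps(counts):
    """Return how many records drop out between each consecutive stage."""
    stages = list(counts.items())
    return {
        f"{a} -> {b}": a_count - b_count
        for (a, a_count), (b, b_count) in zip(stages, stages[1:])
    }


gaps = stage_gaps(counts)
for step, lost in gaps.items():
    print(f"{step}: {lost}")
```

With these counts, 69 records are lost between the Figgy query and the hydration cache, none between the hydration and transformation caches, and 454 between the transformation cache and Solr.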

tpendragon commented 1 week ago

@eliotjordan Does the number of hydration cache entries match?

eliotjordan commented 1 week ago

@tpendragon I'm working on that number now. The number of transformation cache entries does not match, though. I'll post updated stats in a sec.

eliotjordan commented 6 days ago

IDs in the transformation cache that are missing from the Solr index: missing_solr_ids.csv
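A list like this can be produced with a plain set difference between two ID exports. The sketch below is a hedged illustration in Python, not the method actually used here; `missing_ids` and the sample IDs are hypothetical.

```python
def missing_ids(source_ids, target_ids):
    """Return IDs present in source_ids but absent from target_ids, sorted."""
    return sorted(set(source_ids) - set(target_ids))


# Hypothetical IDs for illustration only; real ones would come from the
# transformation cache and from an export of Solr document IDs.
transformation_cache_ids = ["a1", "b2", "c3", "d4"]
solr_ids = ["a1", "c3"]

missing = missing_ids(transformation_cache_ids, solr_ids)
print(missing)
```

The same comparison applies to any two adjacent stages, for example Figgy IDs against hydration cache IDs.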

Logging of Solr responses in dpul-c shows that items with blank titles cause errors in the Solr slug generator script. The theory is that an entire batch of Solr records fails to index if it contains a record without a title.
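One way to test this theory would be to separate records with blank titles from the rest before a batch is sent, then check whether the remaining records index cleanly. The following is a rough Python sketch under stated assumptions: the `title` field name, the record shape (plain dicts), and the `split_by_title` helper are all hypothetical and do not reflect the actual dpul-c indexing code or Solr schema.

```python
def split_by_title(records, title_field="title"):
    """Separate records with a usable title from those without one."""
    indexable, blank = [], []
    for record in records:
        title = record.get(title_field)
        # A title may be a string or a list of strings; treat None, empty,
        # and whitespace-only values as blank.
        values = title if isinstance(title, list) else [title]
        has_title = any(isinstance(v, str) and v.strip() for v in values)
        (indexable if has_title else blank).append(record)
    return indexable, blank


# Hypothetical batch for illustration only.
batch = [
    {"id": "1", "title": ["Poster"]},
    {"id": "2", "title": None},
    {"id": "3", "title": ["  "]},
    {"id": "4"},
    {"id": "5", "title": "Flyer"},
]

indexable, blank = split_by_title(batch)
print([r["id"] for r in indexable])
print([r["id"] for r in blank])
```

If the records in `blank` line up with the IDs missing from Solr, or with their batch neighbors, that would support the theory.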

From Grafana:

2024-11-22 13:36:14.420 | 2024-11-22T13:36:14.137187548-06:00 stdout F     "msg":"Unable to invoke function processAdd in script: slug.js: TypeError: null has no such function \"replace\" in <eval> at line number 9",
2024-11-22 13:36:14.420 | 2024-11-22T13:36:14.137187548-06:00 stdout F       "root-error-class","org.openjdk.nashorn.internal.runtime.ECMAException"],
2024-11-22 13:36:14.420 | 2024-11-22T13:36:14.137187548-06:00 stdout F       "error-class","org.apache.solr.common.SolrException",
2024-11-22 13:36:14.420 | 2024-11-22T13:36:14.137187548-06:00 stdout F     "metadata":[
2024-11-22 13:36:14.420 | 2024-11-22T13:36:14.137187548-06:00 stdout F   "error":{
2024-11-22 13:36:14.420 | 2024-11-22T13:36:14.137187548-06:00 stdout F     "QTime":52},
2024-11-22 13:36:14.420 | 2024-11-22T13:36:14.137187548-06:00 stdout F     "status":500,

eliotjordan commented 6 days ago

Grafana logs for Solr records that raised an error:

Explore-logs-A-data-2024-11-22 17_12_58.csv

eliotjordan commented 2 days ago

EphemeraFolder IDs in Figgy that are missing from the Hydration Cache:

missing-folders.csv