sul-dlss / searchworks_traject_indexer

indexing MARC, MODS, and more for SearchWorks

Guard against empty cocina timestamps #1512

Closed by thatbudakguy 3 weeks ago

thatbudakguy commented 1 month ago

Some of the logic in the geo indexing config uses a series of fallbacks to generate a "modified date" for the metadata itself: https://github.com/sul-dlss/searchworks_traject_indexer/blob/bbb0e6c0d98d829824946925335e3f2a6ccb797c/lib/traject/config/geo_aardvark_config.rb#L260-L265

Releasing the object cb601kb6593 from argo-stage will throw the following error in the event log:

{
  "host": "sw-indexing-stage-a.stanford.edu",
  "target": "Earthworks",
  "context": {
    "record": "<record #1 (/dev/null #1), output_id:stanford-cb601kb6593>",
    "index_step": "(to_field \"gbl_mdModified_dt\" at ./lib/traject/config/geo_aardvark_config.rb:264)"
  },
  "message": "no implicit conversion of nil into String",
  "invoked_by": "indexer"
}
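
For illustration, a minimal sketch of the kind of guard the issue title asks for. It assumes the failure comes from handing a `nil` timestamp to a string-parsing call such as `Time.parse`, which raises a `TypeError` with this message; the linked config is not reproduced here, so the helper name `first_valid_timestamp` and its arguments are hypothetical rather than the project's actual code.

```ruby
require 'time'

# Hypothetical helper (not from the repository): walk the fallback candidates
# in order and return the first one that parses, as a UTC ISO 8601 string.
# Returns nil when every candidate is nil, blank, or unparseable.
def first_valid_timestamp(*candidates)
  candidates.each do |value|
    # Skip nil and blank values instead of passing them to Time.parse,
    # which would raise "no implicit conversion of nil into String" for nil.
    next if value.nil? || value.to_s.strip.empty?

    begin
      return Time.parse(value.to_s).utc.iso8601
    rescue ArgumentError
      next # not a parseable timestamp; try the next fallback
    end
  end
  nil
end

# Example: the first two candidates are unusable, so the third is used.
puts first_valid_timestamp(nil, '', '2023-05-01T12:00:00Z')
```

In an indexing step, the result would only be added to the output when it is non-nil, so a record with no usable timestamp would simply omit the field rather than abort indexing.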

thatbudakguy commented 4 weeks ago

I've republished many staging objects that trigger this, but some don't appear to be generating new cocina and will still throw errors, e.g. km361zq1607.

jcoyne commented 3 weeks ago

@thatbudakguy In the event log for that item, we see "Publish request received" but never "Publish complete", so it was never republished with good data.

thatbudakguy commented 3 weeks ago

I saw that too. Any idea why?

jcoyne commented 3 weeks ago

@thatbudakguy I've asked @lwrubel to take a look.

lwrubel commented 3 weeks ago

I think this can be closed, based on what we learned about the version needing to be closed for these particular objects?