Closed maxkadel closed 2 months ago
It seems that the two files @mzelesky flagged never actually made it to the bibdata file share used by the production bibdata environment. There is also an incremental delete file, incremental_36742489680006421_20240430_180403[049]_delete.tar.gz, matching the datestamp of the file from 4/30, that is also missing from the bibdata share at /data/bibdata_files.
Investigating this a bit further, it seems that from 4/30 through at least 5/6 we missed a couple of files every day. Looking back at the previous week (4/22-4/29), all the files appear to have made it to the bibdata share.
I'm comparing the two servers by running the same date match against the filenames stored in each environment and counting the results:
ls -lrth *20240430* | wc -l
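The count above shows how many files differ, but not which ones. A minimal sketch of a name-level comparison follows, assuming each server's listing has been captured as plain text with one filename per line (for example from `ls -1`). The function name `missing_on_share` and the `example_*` filenames are hypothetical; only the `_delete.tar.gz` filename comes from this thread.

```python
def missing_on_share(sftp_listing: str, share_listing: str, datestamp: str) -> list[str]:
    """Return filenames containing `datestamp` that appear in the SFTP
    listing but not in the bibdata share listing."""

    def names(listing: str) -> set[str]:
        # Keep only non-empty lines whose filename contains the datestamp.
        return {
            line.strip()
            for line in listing.splitlines()
            if line.strip() and datestamp in line
        }

    return sorted(names(sftp_listing) - names(share_listing))


# Illustrative listings: the example_* names are made up for this sketch.
sftp_listing = """\
incremental_36742489680006421_20240430_180403[049]_delete.tar.gz
example_a_20240430.tar.gz
example_b_20240501.tar.gz
"""
share_listing = """\
example_a_20240430.tar.gz
example_b_20240501.tar.gz
"""

for name in missing_on_share(sftp_listing, share_listing, "20240430"):
    print(name)
```

Working from saved listings keeps the check read-only on both servers and makes the result easy to paste into the issue.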
Just noting all files were successfully moved on 5/7.
I looked in the AWS logs for the file incremental_36742489680006421_20240430_180403[049]_delete.tar.gz from April 30th that didn't make it to bibdata prod. So far I don't see anything that would prevent the SQS poller service from creating an event and a dump, transferring the file to bibdata in the background, and attaching it to that event/dump.
{
"e": 1714502953,
"m": "alma.webhook.action",
"t": [
"dd_lambda_layer:datadog-ruby27",
"environment:production",
"action:JOB_END",
"body:{\"id\":\"36742489680006421\",\"action\":\"JOB_END\",\"institution\":{\"value\":\"01PRI_INST\",\"desc\":\"Princeton University Library\"},\"time\":\"2024-04-30T18:49:07.503Z\",\"job_instance\":{\"id\":\"36742489680006421\",\"name\":\"Publishing Platform Job Incremental Publishing\",\"progress\":111.7,\"status\":{\"value\":\"COMPLETED_SUCCESS\",\"desc\":\"Completed Successfully\"},\"external_id\":\"36742491740006421\",\"submitted_by\":{\"value\":\"System\"},\"submit_time\":\"2024-04-30T18:00:10.892Z\",\"start_time\":\"2024-04-30T18:00:22.050Z\",\"end_time\":\"2024-04-30T18:49:07.503Z\",\"status_date\":\"2024-04-30Z\",\"alert\":[{\"value\":\"alert_general_success\",\"desc\":\"The job completed successfully. For more information view the report details.\"}],\"counter\":[{\"type\":{\"value\":\"label.new.records\",\"desc\":\"New Records\"},\"value\":\"54\"},{\"type\":{\"value\":\"label.updated.records\",\"desc\":\"Updated Records\"},\"value\":\"529\"},{\"type\":{\"value\":\"label.deleted.records\",\"desc\":\"Deleted Records\"},\"value\":\"12\"},{\"type\":{\"value\":\"c.jobs.publishing.failed.publishing\",\"desc\":\"Unpublished failed records\"},\"value\":\"0\"},{\"type\":{\"value\":\"c.jobs.publishing.skipped\",\"desc\":\"Skipped records (update date changed but no data change)\"},\"value\":\"224\"},{\"type\":{\"value\":\"c.jobs.publishing.filtered_out\",\"desc\":\"Filtered records (not published due to filter)\"},\"value\":\"0\"},{\"type\":{\"value\":\"c.jobs.publishing.totalRecordsWrittenToFile\",\"desc\":\"Total records written to file\"},\"value\":\"0\"}],\"job_info\":{\"id\":\"S32986800410006421\",\"name\":\"Publishing Platform Job Incremental Publishing\",\"description\":\"Publishing Platform Job\",\"type\":{\"value\":\"SCHEDULED\",\"desc\":\"Scheduled\"},\"category\":{\"value\":\"PUBLISHING\",\"desc\":\"Publishing\"},\"link\":\"/almaws/v1/conf/jobs/S32986800410006421\"},\"link\":\"/almaws/v1/conf/jobs/S32986800410006421/instances/36742489680006421\"}}"
],
"v": 1
}
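The webhook payload in the log entry above is embedded as an escaped JSON string inside a `body:` tag, which makes it hard to read. A minimal sketch of pulling it apart follows. It assumes the entry has been loaded as a Python dict and that `e` is a Unix timestamp in seconds (it lines up with the job's `end_time`). The function name `parse_webhook_log` is hypothetical, and the sample is a trimmed copy of the values in the log, not the full payload.

```python
import json
from datetime import datetime, timezone


def parse_webhook_log(entry: dict) -> dict:
    """Extract the job summary from a log entry whose tag list carries the
    webhook payload as a 'body:<json>' string."""
    body_tag = next(t for t in entry["t"] if t.startswith("body:"))
    body = json.loads(body_tag[len("body:"):])
    job = body["job_instance"]
    # Counter values arrive as strings; key them by their description.
    counters = {c["type"]["desc"]: int(c["value"]) for c in job.get("counter", [])}
    return {
        # Assumption: "e" is epoch seconds.
        "logged_at": datetime.fromtimestamp(entry["e"], tz=timezone.utc).isoformat(),
        "job_id": job["id"],
        "status": job["status"]["value"],
        "end_time": job["end_time"],
        "counters": counters,
    }


# Trimmed sample built from values in the log entry above.
sample_body = {
    "id": "36742489680006421",
    "action": "JOB_END",
    "job_instance": {
        "id": "36742489680006421",
        "status": {"value": "COMPLETED_SUCCESS", "desc": "Completed Successfully"},
        "end_time": "2024-04-30T18:49:07.503Z",
        "counter": [
            {"type": {"value": "label.new.records", "desc": "New Records"}, "value": "54"},
            {"type": {"value": "label.deleted.records", "desc": "Deleted Records"}, "value": "12"},
        ],
    },
}
sample_entry = {
    "e": 1714502953,
    "m": "alma.webhook.action",
    "t": ["environment:production", "action:JOB_END", "body:" + json.dumps(sample_body)],
    "v": 1,
}

summary = parse_webhook_log(sample_entry)
print(summary["job_id"], summary["status"], summary["logged_at"])
```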
When I check the last six days of file transfers, the daily file counts on lib-sftp-prod1 and on the /data/bibdata_files share in prod bibdata match. I think fully deactivating the old prod app servers has taken care of this issue. We'll have to run a full re-index to get everything caught up.
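The day-by-day count check can be sketched as below. It assumes the datestamp is the 8-digit token between underscores, as in the filename quoted in this thread, and that both listings are available as lists of filenames. The names `counts_by_day` and `days_with_mismatch` are hypothetical, as are the `example_*` filenames.

```python
import re
from collections import Counter

# Assumption: the datestamp is an 8-digit token delimited by underscores.
DATESTAMP = re.compile(r"_(\d{8})_")


def counts_by_day(filenames) -> Counter:
    """Count filenames per YYYYMMDD datestamp; names without one are skipped."""
    counts = Counter()
    for name in filenames:
        match = DATESTAMP.search(name)
        if match:
            counts[match.group(1)] += 1
    return counts


def days_with_mismatch(sftp_names, share_names) -> dict:
    """Map each day whose counts differ to a (sftp_count, share_count) pair."""
    sftp, share = counts_by_day(sftp_names), counts_by_day(share_names)
    return {
        day: (sftp[day], share[day])
        for day in sorted(set(sftp) | set(share))
        if sftp[day] != share[day]
    }


# Illustrative input: only the first filename comes from this thread.
sftp_names = [
    "incremental_36742489680006421_20240430_180403[049]_delete.tar.gz",
    "example_1_20240430_000000_x.tar.gz",
    "example_2_20240507_000000_x.tar.gz",
]
share_names = [
    "example_1_20240430_000000_x.tar.gz",
    "example_2_20240507_000000_x.tar.gz",
]

print(days_with_mismatch(sftp_names, share_names))
```

An empty result means every day's counts agree between the two locations.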
Expected behavior
Bibdata consistently indexes files picked up from Alma.
Actual behavior
Catalogers reported that two records updated on April 30th were not indexed. Upon investigation, the two IDs (9984008483506421, 9984007603506421) were included in Alma publishing files, but were not updated in Solr.
Files that the records were included in:
Impact of this bug
Catalog index is out of sync with Alma.
Implementation notes, if any