pulibrary / bibdata

Local API for retrieving bibliographic and other useful data from Alma (Ruby 3.1.0, Rails 7.0)
BSD 2-Clause "Simplified" License

Skipped Alma Incremental(s) #2368

Closed maxkadel closed 2 months ago

maxkadel commented 2 months ago

Expected behavior

Bibdata consistently indexes files picked up from Alma.

Actual behavior

Catalogers reported that two records updated on April 30th were not indexed. Upon investigation, the two IDs (9984008483506421, 9984007603506421) were included in Alma publishing files but were not updated in Solr.

Files that the records were included in:

incremental_36756664760006421_20240501_090509[019]_new # Mark says that none of these were indexed
incremental_36742489680006421_20240430_180403[049]_new

Impact of this bug

Catalog index is out of sync with Alma.

Implementation notes, if any

kevinreiss commented 2 months ago

It seems that the two files @mzelesky flagged never actually made it to the file share used by the production bibdata environment. There is also an incremental delete file, incremental_36742489680006421_20240430_180403[049]_delete.tar.gz, matching the datestamp of the 4/30 file, that is also missing from the bibdata share at /data/bibdata_files.

kevinreiss commented 2 months ago

Investigating this a bit further, it seems that from 4/30 through at least 5/6 we missed a couple of files every day. Looking back at the previous week, 4/22-4/29, all the files appear to have made it to the bibdata share.

kevinreiss commented 2 months ago

I'm comparing the two servers by running the same search for the date in the filenames stored in each environment and comparing the resulting counts.

ls -lrth *20240430* | wc -l
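
A count shows that something is missing but not which files. A minimal Ruby sketch of the same comparison, assuming the two filename listings have already been collected (for example with `Dir.children` on each host); `missing_files` and the sample filenames are hypothetical and only illustrate the idea:

```ruby
require "set"

# Hypothetical helper: given the filenames on the SFTP server and on the
# bibdata share, return the files for one datestamp that never arrived.
def missing_files(source_names, destination_names, datestamp)
  on_source = source_names.select { |name| name.include?(datestamp) }
  on_destination = destination_names.select { |name| name.include?(datestamp) }.to_set
  on_source.reject { |name| on_destination.include?(name) }.sort
end

# Illustrative listings only, not the real directory contents.
sftp_listing = [
  "incremental_111_20240430_180403[001]_new.tar.gz",
  "incremental_111_20240430_180403[001]_delete.tar.gz",
  "incremental_222_20240501_090509[001]_new.tar.gz"
]
share_listing = ["incremental_111_20240430_180403[001]_new.tar.gz"]

puts missing_files(sftp_listing, share_listing, "20240430")
```

Plain substring matching is used instead of a glob because the filenames contain literal square brackets, which a glob pattern would treat as a character class.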

kevinreiss commented 2 months ago

Just noting all files were successfully moved on 5/7.

christinach commented 2 months ago

I looked in the AWS logs for the file incremental_36742489680006421_20240430_180403[049]_delete.tar.gz from April 30th, which didn't make it to bibdata prod. So far I don't see anything that would prevent the SQS poller service from creating an event and a dump, transferring the file to bibdata in the background, and attaching it to that event/dump.

{
    "e": 1714502953,
    "m": "alma.webhook.action",
    "t": [
        "dd_lambda_layer:datadog-ruby27",
        "environment:production",
        "action:JOB_END",
        "body:{\"id\":\"36742489680006421\",\"action\":\"JOB_END\",\"institution\":{\"value\":\"01PRI_INST\",\"desc\":\"Princeton University Library\"},\"time\":\"2024-04-30T18:49:07.503Z\",\"job_instance\":{\"id\":\"36742489680006421\",\"name\":\"Publishing Platform Job Incremental Publishing\",\"progress\":111.7,\"status\":{\"value\":\"COMPLETED_SUCCESS\",\"desc\":\"Completed Successfully\"},\"external_id\":\"36742491740006421\",\"submitted_by\":{\"value\":\"System\"},\"submit_time\":\"2024-04-30T18:00:10.892Z\",\"start_time\":\"2024-04-30T18:00:22.050Z\",\"end_time\":\"2024-04-30T18:49:07.503Z\",\"status_date\":\"2024-04-30Z\",\"alert\":[{\"value\":\"alert_general_success\",\"desc\":\"The job completed successfully. For more information view the report details.\"}],\"counter\":[{\"type\":{\"value\":\"label.new.records\",\"desc\":\"New Records\"},\"value\":\"54\"},{\"type\":{\"value\":\"label.updated.records\",\"desc\":\"Updated Records\"},\"value\":\"529\"},{\"type\":{\"value\":\"label.deleted.records\",\"desc\":\"Deleted Records\"},\"value\":\"12\"},{\"type\":{\"value\":\"c.jobs.publishing.failed.publishing\",\"desc\":\"Unpublished failed records\"},\"value\":\"0\"},{\"type\":{\"value\":\"c.jobs.publishing.skipped\",\"desc\":\"Skipped records (update date changed but no data change)\"},\"value\":\"224\"},{\"type\":{\"value\":\"c.jobs.publishing.filtered_out\",\"desc\":\"Filtered records (not published due to filter)\"},\"value\":\"0\"},{\"type\":{\"value\":\"c.jobs.publishing.totalRecordsWrittenToFile\",\"desc\":\"Total records written to file\"},\"value\":\"0\"}],\"job_info\":{\"id\":\"S32986800410006421\",\"name\":\"Publishing Platform Job Incremental Publishing\",\"description\":\"Publishing Platform Job\",\"type\":{\"value\":\"SCHEDULED\",\"desc\":\"Scheduled\"},\"category\":{\"value\":\"PUBLISHING\",\"desc\":\"Publishing\"},\"link\":\"/almaws/v1/conf/jobs/S32986800410006421\"},\"link\":\"/almaws/v1/conf/jobs/S32986800410006421/instances/36742489680006421\"}}"
    ],
    "v": 1
}
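
The payload in the log entry above is hard to read as one escaped string. A minimal Ruby sketch for pulling out the job ID, status, and record counters, assuming each entry has a "t" array of "key:value" tags with the Alma webhook JSON carried in the "body:" tag, as shown here; `job_summary` is a hypothetical helper, not part of bibdata, and the sample entry is trimmed from the log above:

```ruby
require "json"

# Hypothetical helper: summarize the Alma JOB_END payload carried in the
# "body:" tag of a log entry shaped like the one above.
def job_summary(log_entry)
  body_tag = log_entry.fetch("t", []).find { |tag| tag.start_with?("body:") }
  return nil unless body_tag

  payload = JSON.parse(body_tag.sub(/\Abody:/, ""))
  instance = payload.fetch("job_instance")
  # Map each counter description to its integer value.
  counters = instance.fetch("counter", []).map do |counter|
    [counter.dig("type", "desc"), counter["value"].to_i]
  end.to_h

  {
    job_id: instance["id"],
    status: instance.dig("status", "value"),
    end_time: instance["end_time"],
    counters: counters
  }
end

# Trimmed copy of the entry above; only a few fields are kept.
entry = {
  "m" => "alma.webhook.action",
  "t" => [
    "action:JOB_END",
    'body:{"job_instance":{"id":"36742489680006421","status":{"value":"COMPLETED_SUCCESS"},' \
    '"end_time":"2024-04-30T18:49:07.503Z","counter":[{"type":{"desc":"Deleted Records"},"value":"12"}]}}'
  ]
}

summary = job_summary(entry)
puts summary[:job_id]
puts summary[:counters]["Deleted Records"]
```

The job instance ID in the payload is the same number that appears in the incremental filenames, so a summary like this could be lined up against the files expected on the share.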

kevinreiss commented 2 months ago

When I compare the last six days of file transfers, the daily counts on lib-sftp-prod1 and on the /data/bibdata_files share in prod bibdata match. I think fully deactivating the old prod app servers has taken care of this issue. We'll have to run a full re-index to get everything caught up.