pulibrary / bibdata

Local API for retrieving bibliographic and other useful data from Alma (Ruby 3.2.0, Rails 7.1.3.4)
BSD 2-Clause "Simplified" License

Events are not created for existing alma publishing jobs and files #2450

Status: Closed (christinach closed 3 weeks ago)

christinach commented 3 weeks ago

Expected behavior

Events should be created every two hours.

Actual behavior

Even though there is a successful publishing job in Alma, events are missing from bibdata.

Steps to replicate

See https://bibdata.princeton.edu/events

Impact of this bug

Missing files. Not all changed records are indexed.

Honeybadger link and code snippet, if applicable

Related error: https://app.honeybadger.io/projects/54497/faults/109878060

Implementation notes, if any

After upgrading bibdata to Rails 7.1, we started seeing duplicate events. We updated the Postgres schema.rb and the model validation:

I've checked:

christinach commented 3 weeks ago

Checking journalctl for the bibdata-sqs-poller service:

journalctl -u bibdata-sqs-poller -b | grep 'ActiveRecord::RecordInvalid: Validation failed: Message body has already been taken'

The sqs-poller triggers this error every 30 seconds.

Aug 19 10:09:16 bibdata-worker-prod1 sqspoller[2867048]: ActiveRecord::RecordInvalid: Validation failed: Message body has already been taken
Aug 19 10:09:46 bibdata-worker-prod1 sqspoller[2867209]: ActiveRecord::RecordInvalid: Validation failed: Message body has already been taken
Aug 19 10:10:16 bibdata-worker-prod1 sqspoller[2867368]: ActiveRecord::RecordInvalid: Validation failed: Message body has already been taken
Aug 19 10:10:46 bibdata-worker-prod1 sqspoller[2867541]: ActiveRecord::RecordInvalid: Validation failed: Message body has already been taken
Aug 19 10:11:16 bibdata-worker-prod1 sqspoller[2867711]: ActiveRecord::RecordInvalid: Validation failed: Message body has already been taken
Aug 19 10:11:46 bibdata-worker-prod1 sqspoller[2867862]: ActiveRecord::RecordInvalid: Validation failed: Message body has already been taken
Aug 19 10:12:16 bibdata-worker-prod1 sqspoller[2868036]: ActiveRecord::RecordInvalid: Validation failed: Message body has already been taken
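
The 30-second cadence can be checked directly from the journal timestamps. The following is a minimal sketch, not part of bibdata: the method names `parse_timestamp` and `error_intervals` are hypothetical, and because the log lines carry no year, an arbitrary placeholder year is assumed purely so the timestamps can be parsed.

```ruby
require "time"

# First three lines copied from the journalctl output above.
LOG_LINES = [
  "Aug 19 10:09:16 bibdata-worker-prod1 sqspoller[2867048]: ActiveRecord::RecordInvalid: Validation failed: Message body has already been taken",
  "Aug 19 10:09:46 bibdata-worker-prod1 sqspoller[2867209]: ActiveRecord::RecordInvalid: Validation failed: Message body has already been taken",
  "Aug 19 10:10:16 bibdata-worker-prod1 sqspoller[2867368]: ActiveRecord::RecordInvalid: Validation failed: Message body has already been taken"
].freeze

# Parse the leading syslog-style timestamp ("Aug 19 10:09:16").
# ASSUMPTION: the year is a placeholder; it is not present in the log.
def parse_timestamp(line, year: 2000)
  Time.strptime("#{year} #{line[0, 15]}", "%Y %b %d %H:%M:%S")
end

# Seconds elapsed between each pair of consecutive log lines.
def error_intervals(lines)
  lines.map { |line| parse_timestamp(line) }
       .each_cons(2)
       .map { |earlier, later| (later - earlier).to_i }
end

puts error_intervals(LOG_LINES).inspect
# prints [30, 30]
```

A steady interval like this suggests the failure is tied to a fixed polling or retry schedule rather than to sporadic events, although the thread does not confirm which.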

christinach commented 3 weeks ago

The events and files finally arrived and were indexed. I'm not sure yet whether that was because I restarted the sqs poller service or because of a delay in the AWS SQS queue (according to the AWS documentation, a delay can happen for several reasons). I do see 34 messages in the AWS dead letter queue, but I don't know yet whether I should 'redrive' them to the production queue. I first want to make sure that doing so doesn't add extra cost; I read a reference saying there is a cost if the number of messages exceeds a certain threshold per month. Closing this ticket in favor of a new one where we document how to access these dead letter queue messages from the AWS console before sending them to another queue.
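
One possible reading of the repeated "Message body has already been taken" error is that the same message is redelivered and rejected by the uniqueness validation each time, so it is never acknowledged. With a redrive policy configured, SQS moves such messages to the dead letter queue once they exceed the maximum receive count. Whether that is what happened here is not established in this thread. The sketch below only illustrates the idea of treating a duplicate as already handled; `InMemoryEventStore` and `handle_message` are hypothetical stand-ins, not bibdata's model or poller.

```ruby
require "set"
require "digest"

# HYPOTHETICAL stand-in for an events table with a uniqueness
# constraint on the message body. Not bibdata's actual model.
class InMemoryEventStore
  class DuplicateMessage < StandardError; end

  def initialize
    @digests = Set.new
  end

  # Stores a digest of the body; raises if the same body was seen before.
  def create!(message_body)
    digest = Digest::SHA256.hexdigest(message_body)
    unless @digests.add?(digest)
      raise DuplicateMessage, "Message body has already been taken"
    end
    digest
  end

  def size
    @digests.size
  end
end

# Returns :created for a new message and :duplicate for a repeat.
# In both cases the caller could acknowledge (delete) the message,
# so a redelivered duplicate is not retried indefinitely.
def handle_message(store, message_body)
  store.create!(message_body)
  :created
rescue InMemoryEventStore::DuplicateMessage
  :duplicate
end

store = InMemoryEventStore.new
puts handle_message(store, '{"job":"example"}') # prints created
puts handle_message(store, '{"job":"example"}') # prints duplicate
```

The trade-off of this approach is that a genuine duplicate is dropped silently, so logging the `:duplicate` outcome would be worth considering before adopting anything like it.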

christinach commented 3 weeks ago

Closing this ticket in favor of a new one (coming soon) where we add documentation on how to access these AWS dead letter queue messages and how to troubleshoot the AWS SQS queue. See the AWS dead letter queue docs.