pulibrary / figgy

Valkyrie-based digital repository backend.

PDFs in the OCR folder are not being processed #6403

Closed hackartisan closed 1 month ago

hackartisan commented 1 month ago

User report via lsupport

Sudden priority justification

Staff are blocked from delivering documents to users.

hackartisan commented 1 month ago

It's true, they aren't, and the monitor is not alerting, even though the watcher correctly indicates that it's down:

irb(main):025:0> FileWatcherStatus.new.check!
/opt/figgy/releases/20240514232304/app/checks/file_watcher_status.rb:11:in `check!': FileWatcherStatus::FileWatcherStatusCheckError (FileWatcherStatus::FileWatcherStatusCheckError)

hackartisan commented 1 month ago

Update: I had a typo; this endpoint is correct. It's just not reporting the failure for some reason, even though calling the class directly does raise the error as expected (see above).

The provider configured in Datadog isn't hitting a valid endpoint:

https://github.com/pulibrary/princeton_ansible/pull/4804/files

Not sure what the endpoint should be.

hackartisan commented 1 month ago

https://app.honeybadger.io/projects/53391/faults/107892659

hackartisan commented 1 month ago

Okay, it looks like the way the filewatcher gem works was changed; maybe this is the cause of that "expected hash, got string" error. See the differences between the previous README and the current README:

https://github.com/filewatcher/filewatcher/blob/v1.1.1/README.md

The watch block used to yield filename, event, but now it yields changes, which in turn has to be iterated to get each filename, event pair.
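
To make the difference between the two block shapes concrete, here is a minimal sketch in plain Ruby. The helper name normalize_changes is hypothetical (it is not part of figgy or the filewatcher gem), and the paths and event symbols are illustrative only. It assumes the older shape passes filename and event as two arguments and the newer shape passes a single hash of filename => event, as described above.

```ruby
# Hypothetical helper, not part of figgy or the filewatcher gem.
# Accepts either block-argument shape and returns an array of
# [filename, event] pairs so downstream code only handles one shape.
def normalize_changes(*args)
  if args.length == 1 && args.first.is_a?(Hash)
    # Newer shape (assumed): a single hash of filename => event
    args.first.to_a
  else
    # Older shape (assumed): filename and event as separate arguments
    filename, event = args
    [[filename, event]]
  end
end

# Older-style arguments (illustrative values)
p normalize_changes("/tmp/a.pdf", :created)

# Newer-style arguments (illustrative values)
p normalize_changes({ "/tmp/a.pdf" => :created, "/tmp/b.pdf" => :updated })
```

A watcher block written as `do |*args|` could pass its arguments through such a helper, which would keep the processing code independent of the gem version. Pinning the gem, as was done here, is the simpler fix.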

The update happened a month or so ago https://github.com/pulibrary/figgy/commit/5c8e9bbcd9ff8610e368e2deeceed3dc65da8d42#diff-89cade48462044ee1b672dc5f4c3ec250fbd29effcd8932096a23c1283c6731fR386

The timing, with the Honeybadger error starting just yesterday afternoon, doesn't make much sense, but this still seems to match the error.

hackartisan commented 1 month ago

Rolling back the filewatcher gem stopped the error, but you still have to touch all the files already in the folder so that they no longer look old.

We ran: find . -type f -name "*.pdf" -exec touch {} +
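
For anyone who would rather do the same thing from a Rails console, a rough Ruby equivalent of that find command might look like the sketch below. The helper name touch_pdfs is hypothetical and nothing like it is confirmed to exist in figgy; it only refreshes the modification time of regular files ending in .pdf under a given directory.

```ruby
require "fileutils"

# Hypothetical helper, not part of figgy.
# Refreshes the mtime of every regular *.pdf file under root and
# returns the list of paths it touched.
# Note: unlike find, Dir.glob skips hidden files and directories by default.
def touch_pdfs(root)
  pdfs = Dir.glob(File.join(root, "**", "*.pdf")).select { |path| File.file?(path) }
  FileUtils.touch(pdfs) unless pdfs.empty?
  pdfs
end

# Example call (not executed here): touch_pdfs(".")
```

Touching the files changes their modification time, which is what made the watcher notice them again after the rollback.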

hackartisan commented 1 month ago

the files are getting processed now and I let Tracy know; closing

Also, the monitor started working again when we deployed figgy.