sciencehistory / scihist_digicoll

Science History Institute Digital Collections

`No such file or directory - exiftool` #2518

Closed honeybadger[bot] closed 8 months ago

honeybadger[bot] commented 8 months ago

https://digital.sciencehistory.org/admin/asset_files/r1zgfze couldn't be characterized; it appears exiftool is missing in prod.

job_class: Kithe::AssetPromoteJob
job_id: 75690991-7d7f-4366-81b0-52a3a2bc76e4
provider_job_id: 
queue_name: default
priority: 
arguments:
- AssetUploader::Attacher
- Asset
- 718e0267-4bcf-4598-b1f9-4e584d00ff7a
- file
- id: 542b9670d77899f59b87515641a5bb0d.mp4
  storage: cache
  _aj_symbol_keys: []
- _aj_symbol_keys: []
executions: 1
exception_executions:
  "[StandardError]": 2
locale: en
timezone: US/Eastern
enqueued_at: '2024-01-26T14:28:00.328480009Z'
scheduled_at: '2024-01-26T14:28:00.328170914Z'

View full backtrace and more info at honeybadger.io

eddierubeiz commented 8 months ago

Tagging @archivistsarah so she can follow along.

eddierubeiz commented 8 months ago

This might be the first time we have invoked exiftool since https://github.com/sciencehistory/scihist_digicoll/pull/2515.

eddierubeiz commented 8 months ago

$ heroku run bash --app scihist-digicoll-production
Running bash on ⬢ scihist-digicoll-production... up, run.2622 (Standard-1X)
~ $ exiftool
bash: exiftool: command not found

eddierubeiz commented 8 months ago

I ran all the "is present" tests in https://github.com/sciencehistory/scihist_digicoll/blob/master/system_env_spec/system_env_spec.rb and the only utility we're missing is exiftool.
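
For anyone following along, the kind of "is present" check mentioned above can be sketched roughly as follows. This is only an illustration and is not copied from system_env_spec.rb: the helper name `missing_tools` is hypothetical, and it assumes a tool counts as present when an executable file with that name exists in some PATH directory.

```ruby
# Hypothetical helper, not taken from scihist_digicoll.
# Returns the subset of tool names that cannot be found as an executable
# file in any directory listed in the given PATH string.
def missing_tools(tools, path: ENV.fetch("PATH", ""))
  dirs = path.split(File::PATH_SEPARATOR).reject(&:empty?)
  tools.reject do |tool|
    dirs.any? do |dir|
      candidate = File.join(dir, tool)
      File.file?(candidate) && File.executable?(candidate)
    end
  end
end

# Example: warn on stderr if exiftool is not on the current PATH.
missing = missing_tools(%w[exiftool])
warn "Missing from PATH: #{missing.join(', ')}" unless missing.empty?
```

Run on a dyno (for example inside a heroku run bash session), a check like this would only report what that one dyno sees, so it may need repeating on a worker dyno.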

eddierubeiz commented 8 months ago

I was going to start rolling back from Ruby 3.2.3 to 3.2.2 when I discovered, to my surprise, that production is running 3.2.2, not 3.2.3. With master deployed to staging, we have:

STAGING:

Running ruby --version && exiftool -ver on ⬢ scihist-digicoll-staging... up, run.8006 (Standard-1X)
wait parent timeout: Operation not permitted
ruby 3.2.3 (2024-01-18 revision 52bb2ac0a6) [x86_64-linux]
12.70

PRODUCTION:

Running ruby --version && exiftool -ver on ⬢ scihist-digicoll-production... up, run.7931 (Standard-1X)
wait parent timeout: Operation not permitted
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
/bin/bash: line 1: exiftool: command not found

eddierubeiz commented 8 months ago

I am therefore going to deploy the master branch to production without further investigation.

eddierubeiz commented 8 months ago

And there we are:

Running ruby --version && exiftool -ver on ⬢ scihist-digicoll-production... up, run.1399 (Standard-1X)
wait parent timeout: Operation not permitted
ruby 3.2.3 (2024-01-18 revision 52bb2ac0a6) [x86_64-linux]
12.70

eddierubeiz commented 8 months ago

Retried the failed job, and this time it worked. https://digital.sciencehistory.org/admin/asset_files/r1zgfze has derivatives.

eddierubeiz commented 8 months ago

Sarah deleted the above file and replaced it with https://digital.sciencehistory.org/admin/works/re9xt8j, btw.

jrochkind commented 8 months ago

OK, we intend to have exiftool installed by a custom buildpack:

https://github.com/fnando/heroku-buildpack-exiftool

(also documented in the wiki)

That buildpack IS in our heroku buildpacks on both production and staging, which are currently identical, as they should be:

$ heroku buildpacks -r staging

1. heroku/metrics
2. https://github.com/gaffneyc/heroku-buildpack-jemalloc.git
3. heroku/python
4. https://github.com/heroku/heroku-buildpack-activestorage-preview
5. heroku-community/apt
6. https://github.com/brandoncc/heroku-buildpack-vips
7. https://github.com/fnando/heroku-buildpack-exiftool
8. heroku/ruby

Yet Eddie's report suggests that exiftool's presence is flip-flopping on our heroku dynos: it can go away, and restarting or re-deploying can bring it back.

This definitely does not match our understanding of how heroku infrastructure ought to work. If nothing else, a build ought to be reproducible when nothing has changed!

Unless there are changes happening that we are not identifying.

As I write this, I am using heroku run bash on both prod and staging to confirm exiftool is there. It currently shows up on both, identically:

~ $ exiftool -ver
12.70
~ $ which exiftool
/app/vendor/exiftool/exiftool
~ $ echo $PATH
/app/bin:/app/vendor/vips/bin:/app/vendor/yarn-v1.22.19/bin/:/app/bin:/app/vendor/bundle/bin:/app/vendor/bundle/ruby/3.2.0/bin:/app/.heroku/python/bin:/app/vendor/exiftool:/app/.apt/usr/bin:/app/.heroku/activestorage-preview/bin:/usr/local/bin:/usr/bin:/bin:/app/vendor/jemalloc/bin

The actual error was likely on a worker dyno, so just to make extra certain, I connected to a currently running worker dyno with: heroku ps:exec --dyno=worker.1 -r production

Same result, exiftool is there.

Ran heroku ps:restart worker -r production

ps:exec into worker dynos again -- exiftool is there!

Just to be sure, I also ingested a sample TIFF and mp4 to a test object in production -- it ingested properly, with exiftool-supplied technical metadata showing.

This may have been an odd unreproducible glitch where one particular build of our heroku dynos somehow didn't end up with exiftool? This is still discouraging. We don't want such glitches on heroku; we want everything to always be reproducible (same results every time if there is no change to the system!).

But, we may just have to hope this doesn't happen again? And know that a dyno restart and/or re-deploy will probably fix it if it does?

eddierubeiz commented 8 months ago

If this recurs, try debugging by logging directly into the affected Heroku dyno (for a worker, heroku ps:exec --dyno=worker.1 -r production, as used above).
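
Once logged in, a small diagnostic along these lines might help capture what the dyno actually sees. It is a sketch under assumptions: `tool_version` and `tool_report` are hypothetical names, not existing code in this repo, and the only exiftool flag used is -ver, as in the transcripts above. Spawning a missing command from Ruby raises Errno::ENOENT, which is the error in this issue's title, so the sketch rescues it and reports the PATH instead.

```ruby
require "open3"

# Hypothetical diagnostic helper, not part of scihist_digicoll.
# Runs the command and returns its trimmed output, or nil when the
# executable cannot be found or exits with a non-zero status.
def tool_version(command, *args)
  output, status = Open3.capture2e(command, *args)
  status.success? ? output.strip : nil
rescue Errno::ENOENT
  # Raised when the executable is not on PATH at all.
  nil
end

# Builds a one-line report that includes PATH when the tool is unavailable,
# so a missing vendor directory is easier to spot.
def tool_report(command, *args)
  version = tool_version(command, *args)
  if version
    "#{command} #{version}"
  else
    "#{command} not found; PATH=#{ENV['PATH']}"
  end
end

puts tool_report("exiftool", "-ver")
```

Comparing the reported PATH against the one shown earlier in this thread (which includes /app/vendor/exiftool) could indicate whether the buildpack's directory is missing or merely not on the PATH.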