sciencehistory / chf-sufia

sufia-based hydra app

PDF thumbnail conversion command is failing #1168

Closed eddierubeiz closed 6 years ago

eddierubeiz commented 6 years ago

Several oral history PDFs (r494vm198; gh93h041p; n583xw033; 8623hz61d; 5d86p124w; rn3012327) are causing create_derivatives_on_s3_service to fail.

The failing command is: convert -density 400 -colorspace sRGB -thumbnail 206x -unsharp 0x3.0 -define jpeg:size\=416x -alpha remove -quality 85 -bordercolor \#050939 -border 1 /tmp/chf-sufia/derivatives-working/fileset_v692t709j_20181012-10097-130gwwz/original\[0\] /tmp/chf-sufia/derivatives-working/fileset_v692t709j_20181012-10097-130gwwz/thumb_standard.jpg
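The backslashes in the command above (`\=`, `\#`, `\[0\]`) are shell escaping, and the `[0]` suffix asks ImageMagick for the first page of the PDF only. As a rough sketch for reproducing the failure by hand, the same arguments can be built as an array so no manual escaping is needed. The method name `thumbnail_args` and the example paths are hypothetical, not part of chf-sufia.

```ruby
require "shellwords"

# Hypothetical helper (not from chf-sufia): builds the argument list for the
# thumbnail command quoted in this issue. The option values are copied from
# the failing command; only the input and output paths vary.
def thumbnail_args(input_path, output_path)
  [
    "convert",
    "-density", "400",
    "-colorspace", "sRGB",
    "-thumbnail", "206x",
    "-unsharp", "0x3.0",
    "-define", "jpeg:size=416x",
    "-alpha", "remove",
    "-quality", "85",
    "-bordercolor", "#050939",
    "-border", "1",
    "#{input_path}[0]", # [0] selects the first page of the PDF
    output_path
  ]
end

# Example (paths are placeholders):
#   args = thumbnail_args("/tmp/original", "/tmp/thumb_standard.jpg")
#   puts Shellwords.join(args)   # shell-escaped form, for pasting into a terminal
#   system(*args)                # runs convert directly, without a shell
```

Passing the array to `system(*args)` skips the shell entirely, which makes it easier to tell an escaping problem from a genuine ImageMagick failure.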

The Honeybadger error is: https://app.honeybadger.io/projects/53196/faults/39894799

eddierubeiz commented 6 years ago

Imagemagick doesn't seem to be using ghostscript.

I suspect we just need to recompile ImageMagick with --with-gslib=yes.

$ convert -list configure | grep DISTCHECK_CONFIG_FLAGS

DISTCHECK_CONFIG_FLAGS 'CFLAGS=-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security' 'CPPFLAGS=-D_FORTIFY_SOURCE=2' 'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro' --disable-deprecated --with-quantum-depth=16 --with-umem=no --with-autotrace=no --with-dps=no --with-fpx=no --with-gslib=no --with-fontpath= --with-gs-font-dir=/usr/share/fonts/type1/gsfonts --with-gvc=no --with-perl=no
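The flag is easy to miss in that long line. A minimal sketch that pulls a single `--flag=value` pair out of the `convert -list configure` output follows. The method name `configure_flag` is made up for illustration, and the check only reports what the build was configured with. Whether `--with-gslib=no` is the actual cause is still the hypothesis under discussion here.

```ruby
# Hypothetical helper (not from chf-sufia): returns the value of one
# "--flag=value" pair found in the text printed by `convert -list configure`,
# or nil when the flag is absent or has an empty value.
def configure_flag(configure_output, flag)
  match = configure_output.match(/--#{Regexp.escape(flag)}=(\S+)/)
  match && match[1]
end

# Example, run on the server where ImageMagick is installed:
#   output = `convert -list configure`
#   puts configure_flag(output, "with-gslib").inspect
```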

eddierubeiz commented 6 years ago

https://gist.github.com/leomelzer/3949356

jrochkind commented 6 years ago

Is it failing on all PDFs, or just certain ones?

Cause I know I tested this on staging (not sure about production). Is ghostscript only needed for some PDFs?

Or has our server environment changed since I tested it maybe?

sanfordd commented 6 years ago

We do have ghostscript on staging, though it's there as part of the Ubuntu default install, not as an explicit package.

eddierubeiz commented 6 years ago

Recompiling imagemagick from scratch on Staging solved the problem, so ghostscript is indeed the problem.

/root/recompile_imgmagick.sh

I re-ingested one of the PDFs and thumbnails were created properly: https://staging.digital.sciencehistory.org/works/k0698835k

eddierubeiz commented 6 years ago

You can regenerate derivatives on a fileset like this:

(become root)

$ cd /opt/sufia-project/current
$ bundle exec rails c production

irb(main) > file_set = FileSet.find("z029p569t")
irb(main) > file_set_id = file_set.files[0].id
irb(main) > CHF::CreateDerivativesOnS3Service.new(file_set, file_set_id).call

I fixed these two in this way: https://staging.digital.sciencehistory.org/works/8336h285x https://staging.digital.sciencehistory.org/works/ms35t9669
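When several filesets need the same treatment, the console steps above can be wrapped in a loop. This is only a sketch: `regenerate_derivatives` is a hypothetical helper, and it assumes, as in the console session, that the first file of each fileset is the original and that the service takes `(file_set, file_id)`. The finder and service class are passed in so the loop does not hard-code either.

```ruby
# Hypothetical helper (not from chf-sufia): repeats the console steps for each
# fileset ID and returns a hash of { id => error message } for any that failed,
# so one bad fileset does not stop the rest.
def regenerate_derivatives(file_set_ids, finder:, service_class:)
  failures = {}
  file_set_ids.each do |id|
    begin
      file_set = finder.call(id)
      file_id = file_set.files[0].id # same lookup as in the console session
      service_class.new(file_set, file_id).call
    rescue StandardError => e
      failures[id] = e.message
    end
  end
  failures
end

# Example, inside `bundle exec rails c production` (the ID is the one used above):
#   failures = regenerate_derivatives(
#     ["z029p569t"],
#     finder: FileSet.method(:find),
#     service_class: CHF::CreateDerivativesOnS3Service
#   )
#   puts failures.inspect
```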

jrochkind commented 6 years ago

Sweet! Is the imagemagick compilation issue fixed in the ansible build, so future ingests will work?

eddierubeiz commented 6 years ago

Nope! I also haven't touched the problem in production yet. We can confer today about what steps to take next. I didn't want to go too crazy without consulting yous. For now I'm just going to turn /root/recompile_imgmagick.sh into an Ansible role.

eddierubeiz commented 6 years ago

See https://bitbucket.org/ChemicalHeritageFoundation/ansible-inventory/branch/compile_imagemagick_from_scratch for an Ansible role that compiles Imagemagick from scratch.

eddierubeiz commented 6 years ago

I adapted the imagemagick role (with advice and support from Dan; thanks!) and in the process updated ImageMagick to 7.0.?. We ran the update in production this morning and the convert command works again with PDFs. I then went back and regenerated the derivatives for the following oral history PDFs: https://digital.sciencehistory.org/works/r494vm198 https://digital.sciencehistory.org/works/gh93h041p https://digital.sciencehistory.org/works/n583xw033 https://digital.sciencehistory.org/works/8623hz61d https://digital.sciencehistory.org/works/5d86p124w https://digital.sciencehistory.org/works/rn3012327

eddierubeiz commented 6 years ago

The end.