ucsdlib / damsmanager

DAMS Manager

PDF derivatives showing up as completely black upon ingest #114

Closed remerjohnson closed 7 years ago

remerjohnson commented 7 years ago

We have a problem where dark PDFs get completely black derivatives generated upon ingest. If you click through to open the PDF in the viewer, it shows the correct document.

Initially, we suspected this was because the size of the PDFs was too large, but this problem still occurs with these much smaller PDFs.

The only pattern I can see from the ingest is that the problematic PDFs are very dark by nature.

Is there possibly a setting with the PDF generating library that we can tweak to not be so sensitive to the amount of black in PDFs? Thanks.

screenshot-1

lsitu commented 7 years ago

@rstanonik I created the thumbnail for PDF http://library.ucsd.edu/dc/object/bb1504296t on my Mac and I don't see the problem (see attachment). Do you have any thoughts regarding this issue? I see the GS version on QA is "GPL Ghostscript 8.70 (2009-07-31)", which is old. Could we install a newer version of GS on QA to see how it goes? Thanks.

Here are the versions of ImageMagick and GS, and the convert command I use:

$ convert -version
Version: ImageMagick 6.9.3-0 Q16 x86_64 2016-02-23 http://www.imagemagick.org

$ gs -v
GPL Ghostscript 9.18 (2015-10-05)
Copyright (C) 2015 Artifex Software, Inc. All rights reserved.

$ convert -auto-orient -resize 450x450 bb1504296t_1.pdf[0] bb1504296t_1.jpg

bb1504296t_1
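For anyone comparing hosts the same way, here is a minimal sketch of checking whether an installed Ghostscript is older than a known-good one. The function names `gs_banner_version` and `version_older_than` are made up for illustration, and using 9.18 as the reference is only an assumption based on the version that worked on the Mac above, not a documented minimum.

```shell
#!/bin/sh
# Extract the version number from a Ghostscript banner line such as
# "GPL Ghostscript 9.18 (2015-10-05)". Prints nothing if no version is found.
gs_banner_version() {
    printf '%s\n' "$1" | sed -n 's/.*Ghostscript \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed (exit status 0) when version $1 is strictly older than version $2.
# Assumes two-part dotted versions such as 8.70 or 9.18, compared numerically
# field by field.
version_older_than() {
    [ "$1" = "$2" ] && return 1
    oldest=$(printf '%s\n%s\n' "$1" "$2" | sort -t. -k1,1n -k2,2n | head -n 1)
    [ "$oldest" = "$1" ]
}

# Hypothetical usage on a host with Ghostscript installed:
#   installed=$(gs_banner_version "$(gs -v | head -n 1)")
#   if version_older_than "$installed" 9.18; then
#       echo "Ghostscript $installed is older than 9.18"
#   fi
```

With the two banners quoted in this thread, the QA version (8.70) compares as older than the Mac version (9.18).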

lsitu commented 7 years ago

@rstanonik Have you had a chance to look into upgrading Ghostscript? Thanks.

gamontoya commented 7 years ago

@lsitu Heard from Ron and he said he hasn't been following Github issues, but he will look for an updated version of GhostScript as you suggested.

Can you ping Ron early next week to remind him to look into this? I'll try to as well.

lsitu commented 7 years ago

@gamontoya Sure. Thanks for tracking this down. Sorry @rstanonik, I hadn't realized that, but we've been using GitHub issues for a little while now.

rstanonik commented 7 years ago

I replaced Ghostscript on lib-hydratail-qa, lib-hydratail-staging, and lib-hydratail-prod and confirmed that the output of

convert -auto-orient -resize 450x450 bb1504296t_1.pdf[0] bb1504296t_1.jpg

was no longer just a black box.

Give it a try and let me know.

Ron
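A rough way to confirm a result like this without opening each image is to measure the thumbnail with ImageMagick. The sketch below is only an illustration: `looks_solid_black` and `check_thumbnail` are hypothetical helper names, the 0.01 threshold is an assumed value, and it relies on the `mean` and `standard_deviation` statistics of ImageMagick's `%[fx:...]` escapes being available in the installed version. It tests both values because a PDF that is dark by nature should still show some variation, while a solid black box should not.

```shell
#!/bin/sh
# Succeed when both the mean brightness ($1) and the standard deviation ($2),
# each on ImageMagick's normalized 0..1 scale, fall below the threshold $3.
# The default threshold of 0.01 is an assumption, not a tuned value.
looks_solid_black() {
    awk -v mean="$1" -v sd="$2" -v limit="${3:-0.01}" \
        'BEGIN { exit !(mean < limit && sd < limit) }'
}

# Measure one thumbnail with ImageMagick and report whether it looks solid black.
# Returns 0 if it looks fine, 1 if it looks solid black, 2 if convert failed.
check_thumbnail() {
    stats=$(convert "$1" -colorspace Gray \
        -format '%[fx:mean] %[fx:standard_deviation]' info:) || return 2
    mean=${stats% *}   # text before the space
    sd=${stats#* }     # text after the space
    if looks_solid_black "$mean" "$sd"; then
        echo "$1: looks solid black"
        return 1
    fi
    echo "$1: ok"
}

# Hypothetical usage:
#   check_thumbnail bb1504296t_1.jpg
```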

gamontoya commented 7 years ago

@lsitu Looks like you're good to go to test out now.

lsitu commented 7 years ago

Thanks @rstanonik. That sounds promising. @gamontoya I am on it now.

lsitu commented 7 years ago

@gamontoya, @remerjohnson, @rstanonik I've created a couple of examples on QA and both look good:
http://libraryqa.ucsd.edu/dc/object/bd1381047p
http://libraryqa.ucsd.edu/dc/object/bd7251406d

I think we can move ahead to make the change on staging and prod. What do you think?

remerjohnson commented 7 years ago

@lsitu Would you need a list of ARKs of the corrupt images to replace?

lsitu commented 7 years ago

@remerjohnson How about just replacing the thumbnail/derivatives for all 45 objects in that collection?

remerjohnson commented 7 years ago

@lsitu Ah... okay, that would be a way to catch everything

remerjohnson commented 7 years ago

@rstanonik I think we are good to go. Do you need anything else from us?

lsitu commented 7 years ago

@rstanonik Could we upgrade GS on staging and prod as well when you get a chance? Please let me know once it's done and I'll regenerate the PDF thumbnails on prod. Thanks.
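The regeneration itself is done through DAMS Manager, and this thread does not show how. Purely as an illustration of the same convert command applied to a local directory of PDFs outside DAMS Manager, a batch loop might look like the sketch below. `thumbnail_name` and `regenerate_thumbnails` are hypothetical names, and the output naming (same path with a .jpg extension) is an assumption taken from the example command earlier in the thread.

```shell
#!/bin/sh
# Map a PDF path to a thumbnail path by swapping the .pdf extension for .jpg,
# matching the naming in the convert command quoted in this thread.
thumbnail_name() {
    printf '%s\n' "${1%.pdf}.jpg"
}

# Rebuild a 450x450 thumbnail from the first page of every PDF in directory $1.
regenerate_thumbnails() {
    for pdf in "$1"/*.pdf; do
        [ -e "$pdf" ] || continue   # the directory contained no PDFs
        convert -auto-orient -resize 450x450 "${pdf}[0]" "$(thumbnail_name "$pdf")" \
            || echo "failed: $pdf" >&2
    done
}

# Hypothetical usage:
#   regenerate_thumbnails /path/to/local/pdfs
```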

mcritchlow commented 7 years ago

@remerjohnson @lsitu @rstanonik - Is #119 the remaining work here? Or do we still need this ticket open for further work/testing?

lsitu commented 7 years ago

@mcritchlow For this ticket, I think we still need to update GS on staging and prod, and then I'll recreate the thumbnails for the affected PDFs. We'll deal with #119 as a separate ticket. What do you think, @remerjohnson @rstanonik?

lsitu commented 7 years ago

Thanks @rstanonik for upgrading GS on staging and prod. @remerjohnson, I've recreated the thumbnails for the PDFs in the collection UC San Diego General Catalog. It seems that all of the black thumbnails are gone. Could you take a look?

remerjohnson commented 7 years ago

@lsitu Yup, these look great. Thanks for all your help, and also @rstanonik