sciencehistory / chf-sufia

sufia-based hydra app

Improved single-image derivatives #886

Closed jrochkind closed 6 years ago

jrochkind commented 7 years ago

Refinement and further specification of part of #710.

This task will create additional JPG download options, a compressed TIFF download option, and better thumbnail derivatives (including multiple resolutions for srcset delivery).

(This dev task does not include multi-image derivatives like ZIP or multi-page PDF)
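As a rough illustration of what "multiple resolutions for srcset delivery" could mean on the front-end, here is a minimal sketch that builds an HTML `srcset` attribute value from a set of thumbnail URLs keyed by pixel width. The helper name `srcset_for` and the example URLs are hypothetical and not taken from the app.

```ruby
# Hypothetical helper (not from chf-sufia): given a hash of
# pixel width => derivative URL, build a `srcset` attribute value
# using width descriptors, ordered from smallest to largest.
def srcset_for(urls_by_width)
  urls_by_width
    .sort                                        # sort [width, url] pairs by width
    .map { |width, url| "#{url} #{width}w" }     # e.g. "thumb_500.jpg 500w"
    .join(", ")
end

# Example usage with made-up URLs:
# srcset_for(1000 => "thumb_1000.jpg", 500 => "thumb_500.jpg")
```

The browser then picks whichever width best fits the layout and screen density, so each additional thumbnail size only costs storage, not page weight.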

We will rewrite the stack create thumbnails job to:

  1. Be able to run on a separate server from the app, our jobs server (fetching originals from fedora)
  2. Possibly use vips instead of imagemagick for some operations, or at least make it easy to try out (vips is already installed on our jobs server for dzi creation)
  3. Store derivatives on S3 instead of the local file system
  4. Ensure clean-up of any temporary files created in the process of derivative generation (existing stack code has at least some problems there)
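For the temporary-file clean-up requirement, one common Ruby pattern is to wrap the work in a block and remove the file in an `ensure` clause, so it is deleted even if derivative generation raises. This is only a sketch of that pattern; the helper name `with_temp_original` and its arguments are hypothetical, and fetching the bytes from fedora is left out.

```ruby
require "tempfile"

# Hypothetical helper (not from chf-sufia): write the original's bytes
# to a temp file, yield its path to the caller, and always remove the
# temp file afterwards, whether the block succeeds or raises.
def with_temp_original(bytes, extension: ".tif")
  tmp = Tempfile.new(["original", extension])
  tmp.binmode
  tmp.write(bytes)
  tmp.flush
  yield tmp.path
ensure
  # Tempfile#close! closes the file handle and unlinks the file.
  tmp&.close!
end

# Example usage: the block's return value is passed through.
# with_temp_original(original_bytes) { |path| make_derivatives(path) }
```

Keeping the clean-up in one place means individual derivative steps do not each need to remember to delete their inputs.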

Deployment plan:

New code, but old-style derivatives are still in use.


We will provide rake tasks that safely create derivatives, either en masse or for specified IDs. These tasks can optionally either force creation of all derivatives (say, if derivative configuration has changed) or lazily create only the specific derivatives that don't yet exist.
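The force-versus-lazy choice could be sketched roughly as follows. The function name `derivatives_to_create` and the derivative names in the usage comment are hypothetical; a real task would also need to look up which derivatives already exist for each asset.

```ruby
# Hypothetical helper (not from chf-sufia): decide which derivative
# types to (re)generate for one asset.
#   wanted   - configured derivative names for this asset
#   existing - derivative names already stored
#   force    - when true, regenerate everything regardless of `existing`
def derivatives_to_create(wanted, existing, force: false)
  return wanted.dup if force
  wanted - existing
end

# Example usage with made-up derivative names:
# derivatives_to_create(%w[thumb_small thumb_large dl_full], %w[thumb_small])
# derivatives_to_create(%w[thumb_small], %w[thumb_small], force: true)
```

A rake task could call this once per asset ID and skip the asset entirely when the result is empty.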

We will of course then have to change our front-end to find derivatives on S3 instead of the local file system. (Previous abstraction work we did should make this easier, at least for the user-facing front-end; admin screens could be harder, and we may have to keep creating old-style derivatives for them.)

Moving derivative creation to a separate server and moving derivative serving to S3 should have performance benefits for our app. Over the long haul, we think S3 should also be cheaper than storing derivatives on an EBS volume as our number of derivatives and bytes scales up.

Note on auth: as we did for DZI, we are not providing access controls for derivatives on S3. This means someone could download a thumbnail, or even the original, for an asset marked private in sufia, if they find the URL. While the URLs won't generally be advertised or available, we know this is not actually 'security'. We think this is fine for now, as our non-public images are only temporarily so as works in progress, and not truly confidential or private. See discussion of auth for DZI on S3 at https://github.com/chemheritage/chf-sufia/blob/c94f4fcf578ccc31b22620116f53e1ad71d0d95f/docs/dzi_tiles_on_s3.md#auth----not-yet , which applies to this too.

Some UI/UX questions

MDiMeo commented 7 years ago

We discussed the questions on Slack and agreed on the following:

jrochkind commented 7 years ago

Additionally, admins (but nobody else) still need some way to download the true original.

jrochkind commented 7 years ago

@sanfordd the buckets for DZI use the pattern, I believe, chf-dzi-dev, chf-dzi-staging, chf-dzi-dev.

The one you created for this is chf-dev-derivative, which reverses the order of 'service level' and function.

Should we make them consistent instead to make things less confusing and more predictable?

jrochkind commented 7 years ago

Plan to use GraphicsMagick (GM). It seems to be faster than ImageMagick (IM). We will call it directly on the command line: it seems possible to make much more efficient underlying calls this way than via MiniMagick, or at least to do so more straightforwardly. MiniMagick is only calling command-line IM/GM itself anyway.
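A direct command-line call out might look something like the sketch below, which assumes the `gm` binary is on the PATH. The method names, the default quality of 85, and the choice of options are illustrative assumptions, not the settings used in this app.

```ruby
require "open3"

# Hypothetical helper: build the argument list for a `gm convert`
# resize to a given width (height scales proportionally with "WIDTHx").
# Returning an array lets us run it without going through a shell.
def gm_resize_command(input_path, output_path, width:, quality: 85)
  ["gm", "convert", input_path,
   "-resize", "#{width}x",
   "-quality", quality.to_s,
   output_path]
end

# Hypothetical runner: executes the command and raises with gm's stderr
# if it exits non-zero. Requires GraphicsMagick to be installed.
def run_gm_resize(input_path, output_path, width:, quality: 85)
  cmd = gm_resize_command(input_path, output_path, width: width, quality: quality)
  _stdout, stderr, status = Open3.capture3(*cmd)
  raise "gm failed: #{stderr}" unless status.success?
  output_path
end

# Example usage (paths are made up):
# run_gm_resize("original.tif", "thumb_500.jpg", width: 500)
```

Separating command construction from execution makes it easy to inspect or log exactly what is being run, and to swap in a different tool such as vips later.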

vips might be great, but there is not as much documentation for it to be found via Google, and it is somewhat harder to get running with my knowledge base.

sanfordd commented 6 years ago

Disk removed.