Closed jrochkind closed 6 years ago
We discussed the questions on Slack and agreed on the following:
Additionally, admins (but nobody else) still need some way to download the true original.
@sanfordd the buckets for DZI use the pattern, I believe, `chf-dzi-dev`, `chf-dzi-staging`, `chf-dzi-dev`.
The one you created for this is `chf-dev-derivative`, reversed order of 'service level' and function.
Should we make them consistent instead to make things less confusing and more predictable?
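If the names were made consistent, one way to keep them predictable would be to build them in a single place. This is only a sketch: the `bucket_name` helper and its `prefix:` argument are hypothetical, and it assumes the DZI ordering (function first, then service level) is the one to standardize on.

```ruby
# Hypothetical helper: builds a bucket name as <prefix>-<function>-<service level>,
# which is the ordering the DZI buckets appear to use.
def bucket_name(function, service_level, prefix: "chf")
  [prefix, function, service_level].join("-")
end

bucket_name("dzi", "dev")        # => "chf-dzi-dev"
bucket_name("derivative", "dev") # => "chf-derivative-dev"
```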
Plan to use GraphicsMagick. It seems to be faster than IM (ImageMagick). We'll call it directly from the command line: that seems to allow much more efficient underlying calls than going through MiniMagick, or at least more straightforward ones. MiniMagick is only calling command line IM/GM itself anyway.
vips might be great, but there aren't as many docs available for it on Google, and it's somewhat harder to get running with my knowledge base.
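A rough sketch of what a direct GraphicsMagick call could look like from Ruby. The method names, the default quality of 85, and the choice of `-resize` are assumptions for illustration, not the options the job will actually use. The command is built as an array so it can be handed to `Kernel#system` without going through a shell.

```ruby
# Builds the argument list for a direct `gm convert` call that resizes an
# image to a target width and writes the result to dest_path.
# (Hypothetical helper; the flags chosen here are illustrative.)
def gm_resize_command(source_path, dest_path, width:, quality: 85)
  [
    "gm", "convert",
    source_path,
    "-resize", "#{width}x",    # width only; height follows the aspect ratio
    "-quality", quality.to_s,
    dest_path
  ]
end

# Runs the command. Raises if GraphicsMagick exits non-zero or cannot be
# executed at all (system returns false or nil in those cases).
def create_derivative(source_path, dest_path, width:)
  cmd = gm_resize_command(source_path, dest_path, width: width)
  system(*cmd) or raise "GraphicsMagick failed: #{cmd.join(' ')}"
end
```

Keeping the command construction separate from the call that runs it makes the argument list easy to check in a test without GraphicsMagick installed.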
Disk removed.
Refinement and further specification of part of #710.
This task will create additional JPG download options, a compressed TIFF download option, and better thumbnail derivatives as well (including multiple resolutions for srcset delivery).
(This dev task does not include multi-image derivatives like ZIP or multi-page PDF)
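To illustrate the srcset part, here is a minimal sketch of turning a set of thumbnail widths into a `srcset` attribute value. The `srcset_value` helper, the widths, and the example.org URLs are all made up for the example.

```ruby
# Hypothetical helper: takes a hash of { width in pixels => derivative URL }
# and joins the entries into an HTML srcset value, smallest width first.
def srcset_value(urls_by_width)
  urls_by_width
    .sort_by { |width, _url| width }
    .map { |width, url| "#{url} #{width}w" }
    .join(", ")
end

srcset_value({
  1000 => "https://example.org/thumb_1000.jpg",
  500  => "https://example.org/thumb_500.jpg"
})
# => "https://example.org/thumb_500.jpg 500w, https://example.org/thumb_1000.jpg 1000w"
```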
We will rewrite the stack create thumbnails job to:
Deployment plan:
- [x] Deploy the `customized_derivatives_architecture` branch, with following local_env (has to be changed on app AND jobs server). New code, but old-style derivatives are still in use.
- [x] Run `rake chf:derivatives:s3:create` on jobs-prod, to create all new style derivatives. Will take about 24 hours. Remember to use `screen`.
- [x] Change production local_env to `create_derivatives_mode: dzi_s3`. Now creating and showing new derivs. Any items created between the previous step and now won't have new style derivs, so now run `rake chf:derivatives:s3:create[lazy]` to create 'em.
- [x] If all is looking good, merge `customized_derivatives_architecture` to master, redeploy.
- [x] Delete all old style derivatives in /var/sufia, reclaim disk space. You are done.
We will provide rake tasks that safely create derivatives, either en masse or for specified IDs. Such tasks will also optionally either force creation of all derivatives (say, if derivative configuration has changed), or work lazily and only create the specific derivatives that don't exist yet.
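The lazy versus forced behavior could be sketched roughly like this. The `derivatives_to_create` method and the derivative names are hypothetical, and the sketch assumes the task already knows which derivatives exist for an asset.

```ruby
# wanted:   names of derivatives the current configuration calls for
# existing: names of derivatives already present for this asset
# lazy:     true  => only create what is missing
#           false => recreate everything (e.g. after a config change)
def derivatives_to_create(wanted, existing, lazy:)
  return wanted.to_a unless lazy
  wanted.reject { |name| existing.include?(name) }
end

wanted   = %w[thumb_small thumb_large download_medium] # made-up names
existing = %w[thumb_small]

derivatives_to_create(wanted, existing, lazy: true)
# => ["thumb_large", "download_medium"]
derivatives_to_create(wanted, existing, lazy: false)
# => ["thumb_small", "thumb_large", "download_medium"]
```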
We will of course then have to change our front-end to find derivatives on S3 instead of local file system. (Previous abstraction work we did should make this easier, at least for user-facing front-end, admin screens could be harder, and we may have to keep creating old-style derivatives for admin screens)
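As a sketch of what the front-end lookup might reduce to, the helper below builds a public S3 URL for a derivative. The key layout (`<asset id>/<derivative name>.<ext>`), the `s3_derivative_url` name, and the sample asset ID are assumptions, not the layout the app uses. Only the bucket name comes from the discussion above.

```ruby
# Hypothetical helper: builds a virtual-hosted-style S3 URL for a derivative.
# Assumes keys are laid out as "<asset id>/<derivative name>.<ext>".
def s3_derivative_url(bucket, asset_id, derivative_name, ext: "jpg")
  "https://#{bucket}.s3.amazonaws.com/#{asset_id}/#{derivative_name}.#{ext}"
end

s3_derivative_url("chf-dev-derivative", "abc123", "thumb_small")
# => "https://chf-dev-derivative.s3.amazonaws.com/abc123/thumb_small.jpg"
```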
Moving derivative creation to a separate server and moving derivative serving to S3 should have performance benefits for our app. We also think S3 should, over the long haul, be cheaper than storing derivatives on an EBS volume as our number of derivatives/bytes scales up.
Note on auth: as we did for DZI, we are not providing access controls to derivatives on S3. This means someone could download a thumbnail or even the original for an asset even if it is marked private in sufia, if they find the URL. While the URLs won't generally be advertised/available, we know this is not actually 'security'. We think this is fine for now, as our non-public images are only temporarily so as works in progress, and not truly confidential or private. See the discussion of auth for DZI on S3 at https://github.com/chemheritage/chf-sufia/blob/c94f4fcf578ccc31b22620116f53e1ad71d0d95f/docs/dzi_tiles_on_s3.md#auth----not-yet , which applies to this too.
Some UI/UX questions
Should we deliver a compressed (LZW or ZIP) TIFF? YES, at least initially we'll try it; the only concern is the cost of S3 storage of so many bytes. (In the end, not doing this for this release.)
Should clicking on JPGs force download instead of displaying in browser? (yes)
Do we want to make better file names for 'save as' downloads? How should those filenames be generated? (yes, we are doing somewhat friendlier filenames with first three words of work)
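A minimal sketch of the last two answers, assuming "first three words of work" means the first three words of the work's title. The helper names, the suffix argument, and the sample title are invented for illustration. Forcing a download relies on the standard `Content-Disposition: attachment` header; how that header gets attached to the S3 response is left open here.

```ruby
# Hypothetical helper: builds a download filename from the first three words
# of a work title, plus a suffix naming the derivative.
def friendly_filename(title, suffix, ext)
  words = title.downcase.gsub(/[^a-z0-9\s]/, "").split.first(3)
  [*words, suffix].join("_") + ".#{ext}"
end

# Content-Disposition value that asks the browser to save the file under the
# given name instead of displaying it inline.
def attachment_disposition(filename)
  %(attachment; filename="#{filename}")
end

name = friendly_filename("Glass Distillation Apparatus, Illustrated Catalog", "small", "jpg")
# => "glass_distillation_apparatus_small.jpg"
attachment_disposition(name)
# => "attachment; filename=\"glass_distillation_apparatus_small.jpg\""
```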