denshoproject / ddr-public

Web UI for publishing DDR collections.

Make full size image download urls point to a densho.org subdomain, instead of B2 bucket #192

Closed GeoffFroh closed 1 year ago

GeoffFroh commented 2 years ago

Full-size image downloads are served out of B2, through the CF CDN edge service. Currently, the URLs point directly at the B2 bucket. E.g.,

https://f001.backblazeb2.com/file/densho-public/ddr-hmwf-1/ddr-hmwf-1-15-mezzanine-720d3e84b0.tif

It is possible to use CF "Transform Rules" to rewrite the URL and the HTTP headers, which would allow friendlier (and vendor-agnostic) URLs that point to a densho.org subdomain. Background article here: https://www.backblaze.com/blog/free-image-hosting-with-cloudflare-transform-rules-and-backblaze-b2/

gjost commented 2 years ago

I read it through and feel like I need to read it again to make more sense of it. It feels like a lot of moving pieces just for file downloads, and it's dependent on lots of fiddly-looking BB and CF configs that we'd have to keep track of.

I guess if we go with this we should put pointers to documentation in comments in our Nginx configs and in our Ansible playbooks.

This use-case reminds me of the Nginx as object storage gateway article I saw recently. That requires the paid version of Nginx but I feel like there should be other ways. Something like that would keep more of it in our Nginx and Ansible configs where it's easier to keep track of.

Here's another option: When I was at JANM I wrote a Django URL shortener that they ended up not using. (This was a time when some vendor tried to sell them an expensive QR code service and I hacked together a similar thing in an hour.) A URL shortener would give users a nice URL when they right-clicked on the links, and we could print a "short URL" they could use for copying. Wouldn't help if they clicked on the image and opened it in their browser tho.

gjost commented 2 years ago

Note: We already have assets.densho.org and it looks like we can already do semi-nice URLs like this: https://f001.backblazeb2.com/file/densho-public/ddr-densho-266/ddr-densho-266-1-mezzanine-62a8d0876e-a.jpg --> https://assets.densho.org/file/densho-public/ddr-densho-266/ddr-densho-266-1-mezzanine-62a8d0876e-a.jpg

Following all the directions would get us this: https://assets.densho.org/ddr-densho-266/ddr-densho-266-1-mezzanine-62a8d0876e-a.jpg

Which is a bit shorter, but not by much.
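For reference, here is a rough Python sketch of the mapping between the two URL forms above. It only models the string transformation; the real rewrite would happen in the Cloudflare rules, not in our code. The function name `friendly_url` and the constants are hypothetical, made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions taken from the example URLs in this thread.
B2_PREFIX = "/file/densho-public/"   # path prefix of direct B2 bucket URLs
FRIENDLY_HOST = "assets.densho.org"  # densho.org subdomain fronting the bucket


def friendly_url(b2_url: str) -> str:
    """Map a direct B2 bucket URL to the shorter densho.org form.

    Hypothetical helper: drops the "/file/densho-public" prefix and
    swaps the hostname, leaving the rest of the path untouched.
    """
    parts = urlsplit(b2_url)
    if not parts.path.startswith(B2_PREFIX):
        raise ValueError(f"not a densho-public bucket URL: {b2_url}")
    # Keep a single leading slash in front of the remaining path.
    path = "/" + parts.path[len(B2_PREFIX):]
    return urlunsplit(("https", FRIENDLY_HOST, path, parts.query, parts.fragment))
```

Feeding it the B2 URL from the previous comment should give the shortest form shown above.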

gjost commented 2 years ago

Looks like there's a possible workaround for the Nginx storage gateway thing: https://tenzer.dk/nginx-with-dynamic-upstreams/

gjost commented 2 years ago

Not really mentioned in TFA above but not incurring egress charges is kinda the point so we do need to make the BB->CF thing work.

gjost commented 2 years ago

OK I've got many of the Cloudflare transform rules in place, except for the final cache control one, and the rest of the Densho web presence is not breaking. The test image is now available from: https://downloads.densho.org/ddr-densho-266/ddr-densho-266-9-mezzanine-818eb5aa3e-a.jpg

gjost commented 2 years ago

I think I've got the caching to work. Used DDR.storage.Backblaze to update the cache-control settings for the densho-public bucket.

Enabling this involves modifying the Ansible inventory to change the value of ddrpublic_backblaze_bucket_url=https://f001.backblazeb2.com/file/densho-public/ to ddrpublic_backblaze_bucket_url=https://downloads.densho.org/. This change was rolled out to ddrstage.densho.org.
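To illustrate why changing that one setting is enough, here is a minimal sketch, assuming the download link is just the bucket base URL plus the collection ID and filename. `download_url` is a hypothetical helper and not the actual ddr-public code; only the two base URL values come from this thread.

```python
def download_url(bucket_url: str, collection_id: str, filename: str) -> str:
    """Join the configured bucket base URL with a collection ID and filename.

    Hypothetical helper; bucket_url stands in for the value of
    ddrpublic_backblaze_bucket_url from the Ansible inventory.
    """
    # Tolerate a base URL with or without a trailing slash.
    return f"{bucket_url.rstrip('/')}/{collection_id}/{filename}"


OLD_BASE = "https://f001.backblazeb2.com/file/densho-public/"
NEW_BASE = "https://downloads.densho.org/"

filename = "ddr-densho-266-9-mezzanine-818eb5aa3e-a.jpg"
old_link = download_url(OLD_BASE, "ddr-densho-266", filename)
new_link = download_url(NEW_BASE, "ddr-densho-266", filename)
```

Under that assumption, `new_link` matches the test image URL from the earlier comment, and nothing else in the application would need to know which host serves the files.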

I've added documentation of the changes to ansible-colo/proxy.yml with a pointer in ansible-colo/templates/proxy/ddrpublic.conf.j2.

gjost commented 2 years ago

Looks like this got reverted somehow.

Example: download links on https://ddr.densho.org/ddr-densho-266-9/

gjost commented 1 year ago

In ansible-colo commit 615d5ef I changed the config ddrpublic_backblaze_bucket_url to https://downloads.densho.org/. This worked, but it looks like it was always set to the Backblaze URL and I couldn't find any commits where it was ever downloads.densho.org. It was working before so... ¯_(ツ)_/¯ ?