Closed GeoffFroh closed 2 years ago
I read it through and feel like I need to read it again to make more sense of it. It feels like a lot of moving pieces just for file downloads, and it's dependent on lots of fiddly-looking BB and CF configs that we'd have to keep track of.
I guess if we go with this we should put pointers to documentation in comments in our Nginx configs and in our Ansible playbooks.
This use-case reminds me of the Nginx as object storage gateway article I saw recently. That requires the paid version of Nginx but I feel like there should be other ways. Something like that would keep more of it in our Nginx and Ansible configs where it's easier to keep track of.
Here's another option: When I was at JANM I wrote a Django URL shortener that they ended up not using. (This was a time when some vendor tried to sell them an expensive QR code service, and I hacked together a similar thing in an hour.) A URL shortener would give users a nice URL when they right-clicked on the links, and we could print a "short URL" they could use for copying. Wouldn't help if they clicked on the image and opened it in their browser tho.
Note: We already have assets.densho.org and it looks like we can already do semi-nice URLs like this:
https://f001.backblazeb2.com/file/densho-public/ddr-densho-266/ddr-densho-266-1-mezzanine-62a8d0876e-a.jpg
--> https://assets.densho.org/file/densho-public/ddr-densho-266/ddr-densho-266-1-mezzanine-62a8d0876e-a.jpg
Following all the directions would get us this: https://assets.densho.org/ddr-densho-266/ddr-densho-266-1-mezzanine-62a8d0876e-a.jpg
Which is a bit shorter but not that much
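For reference, a rough Python sketch of how the three URL forms above relate to each other. The function names are made up for illustration, and the actual rewriting would happen in Cloudflare rather than in our code; this just spells out the mapping.

```python
from urllib.parse import urlsplit, urlunsplit

# Path prefix that Backblaze puts in front of every object in the bucket.
BUCKET_PREFIX = "/file/densho-public"


def to_friendly_url(b2_url, host="assets.densho.org", strip_prefix=False):
    """Swap the Backblaze host for ours, optionally dropping the bucket prefix."""
    parts = urlsplit(b2_url)
    path = parts.path
    if strip_prefix and path.startswith(BUCKET_PREFIX + "/"):
        path = path[len(BUCKET_PREFIX):]
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


def to_origin_path(friendly_path):
    """What a rewrite rule has to do: put the bucket prefix back for the origin."""
    return BUCKET_PREFIX + friendly_path
```

With `strip_prefix=False` this gives the "semi-nice" form we already have; with `strip_prefix=True` it gives the shorter form, which only works if something in front of B2 adds the prefix back (`to_origin_path`).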
Looks like there's a possible workaround for the Nginx storage gateway thing: https://tenzer.dk/nginx-with-dynamic-upstreams/
Not really mentioned in TFA above but not incurring egress charges is kinda the point so we do need to make the BB->CF thing work.
OK I've got many of the Cloudflare transform rules in place, except for the final cache control one, and the rest of the Densho web presence is not breaking. The test image is now available from: https://downloads.densho.org/ddr-densho-266/ddr-densho-266-9-mezzanine-818eb5aa3e-a.jpg
I think I've got the caching to work. Used DDR.storage.Backblaze to update the cache-control settings for the densho-public bucket.
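This is not the actual DDR.storage.Backblaze code, just a sketch of the general idea: B2 accepts a bucket-wide default Cache-Control via a `cache-control` key in the bucket info, set through the `b2_update_bucket` API call. The helper name, the placeholder IDs, and the max-age value below are all assumptions for illustration.

```python
import json


def build_update_bucket_body(account_id, bucket_id, max_age_seconds):
    """Build a JSON body for b2_update_bucket that sets a bucket-wide Cache-Control."""
    if max_age_seconds < 0:
        raise ValueError("max_age_seconds must be non-negative")
    return {
        "accountId": account_id,
        "bucketId": bucket_id,
        # As I understand it, bucketInfo replaces the existing info rather than
        # merging into it, so any keys already set would need to be included here.
        "bucketInfo": {"cache-control": f"public, max-age={max_age_seconds}"},
    }


# Placeholder IDs; 86400 seconds (one day) is an arbitrary example value.
body = build_update_bucket_body("ACCOUNT_ID", "BUCKET_ID", 86400)
print(json.dumps(body, indent=2))
```

This only builds the request body; sending it still needs an authorized B2 API session.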
Enabling this involves modifying the Ansible inventory to change
ddrpublic_backblaze_bucket_url=https://f001.backblazeb2.com/file/densho-public/
to
ddrpublic_backblaze_bucket_url=https://downloads.densho.org/
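To make the effect of that setting concrete, here is a minimal sketch of how a download link could be built from it. The `download_url` helper is hypothetical (I have not checked how ddr-public actually assembles its links); the point is only that the object path stays the same and the base URL is the single thing that changes.

```python
from urllib.parse import urljoin


def download_url(bucket_url, collection_id, filename):
    """Join the configured bucket URL with an object's path inside the bucket."""
    # urljoin drops the last path segment of a base without a trailing slash,
    # so make sure the base ends in "/".
    base = bucket_url if bucket_url.endswith("/") else bucket_url + "/"
    return urljoin(base, f"{collection_id}/{filename}")


old = download_url(
    "https://f001.backblazeb2.com/file/densho-public/",
    "ddr-densho-266",
    "ddr-densho-266-9-mezzanine-818eb5aa3e-a.jpg",
)
new = download_url(
    "https://downloads.densho.org/",
    "ddr-densho-266",
    "ddr-densho-266-9-mezzanine-818eb5aa3e-a.jpg",
)
print(old)
print(new)
```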
This change was rolled out to ddrstage.densho.org.
I've added documentation of the changes to ansible-colo/proxy.yml with a pointer in ansible-colo/templates/proxy/ddrpublic.conf.j2.
Looks like this got reverted somehow.
Example: download links on https://ddr.densho.org/ddr-densho-266-9/
In ansible-colo, commit 615d5ef changed config ddrpublic_backblaze_bucket_url to https://downloads.densho.org/ . This worked, but it looks like it was always set to the Backblaze URL and I couldn't find commits where it was ever downloads.densho.org. It was working before so... ¯\_(ツ)_/¯ ?
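Since this regressed without anyone noticing, a small check along these lines might help catch it next time. This is only a sketch: it scans a page's HTML for links that still point at a Backblaze host, and the class and function names are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_on_host(html, domain="backblazeb2.com"):
    """Return the links whose host is `domain` or one of its subdomains."""
    parser = LinkCollector()
    parser.feed(html)
    hits = []
    for href in parser.hrefs:
        host = urlsplit(href).hostname or ""
        if host == domain or host.endswith("." + domain):
            hits.append(href)
    return hits
```

Feeding it the HTML of an object page (for example https://ddr.densho.org/ddr-densho-266-9/) should return an empty list when the download links go through downloads.densho.org.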
Full-size image downloads are served out of B2, through the CF CDN edge service. Currently, the URLs point directly at the B2 bucket. E.g.,
https://f001.backblazeb2.com/file/densho-public/ddr-hmwf-1/ddr-hmwf-1-15-mezzanine-720d3e84b0.tif
It is possible to use CF "Transform Rules" to modify the URL and the HTTP headers, and allow for friendlier (and vendor-agnostic) URLs that point to a densho.org subdomain. Background article here: https://www.backblaze.com/blog/free-image-hosting-with-cloudflare-transform-rules-and-backblaze-b2/
(downloads.densho.org)
ddr-public codebase