Change to some URL format that instead ends in "image.jpg".
The Challenge
S3 stores the derived images under a path like /2001/12-31/image.jpg/VERSION/jpeg/1024. When an image is deleted, the system cleans up by deleting everything under the path /2001/12-31/image.jpg.
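A rough sketch of why this key layout matters for cleanup: every derivative of one image shares the prefix "<image path>/", so deletion is a prefix match. The helper names and the in-memory key list are hypothetical; a real cleanup would list and delete objects through S3 rather than filter an array.

```typescript
// Hypothetical helper: builds the prefix shared by all derivatives of one image.
function derivedPrefix(imagePath: string): string {
  // The trailing slash keeps "/2001/12-31/image.jpg" from also matching a
  // sibling such as "/2001/12-31/image.jpg.bak/...".
  return imagePath.replace(/\/+$/, "") + "/";
}

// Hypothetical helper: picks the keys that a delete-by-path would remove.
function keysToDelete(imagePath: string, allKeys: string[]): string[] {
  const prefix = derivedPrefix(imagePath);
  return allKeys.filter((key) => key.indexOf(prefix) === 0);
}

const exampleKeys = [
  "/2001/12-31/image.jpg/VERSION/jpeg/1024",
  "/2001/12-31/image.jpg/VERSION/jpeg/2048",
  "/2001/12-31/other.jpg/VERSION/jpeg/1024",
];

// Selects the first two keys and leaves other.jpg alone.
const doomed = keysToDelete("/2001/12-31/image.jpg", exampleKeys);
```

A URL that puts the processing directives first would not share a prefix per image, which is why the rewrite has to happen in front of S3 instead of changing the stored layout.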
To keep deleting by path, I need to keep the S3 file structure and transform the path somewhere between the CDN and S3. The most appropriate technology is CloudFront Functions, which are designed to do exactly this sort of thing. Do not use Lambda@Edge, which has much higher latency and is overkill for this situation.
My concern: I'm unsure how much latency a CloudFront Function would add to the request. I don't believe the "sub-millisecond" claim AWS makes, but I also don't believe the hundreds of milliseconds I've seen reported in another blog.
I guess the thing to do is measure current performance, add a CloudFront Function, and measure again. The CloudFront Function doesn't even have to do the rewrite; I just need some code to execute.
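A minimal sketch of the comparison step, assuming the latency samples (in milliseconds) have already been collected for the same URL before and after attaching the function. How they are collected is left open, and the function names here are hypothetical. Comparing percentiles over many requests is more trustworthy than comparing single requests, since network jitter is likely larger than the effect being measured.

```typescript
// Nearest-rank percentile of a list of latency samples in milliseconds.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) {
    throw new Error("no samples");
  }
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[index];
}

// Difference between the two runs at the median and at the tail.
function addedLatency(
  baselineMs: number[],
  withFunctionMs: number[]
): { p50: number; p95: number } {
  return {
    p50: percentile(withFunctionMs, 50) - percentile(baselineMs, 50),
    p95: percentile(withFunctionMs, 95) - percentile(baselineMs, 95),
  };
}
```

Both runs should hit the same edge location and the same cache state (all hits or all misses), otherwise the difference mostly measures caching rather than the function.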
URL Format Options
I'm going with Option 2 because its cons turned out to be invalid. It's just the better option.
Option 1
/i/jpeg/1024/VERSION/2001/12-31/image.jpg
Pros:
Does not use a query string, which is a bit of an anti-pattern with CDNs
Shorter URLs than Option 2
It's easy to write a regular expression that picks the /VERSION/2001/12-31/image.jpg off the end of the path, because that's a very well-known structure.
It keeps the processing directives (jpeg/1024) together, so I don't have to parse them at all. That's good, because they're hard to parse, with an indeterminate number of slashes and commands. If they're together like this, I just have to move them as a block to the end of the URL.
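The Option 1 rewrite could be sketched like this. It assumes VERSION is a single path segment and the original image path always looks like YYYY/MM-DD/filename; neither is spelled out above, and the names are hypothetical. CloudFront Functions run JavaScript, so this TypeScript would have to be compiled down before deploying.

```typescript
// Assumptions: VERSION is one path segment, and the original image path is
// always YYYY/MM-DD/filename. Everything between "/i/" and the version is
// treated as an opaque block of processing directives.
const OPTION_1_URL = /^\/i\/(.+)\/([^\/]+)\/(\d{4}\/\d{2}-\d{2}\/[^\/]+)$/;

// Maps an Option 1 URL path to the S3 key layout, or returns null when the
// path does not have the expected shape.
function rewriteOption1(uri: string): string | null {
  const match = OPTION_1_URL.exec(uri);
  if (match === null) {
    return null;
  }
  const directives = match[1]; // e.g. "jpeg/1024", moved as one block
  const version = match[2];
  const imagePath = match[3]; // e.g. "2001/12-31/image.jpg"
  return "/i/" + imagePath + "/" + version + "/" + directives;
}

// "/i/2001/12-31/image.jpg/VERSION/jpeg/1024"
const rewritten = rewriteOption1("/i/jpeg/1024/VERSION/2001/12-31/image.jpg");
```

One limitation of this sketch: because the directives are never parsed, a URL with the version left out still matches, with the last directive read as the version. Catching that would need a stricter pattern for VERSION.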
Option 2
Pros:
More human-readable than Option 1; puts parameter-y things in query string parameters, like you'd expect
Feels more like an API than Option 1
Cons:
CDN will need to cache with query string, which is a bit of an anti-pattern
NOT TRUE: I rewrite the query string to a path BEFORE the caching logic
The order of the query string parameters must be consistent or else it will be cached multiple times
NOT TRUE: I rewrite the query string to a path BEFORE the caching logic
The CloudFront Function would have to understand the processing directive structure, making one more system that has to change if the processing directives change.
NOT A REASON: the processing directives are super simple
Longer URLs than Option 1
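To illustrate why parameter order stops mattering once the query string is rewritten to a path, here is a small sketch. The parameter names version, format and size are hypothetical, and so is the rule that all three are required. The QueryString shape mirrors how a CloudFront Function sees event.request.querystring (name mapped to an object with a value).

```typescript
// Minimal shape of event.request.querystring in a CloudFront Function.
interface QueryString {
  [name: string]: { value: string };
}

// Builds the S3-style path from the image URI and its parameters.
// Parameter names are hypothetical. Returns null if any are missing.
function toOriginPath(uri: string, query: QueryString): string | null {
  const version = query["version"];
  const format = query["format"];
  const size = query["size"];
  if (!version || !format || !size) {
    return null;
  }
  // The output order is fixed here, so the order in the URL is irrelevant.
  return uri + "/" + version.value + "/" + format.value + "/" + size.value;
}

// Same parameters in two different orders.
const orderA: QueryString = {
  version: { value: "VERSION" },
  format: { value: "jpeg" },
  size: { value: "1024" },
};
const orderB: QueryString = {
  size: { value: "1024" },
  format: { value: "jpeg" },
  version: { value: "VERSION" },
};

// Both produce "/i/2001/12-31/image.jpg/VERSION/jpeg/1024".
const pathA = toOriginPath("/i/2001/12-31/image.jpg", orderA);
const pathB = toOriginPath("/i/2001/12-31/image.jpg", orderB);
```

In the real function the query string would also be emptied after the rewrite, so that only the path identifies the object.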
Error handling
Return 404 if there is no VERSION or no size, without going to S3. This is nice because, currently, a request with no version makes S3 try a lookup at the root of the bucket, and I don't want that.
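A sketch of the early 404, assuming a viewer-request CloudFront Function and the hypothetical query parameter names version and size. The interfaces are my own minimal declarations of the event fields used here, not an official type package. Returning a response object instead of the request is what stops the request from reaching S3. The real function would be plain JavaScript, so this would need compiling down.

```typescript
// Minimal declarations of the parts of the CloudFront Functions event used here.
interface CfRequest {
  method: string;
  uri: string;
  querystring: { [name: string]: { value: string } };
  headers: { [name: string]: { value: string } };
}

interface CfResponse {
  statusCode: number;
  statusDescription: string;
}

function handler(event: { request: CfRequest }): CfRequest | CfResponse {
  const request = event.request;
  const version = request.querystring["version"]; // hypothetical name
  const size = request.querystring["size"]; // hypothetical name
  const incomplete =
    !version || version.value === "" || !size || size.value === "";
  if (incomplete) {
    // Answer the viewer directly; the request never reaches the origin.
    return { statusCode: 404, statusDescription: "Not Found" };
  }
  // The path rewrite would go here; this sketch passes the request through.
  return request;
}
```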
Should I change the S3 bucket path?
Probably not
Using the CloudFront Function, I could remove the initial /i/ and store the images at the root of the S3 bucket like /2001/12-31/image.jpg/VERSION/...
However, I should leave the /i/: it allows me to switch between the old and new implementations, or even change CDN providers at some point without regenerating any derived images.
When you save a derived image from the browser, it saves with the filename "1024".
The Issue
This happens because the URL ends in "1024".
The Solution
Change to some URL format that instead ends in "image.jpg".