Open strogonoff opened 6 years ago
Thanks @strogonoff! The code currently converts "/XXX" to "/XXX/index.html", but in our static Jekyll sites, there are two possibilities:
I wonder what's the best way to do so?
cc: @ribose-jeffreylau
Actually it's easy to do so with the following modified code:
/XXX becomes /XXX.html
/XXX/ becomes /XXX/index.html
'use strict';

// True when the last path segment looks like a file name (e.g. /css/site.css)
const pointsToFile = uri => /\/[^/]+\.[^/]+$/.test(uri);
const hasTrailingSlash = uri => uri.endsWith('/');

exports.handler = (event, context, callback) => {
  // Extract the request from the CloudFront event that is sent to Lambda@Edge
  const request = event.Records[0].cf.request;

  // Extract the URI from the request
  const olduri = request.uri;

  // Requests that already point to a file pass through unchanged
  if (pointsToFile(olduri)) {
    callback(null, request);
    return;
  }

  if (!hasTrailingSlash(olduri)) {
    // Append ".html" extension
    request.uri = olduri + ".html";
  } else {
    // Append "index.html"
    request.uri = olduri + "index.html";
  }

  // Return to CloudFront
  return callback(null, request);
};
@ronaldtse
You’re right, overall this could indeed cause issues with some Jekyll sites, although it didn’t in mine, which didn’t use collections. Tangentially, I found that hooking into Jekyll’s Ruby plugin architecture and generating pages/paths from custom YAML structure as needed provides the required flexibility, while collections are limiting and only suitable for blog-like sites.
In the end it might not make sense to design a one-size-fits-all function; instead, we could leverage Terraform’s architecture to supply the simplest suitable function for each specific site (e.g., Ribose Open might end up using one, and the static site another, if any). I’ll test one for Ribose Open specifically.
The code currently converts "/XXX" with "/XXX/index.html",
It’s a technicality, but the code will not straight up convert /XXX to /XXX/index.html. The code is supposed to treat /XXX as a path that is missing a trailing slash, and therefore redirect the user from /XXX to /XXX/. This ensures each canonical URL is the one with the slash, so that third parties won’t get a 404 if they forget the slash, and search engines don’t get confused by the same content being available both with and without it. The subsequent request to /XXX/, though, is supposed to get rewritten to /XXX/index.html when CF queries the S3 origin.
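The redirect-then-rewrite behaviour described above could be sketched roughly as follows. This is a minimal illustration rather than the actual function from this issue: the helper name `planForUri` is made up for the example, and it assumes the function runs on a CloudFront trigger that is allowed to generate responses (such as Origin Request).

```javascript
'use strict';

// Matches paths whose last segment looks like a file name, e.g. /css/site.css
const pointsToFile = uri => /\/[^/]+\.[^/]+$/.test(uri);

// Hypothetical helper: decide what to do with a request URI.
// Returns { redirectTo } for a path missing its trailing slash,
// { rewriteTo } for a directory-style path, or {} for a file path.
const planForUri = uri => {
  if (pointsToFile(uri)) return {};
  if (!uri.endsWith('/')) return { redirectTo: uri + '/' };
  return { rewriteTo: uri + 'index.html' };
};

const handler = (event, context, callback) => {
  const request = event.Records[0].cf.request;
  const plan = planForUri(request.uri);

  if (plan.redirectTo) {
    // Keep the query string on the redirect target
    const qs = request.querystring ? '?' + request.querystring : '';
    return callback(null, {
      status: '301',
      statusDescription: 'Moved Permanently',
      headers: {
        location: [{ key: 'Location', value: plan.redirectTo + qs }],
      },
    });
  }

  // Directory-style paths are rewritten for the origin; files pass through
  if (plan.rewriteTo) request.uri = plan.rewriteTo;
  return callback(null, request);
};

exports.handler = handler;
```

With this shape, a visitor asking for /XXX receives a 301 to /XXX/, and only the follow-up request for /XXX/ reaches the origin as /XXX/index.html.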
Indeed, it would be ideal to do it as you described. Hooking in with a Jekyll plugin could work.
The point is we need to be consistent in naming foo/index.html because the Jekyll site structure by default uses foo.html, which is ambiguous for foo vs foo/. If we can say for certainty that everything else is foo/index.html then it is easy. Maybe that is something you can enforce.
In fact, if Lambda@CF is able to query S3 to see whether foo.html or foo/index.html exists, the function can point it to the correct path.
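As a rough sketch of that idea, the lookup could be isolated in a small function that takes the existence check as a parameter. Both `resolveUri` and the `exists` callback are hypothetical names for this example; in a real function `exists` would have to wrap an S3 existence check for the given key, which is not shown here and would add a lookup to each request.

```javascript
'use strict';

// `exists` is a hypothetical async callback (key => Promise<boolean>);
// in a real function it would wrap an S3 existence check for that key.
const resolveUri = async (uri, exists) => {
  // Directory-style URIs always map to their index page
  if (uri.endsWith('/')) return uri + 'index.html';

  // URIs that already name a file are left alone
  if (/\/[^/]+\.[^/]+$/.test(uri)) return uri;

  // Try foo.html first, then foo/index.html
  for (const candidate of [uri + '.html', uri + '/index.html']) {
    // S3 object keys have no leading slash
    if (await exists(candidate.slice(1))) return candidate;
  }

  // Neither exists: pass the URI through and let the origin answer
  return uri;
};
```

Passing the check in as a parameter keeps the path logic testable without AWS credentials: a stub backed by a plain `Set` of keys is enough to exercise it locally.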
To clarify, the reason Lambda is well-suited for this is that you probably don’t want to tie site generation logic to any particular hosting. Might be better to have any adapters required by AWS within AWS itself on same abstraction level, if that’s possible.
By the way this doesn’t seem to be an urgent problem (unless I’m mistaken) so I put this item on hold for now. If anyone’s willing feel free to implement this.
What I would do when I have the time is test this setup within a concrete, full Terraform project for a vanilla Jekyll+S3/CF site, and also check whether the configuration is complex enough to warrant a module. (Terraform best practices discourage splitting logic across reusable modules where the stack is simple. If this is only to enable collaboration without sharing credentials, I suspect there may be better ways of doing that than moving everything to modules.) Then I’d iterate on the Lambda code and periodically re-provision everything from scratch to ensure it all works properly.
@eugenetaranov would you have time to integrate this? Thanks!
@ronaldtse I have a Bootstrap HTML website and I need to remove the .html file extensions from my website URLs. I used your code above in Lambda@Edge, but it is not working live. Is there anything to keep in mind while implementing your Lambda code, or is there an update to it for Node 10.x? Please help!
@strogonoff Thank you for the example!
Btw, that tinyendian link seems to be broken and is pointing to various spam/malicious websites. Consider updating the original issue description to remove that link to keep people safe :)
@mtoorop-ximedes Thanks for noticing. The link has been updated. Cheers!
Rewriting URIs at CloudFront’s request from origin could ensure that:
Function
Setup
The function can use the Node.js 8.10 runtime and needs to be connected (specifying the exact version) as the Origin Request Lambda function in the CF origin settings.
Resources