Redirecting with x-amz-website-redirect-location

lackac commented 11 years ago

I'm in the process of migrating a PHP site to Amazon S3 with Jekyll and this gem. I'm looking into defining redirection rules from the old endpoints to their new locations. I know about routing_rules, but I don't want to list every file in my config file. I wonder if it would be possible to support the x-amz-website-redirect-location header for permanent object redirects. See the AWS Documentation for more details.

I see several possible approaches:

1. List redirecting endpoints in the target endpoint's front matter

Example:


---
layout: default
title: The New Beautiful Site
redirect_from:
  - /index.php

---

This has the benefit of being easy on the user. However, with this approach it's not possible to define redirects for images and other objects that have no front matter.
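A minimal Ruby sketch of how the first approach could collect redirects. The helper name `redirects_from_front_matter` and the returned shape (old object key mapped to redirect location) are assumptions for illustration, not part of jekyll-s3 or Jekyll:

```ruby
require "yaml"

# Hypothetical helper: read the `redirect_from` list out of one page's
# front matter and map each old key to the page's new location.
def redirects_from_front_matter(source, target_path)
  match = source.match(/\A---\s*\n(.*?)\n---\s*$/m)
  return {} unless match

  front_matter = YAML.safe_load(match[1])
  return {} unless front_matter.is_a?(Hash)

  # S3 object keys carry no leading slash; the redirect location does.
  Array(front_matter["redirect_from"]).to_h { |old_path|
    [old_path.sub(%r{\A/}, ""), "/#{target_path.sub(%r{\A/}, "")}"]
  }
end
```

Calling it with the example page above and `"index.html"` as the target would yield one entry for the key `index.php`. Files without front matter return an empty hash, which is exactly the limitation described above.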

2. Add support for defining arbitrary headers for any objects

These headers could be listed in *.headers files. Jekyll would copy these to the _site folder, and jekyll-s3 could use them to PUT the files to S3 with these custom headers. If a file.ext.headers file exists without a corresponding file.ext file, jekyll-s3 would upload a zero-byte object with the headers.

Example file.ext.headers file:

x-amz-website-redirect-location: /about.html

This approach has the benefit of being quite general. It could also be used for other things, such as custom encoding or ACLs. However, if you only want to use it for redirects, you might end up with a lot of single-line dummy .headers files lingering in your source.
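As a rough sketch of the second approach, the following Ruby parses a `.headers` file and decides which keys would need a zero-byte object. Both function names and the "one `Name: value` pair per line" format are assumptions drawn from the example above, not an existing jekyll-s3 feature:

```ruby
# Hypothetical parser for the proposed `file.ext.headers` format:
# one "Name: value" pair per line, blank lines ignored.
def parse_headers_file(text)
  text.each_line.with_object({}) do |line, headers|
    line = line.strip
    next if line.empty?

    # Split on the first colon only, so values such as URLs stay intact.
    name, value = line.split(":", 2)
    raise ArgumentError, "malformed header line: #{line.inspect}" if value.nil?

    headers[name.strip.downcase] = value.strip
  end
end

# Hypothetical planner: keys backed only by a `.headers` companion are
# marked for upload as zero-byte objects.
def upload_plan(site_paths)
  keys = site_paths.map { |path| path.delete_suffix(".headers") }.uniq
  keys.to_h { |key| [key, site_paths.include?(key) ? :file : :zero_byte] }
end
```

For example, a `_site` folder holding `about.html`, `about.html.headers` and `index.php.headers` would plan a regular upload for `about.html` and a zero-byte object for `index.php`.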

3. Simpler syntax for defining 301 redirects in `_jekyll_s3.yml`

If you only want to list a bunch of 301 redirects in the config file, the routing_rules based approach can become quite cumbersome. You would potentially need to add many entries like this one:

  - condition:
      key_prefix_equals: index.php
    redirect:
      replace_key_with: index.html
      http_redirect_code: 301

jekyll-s3 could offer a simpler syntax for plain redirects (where the full key must match and a 301 redirect is desired). Consider this instead:

redirects:
  index.php: index.html
  about.php: about.html
  pages/faqs.php: faq.html

jekyll-s3 would create zero-byte objects with the correct x-amz-website-redirect-location headers for these paths.
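The expansion of the proposed `redirects` map could look roughly like this in Ruby. The function `redirect_objects` is hypothetical. The hash keys mirror the `put_object` parameter names of the current AWS SDK for Ruby (`key`, `body`, `website_redirect_location`); whether jekyll-s3 would use that call is an assumption. The sketch also prefixes relative targets with `/`, since the AWS documentation describes the header value as starting with `/`, `http://` or `https://`:

```ruby
require "yaml"

# Hypothetical expansion of the `redirects` config map into one
# zero-byte upload description per old key.
def redirect_objects(redirects)
  redirects.map do |old_key, target|
    absolute = target.start_with?("/", "http://", "https://")
    {
      key: old_key,
      body: "", # zero-byte object
      website_redirect_location: absolute ? target : "/#{target}"
    }
  end
end

config = YAML.safe_load(<<~YML)
  redirects:
    index.php: index.html
    about.php: about.html
    pages/faqs.php: faq.html
YML

objects = redirect_objects(config.fetch("redirects"))
objects.each { |object| puts "#{object[:key]} -> #{object[:website_redirect_location]}" }
```

Each resulting hash could then be merged with a bucket name and handed to whatever upload call the gem uses.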

For my use case the first or third approach would be the best. What do you think?

I haven't looked into the source of jekyll-s3 yet, so I have no idea which one of these would be the easiest to implement. I'm happy to help with the implementation though.

laurilehmijoki commented 11 years ago

All three approaches have their own benefits compared to the others. In essence, they are just different ways of expressing the desire to create an S3 object with a metadata entry.

An idea regarding the second approach

Regarding your second approach, what do you think of expressing the same thing in the _jekyll_s3.yml config file instead of a file.ext.headers file? We could have a top-level key s3_object_metadata in the config file, and there we could define the object keys and metadata for each key. For example:

s3_object_metadata:
  index.php:
    x-amz-website-redirect-location: index.html
    x-amz-meta-My-Custom-Metadata: quux
  about.php:
    x-amz-website-redirect-location: about.html

I think it's nice to have the "real" data (i.e., the contents of your website) in the Jekyll source files and all the configuration in a dedicated file, _jekyll_s3.yml.
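A sketch, in Ruby, of how a `s3_object_metadata` section might be turned into upload parameters. The function `put_params_for` is hypothetical. It assumes the AWS SDK for Ruby convention that user metadata is passed as a `metadata` hash without the `x-amz-meta-` prefix and that the redirect goes in `website_redirect_location`. The redirect targets here use a leading `/`, which is an adaptation of the example above:

```ruby
require "yaml"

# Hypothetical translation of one `s3_object_metadata` entry into
# upload parameters. Only the two header kinds from the example are handled.
def put_params_for(key, headers)
  meta_prefix = "x-amz-meta-"
  params = { key: key, metadata: {} }

  headers.each do |name, value|
    lowered = name.downcase
    if lowered == "x-amz-website-redirect-location"
      params[:website_redirect_location] = value.to_s
    elsif lowered.start_with?(meta_prefix)
      # Strip the prefix; the SDK adds it back when sending the request.
      params[:metadata][name[meta_prefix.length..-1]] = value.to_s
    else
      raise ArgumentError, "unsupported header in this sketch: #{name}"
    end
  end

  params
end

config = YAML.safe_load(<<~YML)
  s3_object_metadata:
    index.php:
      x-amz-website-redirect-location: /index.html
      x-amz-meta-My-Custom-Metadata: quux
    about.php:
      x-amz-website-redirect-location: /about.html
YML

all_params = config.fetch("s3_object_metadata").map { |key, headers| put_params_for(key, headers) }
```

Raising on unknown headers is a deliberate choice for the sketch: it keeps typos in the config file from being silently ignored.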

The first versus the third approach

I prefer the third approach to the first one, because in the third one we have the deployment configuration separated from the content. Deployment data are clearly distinct from page contents data, and thus, in my humble opinion, keeping them separate on the file system level brings clarity. You can, however, look at this from another point of view and end up with a different conclusion.

What next?

If the first and third approaches suit you best, I'd start with the third approach, for the reason stated above.

I suggest that we create an API in this gem that allows one to set arbitrary metadata on an S3 object that may or may not exist. Once we have that API, we can implement all three approaches on top of it, because they are just different ways of defining object metadata.
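To illustrate the idea that every approach reduces to object metadata, here is a small Ruby sketch in which each source is normalized to a `{ object_key => { header => value } }` hash and then merged. Both function names and the intermediate shape are assumptions, not the API that was eventually built:

```ruby
# Hypothetical normalizer for the short `redirects` syntax: each target
# becomes a one-entry header hash under the old key.
def redirects_as_metadata(redirects)
  redirects.transform_values do |target|
    { "x-amz-website-redirect-location" => target }
  end
end

# Hypothetical merge step: combine metadata from any number of sources.
# Later sources override earlier ones for the same key and header.
def merge_object_metadata(*sources)
  sources.each_with_object({}) do |source, merged|
    source.each do |key, headers|
      (merged[key] ||= {}).merge!(headers)
    end
  end
end

from_redirects = redirects_as_metadata({ "index.php" => "/index.html" })
explicit = { "index.php" => { "x-amz-meta-My-Custom-Metadata" => "quux" } }
merged = merge_object_metadata(from_redirects, explicit)
```

With a single merged structure, the upload code only needs to understand one shape regardless of which of the three syntaxes produced it.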

I might not have the time to help you with the implementation, but if I do, I will post a message in this issue. We could also split the work so that you provide the implementation while I write the Cucumber examples. But again, the most certain way to get this feature into a released gem is to provide both the implementation and the tests (RSpec + Cucumber). What do you think?

lackac commented 11 years ago

Thanks for the detailed reply. I like your proposal about s3_object_metadata. It solves the pain point of the second approach.

So here's the plan then:

1. I'll start by implementing the API for setting arbitrary metadata.
2. I'll move on to implementing the solution with s3_object_metadata.
3. After that, I'll consider implementing the third approach as well, for those of us who only want to use this for redirects.
