crystal-lang / distribution-scripts

40 stars 23 forks source link

Return 404 HTTP status on not-found docs pages #317

Open matiasgarciaisaia opened 2 weeks ago

matiasgarciaisaia commented 2 weeks ago

Due to the redirect rules we have in place to default API Docs to the latest released version (see #312 for some context), we never send a 404 HTTP status to clients when they request non existent pages in the /api/ path.

$ curl -IL https://crystal-lang.org/api/not-exists
HTTP/2 301
content-length: 0
location: https://crystal-lang.org/api/latest/not-exists
date: Fri, 14 Jun 2024 19:27:22 GMT
server: AmazonS3
x-cache: Miss from cloudfront
via: 1.1 edf0aa2c164e893ed1b6833d81b79846.cloudfront.net (CloudFront)
x-amz-cf-pop: EZE50-P2
x-amz-cf-id: X7yAEV6jg6n6DgKMI4s-rCG-hk_4Le4iKMmjReeGn-eTCfe1iMO0MQ==

HTTP/2 302
content-length: 0
location: https://crystal-lang.org/api/1.12.2/not-exists
date: Fri, 14 Jun 2024 19:27:22 GMT
server: AmazonS3
x-cache: Miss from cloudfront
via: 1.1 edf0aa2c164e893ed1b6833d81b79846.cloudfront.net (CloudFront)
x-amz-cf-pop: EZE50-P2
x-amz-cf-id: PoqQhBeoJmfLQh4LQTE58j9SxrfHO__2s0VVtboiSDGN0EmVO-muLw==

HTTP/2 302
content-length: 0
location: https://crystal-lang.org/api/1.12.2/404.html
date: Fri, 14 Jun 2024 19:27:22 GMT
server: AmazonS3
x-cache: Miss from cloudfront
via: 1.1 edf0aa2c164e893ed1b6833d81b79846.cloudfront.net (CloudFront)
x-amz-cf-pop: EZE50-P2
x-amz-cf-id: BntkiMMZMqbjcPRzveIKCxg7rvWCel1PTc6kXPr7nCHgkinc9p4B1w==

HTTP/2 200
content-type: text/html
content-length: 110663
date: Fri, 14 Jun 2024 19:27:23 GMT
last-modified: Fri, 31 May 2024 10:13:48 GMT
etag: "be683991dfe3b73ca6a99cff52c98f21"
server: AmazonS3
vary: Accept-Encoding
x-cache: Miss from cloudfront
via: 1.1 edf0aa2c164e893ed1b6833d81b79846.cloudfront.net (CloudFront)
x-amz-cf-pop: EZE50-P2
x-amz-cf-id: jusQUDz88aAWrPNKps0eyibSzu0L4PaO211-D2ph5X23fSIGb4u_-A==

We should improve the routes handling so we eventually respond with 404 Not found HTTP Status (even if that's after some redirects).

matiasgarciaisaia commented 2 weeks ago

I don't think this is solvable with the S3 bucket redirection rules - we would need to be able to say that the rules we have don't apply to /api/1 and /api/0 paths, so S3 serves the error document (which is set to be the /api/1.12.1/404.html page).

I'll check if this is doable with a lambda function - and if its cost fits within our AWS budget.

matiasgarciaisaia commented 2 weeks ago

I think we can replace the patchwork of S3 rules for only one rule redirecting /api/latest -> /api/${current_version} if we add this javascript snippet to the 404.html page:

<script type="text/javascript">
  if(!document.location.pathname.match(/^\/api\/[0-9]/)) {
    const new_path = document.location.pathname.replace(/^\/api\//, "/api/latest/")
    document.location.pathname = new_path
  }
</script>

That way, visiting /api/String.html in a browser will redirect users to /api/latest/String.html, which gives a 302 Temporary redirect to /api/${current_version}/String.html - a 200 OK. But visiting /api/not-exists ends up at /api/${current_version}/not-exists, which returns a 404 Not Found.

This only works on browsers with JavaScript enabled. Other user agents will miss the /api/String.html -> /api/latest/String.html redirection.

What do you think?

straight-shoota commented 2 weeks ago

I'm strongly opposed to use client-side JavaScript to overpaint technical limitations of the web server. Implementing redirection is a responsibility of the HTTP server, not the client.