Closed tslocum closed 3 years ago
Could we also allow serving it with redirection?
Unfortunately I'm pretty convinced now that I was completely wrong when I wrote that test.
The main problem with the root redirect is that it's reasonable to expect a client to use a URL normalization library for caching, or for resolving URLs like gemini://example.com/path/subpath/.. to gemini://example.com/path/ before making their request. Because gemini://example.com/ and gemini://example.com are canonically the same, the normalization could go either way. So if you redirect to gemini://example.com/, the client might reasonably strip off the trailing slash and make the same request, leading to an infinite redirect loop.
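A minimal Python sketch of that loop, under stated assumptions: strip_root_slash, server_response and fetch are hypothetical names, the toy server stands in for any server that redirects the empty path to "/", and the client is assumed to normalize every URL before each request.

```python
from urllib.parse import urlsplit

REDIRECT = 31  # Gemini permanent redirect status
SUCCESS = 20   # Gemini success status


def strip_root_slash(url: str) -> str:
    """Hypothetical client policy: normalize a bare "/" path to the empty path."""
    parts = urlsplit(url)
    if parts.path == "/" and not parts.query and not parts.fragment:
        return f"{parts.scheme}://{parts.netloc}"
    return url


def server_response(url: str):
    """Toy server: redirects the empty path to "/", serves everything else."""
    if urlsplit(url).path == "":
        return REDIRECT, url + "/"
    return SUCCESS, "text/gemini"


def fetch(url: str, max_redirects: int = 5) -> str:
    """Follow redirects, normalizing the URL before every request."""
    for _ in range(max_redirects):
        status, meta = server_response(strip_root_slash(url))
        if status != REDIRECT:
            return "ok"
        url = meta  # the redirect target, which normalizes back to the same URL
    return "redirect loop"
```

With these assumptions, fetch("gemini://example.com/") never gets past the redirect, while a URL with a non-root path is served on the first request.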
To give a real example, I ran into this when I was working on my mozz-archiver tool. My crawler was caching gemini responses to prevent requesting the same page twice. I hit "gemini://mozz.us" and cached the response containing the redirect. But immediately after that, the crawler normalized "gemini://mozz.us/" to "gemini://mozz.us" and marked it as already seen.
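The crawler case can be reproduced in a few lines. This is only an illustration, not the mozz-archiver code: normalize, toy_server and crawl are made-up names, and the toy server is assumed to redirect the empty path to "/" as described above.

```python
from collections import deque
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Assumed crawler policy: treat a bare "/" path as the empty path."""
    parts = urlsplit(url)
    if parts.path == "/" and not parts.query and not parts.fragment:
        return f"{parts.scheme}://{parts.netloc}"
    return url


def toy_server(url: str):
    """Stand-in server: status 31 (redirect) for the empty path, 20 otherwise."""
    if urlsplit(url).path == "":
        return 31, url + "/"
    return 20, "text/gemini"


def crawl(start: str) -> list:
    """Return the URLs whose content was actually fetched."""
    seen = set()
    fetched = []
    queue = deque([start])
    while queue:
        url = normalize(queue.popleft())
        if url in seen:
            continue  # the redirect target normalizes to an already-seen URL
        seen.add(url)
        status, meta = toy_server(url)
        if status == 31:
            queue.append(meta)
        else:
            fetched.append(url)
    return fetched
```

Here crawl("gemini://mozz.us") returns an empty list: the redirect is cached, its target is marked as already seen, and the root page is never retrieved.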
As discussed with you on the mailing list, RFC 3986, section 6.2.3 says "In general, a URI that uses the generic syntax for authority with an empty path should be normalized to a path of "/"." Note the "in general". The rest of section 6.2.3 explains that it is scheme-specific.
gemini://gemini.circumlunar.space/docs/specification.gmi seems silent about "gemini:" rules. 1.2 says "The path, query and fragment components are allowed and have no special meanings beyond those defined by the generic syntax."
So, there is apparently no solid basis to say that gemini://example.com and gemini://example.com/ must have the same behavior.
Hi @bortzmeyer. I totally agree with your findings based on the RFC, but the goal of this tool was never limited to only checking what is or isn't allowed by the gemini spec.
usage: gemini-diagnostics [host] [port] [--help]
A diagnostic tool for gemini servers.
This program will barrage your server with a series of requests in
an attempt to uncover unexpected behavior. Not all of these checks
adhere strictly to the gemini specification. Some of them are
general best practices, and some trigger undefined behavior. Results
should be taken with a grain of salt and analyzed on their own merit.
If you have an argument for why this shouldn't be a "best practice" I'm interested to hear about it.
For the record I don't think that gemini servers should necessarily strive to pass all of these tests. I added gemini:// support to pygopherd and it fails like half of these because the implementation was way less complicated that way. But I think it's a good tool to help uncover edge cases that one might run into when writing a server.
In that case, I would suggest having several levels of checking, as many compilers do. Something like:
--level lax
--level medium
--level strict (or pedantic :-)
This specific test would be run only with --level strict.
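One way the proposed levels could be wired up, as a sketch only: gemini-diagnostics has no --level flag in the usage text quoted above, and the check names and their level assignments below are invented for illustration.

```python
import argparse

# Ordered from least to most demanding.
LEVELS = ["lax", "medium", "strict"]

# Hypothetical registry: check name -> lowest level at which it runs.
CHECKS = {
    "basic_response": "lax",
    "root_without_redirect": "strict",
}


def select_checks(level: str) -> list:
    """Return the checks enabled at the given level, in registry order."""
    rank = LEVELS.index(level)
    return [name for name, minimum in CHECKS.items()
            if LEVELS.index(minimum) <= rank]


def parse_args(argv=None):
    parser = argparse.ArgumentParser(prog="gemini-diagnostics")
    parser.add_argument("--level", choices=LEVELS, default="medium",
                        help="how demanding the checks should be")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    for name in select_checks(args.level):
        print("would run:", name)
```

Under this layout the root redirect test would be skipped unless the user asks for --level strict.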
Regarding the "empty path is single-slash" issue, there is a proposal in the specification issue tracker: https://gitlab.com/gemini-specification/gemini-text/-/issues/2
The spec has been updated to mandate the same behavior for "/" and ""
The new specification explicitly mentions this case, and mandates that an empty path and a path of "/" are the same.
https://gitlab.com/gemini-specification/protocol/-/issues/11
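On the server side, the rule that an empty path and "/" are the same can be applied before routing. A minimal sketch, assuming the request line has already been validated as a gemini URL; effective_path is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def effective_path(url: str) -> str:
    """Map an empty path to "/" so both forms reach the same handler."""
    return urlsplit(url).path or "/"
```

Both gemini://example.com and gemini://example.com/ then resolve to the path "/", so the server can answer either form directly instead of redirecting one to the other.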
Allowing serving the root URL without redirection makes sense. Could we also allow serving it with redirection?