lucahammer / fedifinder

Find fediverse addresses in the profiles of your Twitter followings
https://fedifinder.glitch.me/
MIT License
246 stars 27 forks

Handle webfinger 301 #171

Open Chainfire opened 1 year ago

Chainfire commented 1 year ago

Related to #166 but not exactly the same.

My Mastodon handle is @chainfire@chainfire.eu. My private mastodon instance runs at mastodon.chainfire.eu. Fedifinder returns:

chainfire.eu Maybe down, maybe not Fediverse. Error: 404.0
[@chainfire@chainfire.eu](https://chainfire.eu/@chainfire) (ChainfireXDA)

This happens for a bunch of people in my list, and I think for the same reason.

As I understand from the other issue, Fedifinder checks /.well-known/host-meta and /.well-known/nodeinfo, which the short chainfire.eu domain does not handle (404). Unlike #166, the short domain chainfire.eu doesn't actually handle /.well-known/webfinger either, but it does permanently redirect it (301) to the long domain mastodon.chainfire.eu, which also serves the host-meta and nodeinfo URLs:

chainfire.eu/@chainfire --> 404
chainfire.eu/.well-known/host-meta --> 404
chainfire.eu/.well-known/nodeinfo --> 404
chainfire.eu/.well-known/webfinger --> 301 --> mastodon.chainfire.eu/.well-known/webfinger
mastodon.chainfire.eu/@chainfire --> 200
mastodon.chainfire.eu/.well-known/host-meta --> 200
mastodon.chainfire.eu/.well-known/nodeinfo --> 200
mastodon.chainfire.eu/.well-known/webfinger --> 200
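
The 301 in the list above names the long domain in its Location header. A minimal Python sketch of extracting it, assuming the caller already has the status code and header value from a request to the short domain. The function name `redirected_host` is hypothetical and not taken from Fedifinder's code:

```python
from typing import Optional
from urllib.parse import urlsplit


def redirected_host(original_domain: str, status: int, location: Optional[str]) -> Optional[str]:
    """Return the host a permanent redirect points to, or None.

    Hypothetical helper: only a 301 with an absolute Location on a
    different host is treated as a domain remap.
    """
    if status != 301 or not location:
        return None
    host = urlsplit(location).hostname  # None for relative locations
    if host is None or host == original_domain.lower():
        return None
    return host
```

With the values from the list, `redirected_host("chainfire.eu", 301, "https://mastodon.chainfire.eu/.well-known/webfinger")` gives `"mastodon.chainfire.eu"`, which is the host where host-meta and nodeinfo could then be checked.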

While it would be trivial for me to redirect @chainfire, host-meta and nodeinfo as well, I believe this should be handled by Fedifinder, because mine is the simple setup outlined in the official Mastodon documentation, which many other private hosters will be following as well. See the docs, LOCAL_DOMAIN and WEB_DOMAIN options, and see the sample nginx adjustment, which I have used verbatim.

I have not looked at your code, and I understand from the other issue there is a cache issue at play, but I would suggest something like this:

This domain remap can be cached for all users of that domain, as 301 is intentionally a permanent move. I am not absolutely sure if a 301 is allowed to vary by query parameters or how browsers normally handle that, but I would suggest that the case where it does not vary by query parameters is significantly more common (due to it being the official Mastodon docs way) than where it does (which would be a completely custom implementation). So the outlined change would give the expected result for most cases.
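
A rough sketch of such a per-domain cache, assuming a `probe` callable that returns the 301 target host for a domain (or `None` when there is no redirect). Both `DomainRemapCache` and `probe` are hypothetical names for illustration; Fedifinder's actual cache may look quite different:

```python
from typing import Callable, Dict, Optional


class DomainRemapCache:
    """Remember, per handle domain, which host actually serves the instance."""

    def __init__(self, probe: Callable[[str], Optional[str]]) -> None:
        # probe(domain) returns the 301 target host, or None if there is none.
        self._probe = probe
        self._remap: Dict[str, str] = {}

    def resolve(self, domain: str) -> str:
        key = domain.lower()
        if key not in self._remap:
            # A 301 is permanent, so one probe per domain is assumed to be enough.
            self._remap[key] = self._probe(key) or key
        return self._remap[key]
```

Every handle on the same domain then shares one lookup. This deliberately ignores the case where the redirect varies by query parameters.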

Thank you for your work, I hope this can be implemented.

EDIT: Updated docs link to the relevant anchor, wording, hopefully improved clarity

rfc1036 commented 1 year ago

If the issue here is only figuring out and caching which domains have a WebFinger endpoint then fedifinder just needs to request $DOMAIN/.well-known/webfinger.

A 400 answer means that the domain does speak WebFinger, a 404 means that it does not and anything else is an implementation error (for which some heuristics could be developed or not, depending how much you care about buggy implementations).
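
As a sketch, that rule maps the status of a bare request (no `resource` parameter) onto three outcomes. The function name and the returned labels are made up for illustration:

```python
def classify_bare_webfinger_status(status: int) -> str:
    """Classify the response to $DOMAIN/.well-known/webfinger without a query."""
    if status == 400:
        # The endpoint exists and complains about the missing resource parameter.
        return "webfinger"
    if status == 404:
        # Nothing is served at the well-known path.
        return "no-webfinger"
    # Anything else would need heuristics, if buggy servers matter at all.
    return "unknown"
```

Under this rule a 301 would land in `"unknown"` unless the redirect is followed before the status is classified.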

Chainfire commented 1 year ago

I don't think that in particular is the issue at all. It is merely a means to discover a relatively common setup of Mastodon that currently displays an error in Fedifinder.

Handling only 400 and 404 and calling everything else an implementation error may be technically correct, but again, the usage of 301 is advised by the Mastodon docs to support one of their features, they're the most prominent Fediverse player, and the reason most people use this particular tool.

rfc1036 commented 1 year ago

Sure: first, redirects must be followed (RFC 7033 section 4.2); then the status will tell whether that is an actual WebFinger endpoint.
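
A minimal sketch of that order of operations, assuming a `fetch` callable that performs one request without following redirects and returns the status code plus the Location header. `fetch`, `follow_redirects` and the hop limit are assumptions for illustration. The https-only check reflects my reading of RFC 7033 section 4.2, which only allows redirection to an "https" URI:

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin, urlsplit

# fetch(url) -> (status code, Location header or None); supplied by the caller.
Fetch = Callable[[str], Tuple[int, Optional[str]]]


def follow_redirects(url: str, fetch: Fetch, max_hops: int = 5) -> Tuple[str, int]:
    """Follow redirects, then return the final URL and its status code."""
    for _ in range(max_hops + 1):
        status, location = fetch(url)
        if status not in (301, 302, 303, 307, 308) or not location:
            return url, status
        target = urljoin(url, location)  # also resolves relative locations
        if urlsplit(target).scheme != "https":
            raise ValueError(f"refusing non-https redirect to {target}")
        url = target
    raise RuntimeError("too many redirects")
```

The status returned at the end is the one that would then be checked against the 400/404 rule.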

rbairwell commented 1 year ago

  • if not found, check if webfinger (without query parameters) performs a 301 to a different domain

This will break in some situations. If you query the WebFinger endpoint for @rbairwell@bairwell.com, it redirects to https://mastodon.org.uk/.well-known/webfinger?resource=acct:rbairwell@mastodon.org.uk - but if you check https://bairwell.com/.well-known/webfinger/ (or @test@bairwell.com, or any "unrecognised" user), you'll get a 400 error (code is from https://www.jwz.org/blog/2022/11/using-your-own-domain-as-a-mastodon-handle/ )
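
Because a redirect like this depends on the `resource` parameter, a check would have to send the full per-account query rather than the bare endpoint. A small sketch of building that URL from a handle; the helper name `webfinger_url` is hypothetical:

```python
from urllib.parse import urlencode


def webfinger_url(handle: str) -> str:
    """Build the per-account WebFinger URL for a handle like @user@domain."""
    user, sep, domain = handle.lstrip("@").rpartition("@")
    if not sep or not user or not domain:
        raise ValueError(f"not a user@domain handle: {handle!r}")
    # urlencode percent-encodes ':' and '@' in the acct: URI.
    query = urlencode({"resource": f"acct:{user}@{domain}"})
    return f"https://{domain}/.well-known/webfinger?{query}"
```

The downside is that a result obtained this way applies to one account only, so it cannot be cached for the whole domain the way a query-independent 301 could.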