Closed steve-bate closed 1 month ago
According to Section 4.1, a request is malformed if the resource is not percent-encoded. So HTTP GET on:
https://example.com/.well-known/webfinger?resource=acct:user@example.com
should return 400 because the correct request is
https://example.com/.well-known/webfinger?resource=acct%3auser%40example.com
> According to Section 4.1, a request is malformed if the resource is not percent-encoded.
My understanding is that percent-encoding is only required when a character is part of the URI "reserved" character set.
RFC 3986 (emphasis mine):
A percent-encoding mechanism is used to represent a data octet in a component when that octet's corresponding character is outside the allowed set or is being used as a delimiter of, or within, the component.
The "@" character, for example, is allowed so it wouldn't need to be percent-encoded.
I took the @ -> %40 and : -> %3a directly from the examples in the WebFinger RFC, assuming that if their examples encode them, there must be a reason. However, the RFC editors may have been overzealous:
According to RFC 3986 section 3 Syntax Components:
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
According to section 3.4 Query:
query = *( pchar / "/" / "?" )
According to section 3.3 Path:
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
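To make the grammar argument concrete, here is a minimal sketch (not taken from the RFC or from any test suite) that checks a query string against the productions quoted above. The names `is_valid_query` and `QUERY_LITERALS` are made up for illustration; the character sets are transcribed from RFC 3986 sections 2.2, 2.3, 3.3 and 3.4.

```python
import re
import string

# Character classes from RFC 3986.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
SUB_DELIMS = set("!$&'()*+,;=")
# pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
PCHAR_LITERALS = UNRESERVED | SUB_DELIMS | {":", "@"}
# query = *( pchar / "/" / "?" )
QUERY_LITERALS = PCHAR_LITERALS | {"/", "?"}

PCT_ENCODED = re.compile(r"%[0-9A-Fa-f]{2}")


def is_valid_query(query: str) -> bool:
    """Return True if `query` matches the RFC 3986 `query` production."""
    i = 0
    while i < len(query):
        if query[i] == "%":
            # A "%" must be followed by exactly two hex digits.
            if not PCT_ENCODED.match(query, i):
                return False
            i += 3
        elif query[i] in QUERY_LITERALS:
            i += 1
        else:
            return False
    return True


# Both forms from this thread are syntactically valid queries.
print(is_valid_query("resource=acct:user@example.com"))
print(is_valid_query("resource=acct%3auser%40example.com"))
```

Under this reading, the unencoded ":" and "@" pass the generic URI grammar, so any requirement to encode them would have to come from the WebFinger RFC itself.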
On the other hand, this RFC doesn't actually specify the ?key=value syntax. When looking that up, I came across https://url.spec.whatwg.org/#application/x-www-form-urlencoded, which makes my brain hurt.
Suggest a compromise: we 1) require that servers accept non-percent-encoded : and @, as I cannot see how it gets in the way of interop and they do appear to be allowed, and 2) permit clients to not %-encode them.
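As a rough illustration of why accepting both forms should not hurt interop, the sketch below decodes the `resource` parameter with Python's standard `urllib.parse`. The helper name `extract_resource` is hypothetical, and a real server may use a different query parser.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def extract_resource(url: str) -> Optional[str]:
    """Return the decoded `resource` query parameter, or None if absent."""
    query = urlsplit(url).query
    values = parse_qs(query).get("resource")
    return values[0] if values else None


plain = "https://example.com/.well-known/webfinger?resource=acct:user@example.com"
encoded = "https://example.com/.well-known/webfinger?resource=acct%3auser%40example.com"

# Both requests decode to the same resource identifier.
print(extract_resource(plain) == extract_resource(encoded))
```

With a parser like this, a server sees the same value either way, so it has nothing to gain by rejecting the unencoded form.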
> which makes my brain hurt.
Likewise. ;-)
> Suggest a compromise: we 1) require that servers accept non-percent-encoded : and @, as I cannot see how it gets in the way of interop and they do appear to be allowed, and 2) permit clients to not %-encode them.
Am I reading Section 4.1 correctly? ... that "=" and "&" are the only characters otherwise allowed in a query that must still be percent-encoded? That makes sense given they are delimiters for the query params. I think if we want to test percent-encoding, we need to find or create an actor with reserved characters in its user or domain name.
EDIT: A domain name can't have an "=" or "&" so the user name would be the part that might have those characters. Out of more than 180,000 user names recorded in my Mastodon instance, none of them have those characters, but it's theoretically possible.
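For what it's worth, here is a small sketch of what such a test input could look like. The account name `acct:a=b&c@example.com` is invented, and `build_query` is a hypothetical helper that leaves ":" and "@" literal (an assumption, following the compromise proposed above) while encoding everything else outside the unreserved set.

```python
from urllib.parse import parse_qs, quote


def build_query(resource: str) -> str:
    """Build a WebFinger query string, encoding "=" and "&" in the value."""
    # Assumption: ":" and "@" stay literal; quote() encodes the rest.
    return "resource=" + quote(resource, safe=":@")


tricky = "acct:a=b&c@example.com"  # hypothetical user name containing = and &

encoded = build_query(tricky)
print(encoded)

# The encoded form round-trips through a form-style query parser.
assert parse_qs(encoded)["resource"] == [tricky]

# The unencoded form does not: "&" is read as a parameter separator.
naive = "resource=" + tricky
assert parse_qs(naive)["resource"] != [tricky]
```

An actor like this would give a percent-encoding test something real to check, since the unencoded request is ambiguous rather than just stylistically different.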
I removed that test now. So can we close this?
Thanks.
Can you clarify how webfinger.server.4_2__4_do_not_accept_malformed_resource_parameters::not_percent_encoded is working? When I print malformed_webfinger_uri, it looks like a valid webfinger URI. The 200 status code is what I'd expect.