Open domenic opened 4 years ago
The whole parameter-based format came from forms and is reflected in URLs when you use GET. Where does the parameter-based format that does not originate from forms come from?
@annevk The non-form format comes from APIs and application-level routers.
Where does the parameter-based format that does not originate from forms come from?
Are you asking where the rules in https://url.spec.whatwg.org/#query-state came from? I mean, you wrote them down :). I'd presume from one of the RFCs.
I'm saying that the query never had an official key-value format. As far as I know application/x-www-form-urlencoded is the only thing that does something with & and =. I think there might be some servers that also do something with ; though, but not sure how official that ever was.
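A small sketch of that point, using only the standard URLSearchParams class (which implements the application/x-www-form-urlencoded parser). The variable names are illustrative, not from the thread:

```javascript
// The form-urlencoded parser splits pairs on "&" and splits each pair
// on the first "=".
const ampersandPairs = [...new URLSearchParams("a=1&b=2")];
// [["a", "1"], ["b", "2"]]

// ";" is not a pair separator for this parser, so the whole string is
// read as a single pair whose value contains the ";".
const semicolonPairs = [...new URLSearchParams("a=1;b=2")];
// [["a", "1;b=2"]]
```

Servers that treat ; as a separator are doing so outside of what this parser defines.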
I'm experimenting with a new query params interface for my URL library, and I figured some of the ideas might be useful for future web APIs.
The model I'm going with is that, rather than "query parameters", this is modelled as a key-value string within an opaque URL component. A handful of components don't have any defined internal structure - that includes the query, but also the fragment, and we also have opaque hosts and paths. Technically, you could encode a key-value string in any of them.
Media fragments are an example of key-value pairs within the fragment:
http://www.example.com/example.ogv#track=audio&t=10,20
As is OAuth 2.0:
If the resource owner grants the access request, the authorization
server issues an access token and delivers it to the client by adding
the following parameters to the fragment component of the redirection
URI using the "application/x-www-form-urlencoded" format
http://example.com/cb#access_token=2YotnFZFEjr1zCsicMWpAA&state=xyz&token_type=example&expires_in=3600
I also found an App called FoxyProxy (😅) which allows issuing commands via key-value pairs in a URL with opaque path. For example:
proxy:host=foo.com&port=999&action=add
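For comparison, here is roughly what reading those three examples looks like with today's web APIs. This is only a sketch: it borrows URLSearchParams as the key-value parser, which assumes form-encoding rules are acceptable for each component (OAuth 2.0 says so explicitly; for the other two that is an assumption on my part).

```javascript
// Media fragment: key-value pairs in the fragment.
const media = new URL("http://www.example.com/example.ogv#track=audio&t=10,20");
const mediaParams = new URLSearchParams(media.hash.slice(1)); // drop the leading "#"
const track = mediaParams.get("track"); // "audio"
const time = mediaParams.get("t");      // "10,20"

// OAuth 2.0 implicit grant: form-urlencoded pairs in the fragment.
const callback = new URL(
  "http://example.com/cb#access_token=2YotnFZFEjr1zCsicMWpAA&state=xyz&token_type=example&expires_in=3600"
);
const oauthParams = new URLSearchParams(callback.hash.slice(1));
const token = oauthParams.get("access_token"); // "2YotnFZFEjr1zCsicMWpAA"

// FoxyProxy-style command: key-value pairs in an opaque path.
const command = new URL("proxy:host=foo.com&port=999&action=add");
const commandParams = new URLSearchParams(command.pathname);
const port = commandParams.get("port"); // "999"
```

Each case needs the caller to know which component to slice and how, and writing values back means re-assembling the component by hand.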
So I think there is a general problem: these opaque URL components exist so that developers can encode custom structured data in their own identifiers. Key-value strings are just one example; comma-separated lists are another kind of structure that could be better supported. There really aren't great APIs for reading and manipulating these kinds of things in general - there's one kind of key-value string in one component which does have great, convenient APIs, and everything else is sort of forgotten about. People tend to hack stuff together to make up for it, and it's quite awkward and easy to make mistakes.
In Swift (the language my library is for), I'm able to define a schema object which allows customising which characters are interpreted as delimiters, which are written as delimiters, as well as options for escaping. Users can then do something like this:
var url = WebURL("http://example.com")!
// 'commaSeparated' is a user-defined schema object.
// Even though it is a custom KVP schema operating in the fragment,
// the API generalises so it is still super-easy to read/write key-value pairs.
url.withMutableKeyValuePairs(in: .fragment, schema: .commaSeparated) { kvps in
  kvps += [
    "foo": "bar",
    "baz": "qux"
  ]
}
print(url) // "http://example.com#foo:bar,baz:qux"
//                               ^^^^^^^^^^^^^^^^
// You can also just get a view object, which doesn't need any awkward nesting.
// Again, generalises so it is really easy to use even for non-query params.
let kvps = url.keyValuePairs(in: .fragment, schema: .commaSeparated)
kvps["foo"] // "bar"
The goal of an API like this which can scale to different use-cases is that it allows people to do more advanced things with URLs, more easily. It could also be an idea for the web, and the various places web technologies are used.
I also quite like that it dilutes the idea of "query parameters" as being some kind of special, proper URL component - they end up being just one expression of a general ability to encode opaque data.
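To give a feel for how the schema idea might translate to the web platform, here is a rough JavaScript sketch. Everything in it is hypothetical: the schema shape, parseKeyValuePairs and serializeKeyValuePairs are names made up for illustration, and escaping is delegated to encodeURIComponent / decodeURIComponent rather than being configurable as in the Swift library.

```javascript
// Hypothetical schema objects: which string separates pairs, and which
// separates a key from its value.
const commaSeparated = { pairDelimiter: ",", keyValueDelimiter: ":" };
const formLike = { pairDelimiter: "&", keyValueDelimiter: "=" };

// Split an opaque component into [key, value] pairs according to a schema.
function parseKeyValuePairs(text, schema) {
  const pairs = [];
  for (const piece of text.split(schema.pairDelimiter)) {
    if (piece === "") continue; // ignore empty pieces, e.g. "a:1,,b:2"
    const cut = piece.indexOf(schema.keyValueDelimiter);
    const rawKey = cut === -1 ? piece : piece.slice(0, cut);
    const rawValue =
      cut === -1 ? "" : piece.slice(cut + schema.keyValueDelimiter.length);
    // decodeURIComponent throws a URIError on malformed percent-sequences.
    pairs.push([decodeURIComponent(rawKey), decodeURIComponent(rawValue)]);
  }
  return pairs;
}

// Join pairs back into a string. encodeURIComponent escapes ",", ":", "&"
// and "=", so delimiters inside keys or values cannot be confused with
// the schema's own delimiters.
function serializeKeyValuePairs(pairs, schema) {
  return pairs
    .map(([key, value]) =>
      encodeURIComponent(key) + schema.keyValueDelimiter + encodeURIComponent(value))
    .join(schema.pairDelimiter);
}

const sketchUrl = new URL("http://example.com");
sketchUrl.hash = serializeKeyValuePairs([["foo", "bar"], ["baz", "qux"]], commaSeparated);
// sketchUrl.href is now "http://example.com/#foo:bar,baz:qux"

const readBack = parseKeyValuePairs(sketchUrl.hash.slice(1), commaSeparated);
// [["foo", "bar"], ["baz", "qux"]]
```

The same two functions work on the query, fragment, or an opaque path, which is the generalisation I'm after; only the schema and the component change.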
While in some sense appealing, I think there's real value in not generalizing delimiters as it makes it easier to interoperate across disparate endpoints.
I agree; & and = are the default delimiters, and the vast majority of developers will never need to change them (or create a custom schema at all; they can use the built-in ones for form-encoding or percent-encoding). But I do think it has some small amount of value - ; (semicolon) can be used sometimes as a delimiter between pairs, and I've seen applications which try all sorts of more exotic things. The spotify: URL scheme actually uses the same delimiter between keys and values as it does between key-value pairs!
spotify:user:<username>:playlist:<id>
spotify:search:<text>
The NodeJS querystring API also has parse and stringify methods that allow specifying a custom delimiter.
If you did need to parse/create an existing URL format which uses a different delimiter, it is not entirely trivial to do, and I think it's the kind of thing a URL API could and probably should help you do correctly. Even if we also advise that most users stick with the default.
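For reference, this is roughly how the NodeJS querystring API handles custom delimiters. The sketch assumes a Node environment with CommonJS modules; the input strings are made up for illustration.

```javascript
const querystring = require("querystring");

// parse(str, sep, eq): "sep" separates pairs, "eq" separates key from value.
const commaParsed = querystring.parse("foo:bar,baz:qux", ",", ":");
// commaParsed.foo is "bar", commaParsed.baz is "qux"

// Semicolon between pairs, default "=" between key and value.
const semicolonParsed = querystring.parse("a=1;b=2", ";");
// semicolonParsed.a is "1", semicolonParsed.b is "2"

// stringify(obj, sep, eq) writes with the same custom delimiters.
const commaWritten = querystring.stringify({ foo: "bar", baz: "qux" }, ",", ":");
// "foo:bar,baz:qux"
```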
Problem
See background in #18 and #478.
URLSearchParams was designed, not to hold URL query data, but instead to hold application/x-www-form-urlencoded data, i.e. the data that is sent to a server when submitting an HTML <form>.
Unfortunately, it was misnamed URLSearchParams instead of ApplicationXWWWFormURLEncodedParams. And, even more unfortunately, a property named searchParams was added to the URL class, which is an instance of the URLSearchParams class. Any attempts to use the searchParams property will give misleading information about the URL. And any attempts to manipulate it will change the contents of your URL's query string in unintended ways, converting values from a query string serialization (of the type produced by the URL parser) into an application/x-www-form-urlencoded serialization.
Some examples of how url.searchParams does not allow faithful introspection into the URL record:
Some examples of how using url.searchParams for mutation will cause unintended changes to your URL record:
Solution
In https://github.com/whatwg/url/issues/478#issuecomment-620929779 I proposed four solutions to this problem. In response, @ricea (Chromium) and @achristensen07 (WebKit) indicated they were "in favor of maintaining the status quo". I interpret this as meaning that any changes to either the URL query string parser/serializer, or the application/x-www-form-urlencoded parser/serializer, or the URLSearchParams class and url.searchParams member, are not on the table.
Given these constraints, it seems the only thing we could do is propose a new non-breaking addition to the API. As such, I propose a URLQueryParams class and a corresponding url.queryParams member, which are identical to URLSearchParams and url.searchParams, except that they use the URL parsing/serialization rules instead of the application/x-www-form-urlencoded rules. (Alternate names include url.realSearchParams or url.searchParams2.)
With that added, we could effectively deprecate url.searchParams (i.e., state loudly in the spec and MDN that using it will give unreliable results and mess up your URLs), and note that URLSearchParams is useful for representing <form> serialization, but not useful for manipulating URL search parameters.
(Optionally, we might want to define url.query / location.query / workerLocation.query as aliases for the corresponding .search properties, to fully align on the "query" naming and obsolete the "search" naming. But that's separable.)
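To make the difference concrete, here is a minimal sketch of the behaviour described under Problem, next to a stand-in for the kind of reader proposed under Solution. The first half uses only the existing URL and URLSearchParams APIs. The readQueryPairs function is hypothetical: URLQueryParams does not exist, and this sketch simply assumes "URL rules" means splitting on & and = and percent-decoding without treating + as a space.

```javascript
// Introspection: searchParams applies form decoding, so a literal "+" in
// the query is reported as a space.
const plusUrl = new URL("https://example.com/?q=a+b");
const rawQuery = plusUrl.search;                 // "?q=a+b"
const formValue = plusUrl.searchParams.get("q"); // "a b"

// Mutation: touching searchParams re-serializes the whole query with
// form-urlencoded rules, rewriting pairs that were not modified.
const spaceUrl = new URL("https://example.com/?q=a%20b&keep=~");
const before = spaceUrl.search;  // "?q=a%20b&keep=~"
spaceUrl.searchParams.append("x", "1");
const after = spaceUrl.search;   // "?q=a+b&keep=%7E&x=1"

// Hypothetical stand-in for a url.queryParams reader: split on "&" and on
// the first "=", then percent-decode only. decodeURIComponent throws a
// URIError on malformed percent-sequences, which a real design would
// need to handle.
function readQueryPairs(url) {
  const query = url.search.slice(1); // drop the leading "?"
  if (query === "") return [];
  return query.split("&").map((piece) => {
    const cut = piece.indexOf("=");
    const key = cut === -1 ? piece : piece.slice(0, cut);
    const value = cut === -1 ? "" : piece.slice(cut + 1);
    return [decodeURIComponent(key), decodeURIComponent(value)];
  });
}

const queryValue = readQueryPairs(plusUrl)[0][1]; // "a+b"
```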