Open lu-zero opened 1 year ago
We cannot change the existing API, but I'm somewhat supportive of adding API surface for this as it is indeed hidden information. For search too. (hasSearch & hasHash seem more palatable.)
Having said that, is there evidence on Stack Overflow or in popular JS libraries that this is a shortcoming people have to work around?
I found the problem while looking at how the URL fragment is supported across languages while working on another standard, so I cannot tell you how widespread this need is within JS. I guess we'll have to make a note and signal the pitfall.
What surprises me even more is that you do not get back what you set.
let url = new URL("scheme://host/path/");
console.log(url.hash); // -> ''
url.hash = "#";
console.log(url.toString()); // -> scheme://host/path/#
console.log(url.hash); // -> ''
url.hash = "#a";
console.log(url.toString()); // -> scheme://host/path/#a
console.log(url.hash); // -> '#a'
I agree that this part of the JS URL API is awkward. To give another data point: in my library WebURL, which implements the WHATWG standard in Swift, I made this change ("not present" is communicated as nil, not as an empty string) and some other tweaks.
WebURL uses nil to signal that a value is not present, rather than an empty string. This is a more accurate description of components which keep their delimiter even when empty. For example, consider the following URLs:
http://example.com/
http://example.com/?
According to the URL Standard, these URLs are different; however, JavaScript’s search property returns an empty string for both. In fact, these URLs return identical values for every component in JS, and yet still the overall URLs compare as not equal to each other. This has some subtle secondary effects, such as url.search = url.search potentially changing the URL.
WebURL avoids this by saying that the first URL has a nil query (to mean “not present”), and the latter has an empty query. This has the nice property that every unique URL has a unique combination of URL components.
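The url.search = url.search effect described above can be reproduced with the standard URL class. This is a small illustration only; the variable names are made up for the example.

```javascript
// A URL whose query is present but empty: the "?" delimiter is kept in href.
const withDelimiter = new URL("http://example.com/?");
const hrefBefore = withDelimiter.href;

// The getter reports '' for the empty query, and assigning '' through the
// setter clears the query entirely, so the "?" delimiter is dropped.
withDelimiter.search = withDelimiter.search;
const hrefAfter = withDelimiter.href;

console.log(hrefBefore); // -> http://example.com/?
console.log(hrefAfter);  // -> http://example.com/
```

So an assignment that looks like a no-op silently turns one URL into a different one.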
I appreciate that the JS API cannot be changed at this point, though.
Host has this problem too. You cannot distinguish sc:///foo from sc:/foo, nor can you distinguish sc: from sc:// by inspecting the properties of their corresponding URL objects (other than the href itself).
There is this classic post according to which query and fragment have been in use fairly consistently to refer to the search without the ? sigil and the hash without the # sigil.
So one option is to fix search and hash and make them available as query and fragment instead. The search and hash getters / setters can then be marked as legacy or deprecated (but not removed).
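As a rough picture of what such query and fragment accessors might return, the values can be derived from href today. This is only a sketch: splitComponents is a hypothetical helper name, and using null for "not present" is an assumption borrowed from the nil convention mentioned earlier in the thread, not anything the standard currently exposes. It relies on the serialized href only containing a literal "#" from the fragment delimiter onwards, and a literal "?" before that only as the query delimiter.

```javascript
// Hypothetical helper: derive sigil-less query and fragment from href,
// returning null when the component (and its delimiter) is absent.
function splitComponents(url) {
  const href = url.href;
  const hashIdx = href.indexOf("#");
  // Everything before the first "#" holds scheme, authority, path and query.
  const beforeHash = hashIdx === -1 ? href : href.slice(0, hashIdx);
  const queryIdx = beforeHash.indexOf("?");
  return {
    query: queryIdx === -1 ? null : beforeHash.slice(queryIdx + 1),
    fragment: hashIdx === -1 ? null : href.slice(hashIdx + 1),
  };
}

console.log(splitComponents(new URL("http://example.com/")));
// -> { query: null, fragment: null }
console.log(splitComponents(new URL("http://example.com/?")));
// -> { query: '', fragment: null }
```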
Having said that, is there evidence on Stack Overflow or in popular JS libraries that this is a shortcoming people have to work around?
I've run into this problem myself, in multiple projects and libraries, in both Node & browsers.
Right now I'm building developer tools, where URLs are taken as string input, parsed, and manipulated by component, and preserving the raw formatting where possible is useful. Not being able to differentiate between /? and / at the end of a URL is quite inconvenient! I'm still using Node's url.parse in some places, in part because it does not have this behaviour and that's important.
Of course this state does exist within the URL parser (the URL's internal query and fragment states in the spec do store empty & null differently) but it's just not currently exposed the same way in search & hash (in both cases, both null and empty are exposed as '').
Totally understand that changing the existing API is impractical. Either of the options proposed here so far would work well in scenarios like mine:
- hasSearch and hasHash booleans to distinguish no-delimiter vs delimiter-but-empty-value (or has{Search,Hash}Delimiter, if we want to be even more explicit)
- query & fragment fields that do always include the delimiter as it was originally parsed, so they're set even if the value itself is empty

The latter is definitely more convenient as a user (fullPath = url.pathname + url.query + url.fragment would effectively reproduce the original relative url components - which it does not do today!) but both are workable, and the confusion of two very similar fields with almost always identical values might not be worthwhile.
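Both options can be approximated in user code today by inspecting href. The sketch below is an illustration under stated assumptions, not a proposed implementation: hasSearch, hasHash, rawSearch and rawHash are hypothetical free functions standing in for the suggested properties.

```javascript
// Hypothetical stand-ins for the proposed hasHash / hasSearch booleans.
function hasHash(url) {
  return url.href.includes("#");
}

function hasSearch(url) {
  const href = url.href;
  const hashIdx = href.indexOf("#");
  const end = hashIdx === -1 ? href.length : hashIdx;
  const queryIdx = href.indexOf("?");
  // Only a "?" that appears before the fragment is a query delimiter.
  return queryIdx !== -1 && queryIdx < end;
}

// Hypothetical delimiter-preserving variants of search and hash.
function rawSearch(url) {
  return url.search || (hasSearch(url) ? "?" : "");
}

function rawHash(url) {
  return url.hash || (hasHash(url) ? "#" : "");
}

const sample = new URL("http://example.com/path?#");
const lossy = sample.pathname + sample.search + sample.hash;
const faithful = sample.pathname + rawSearch(sample) + rawHash(sample);

console.log(lossy);    // -> /path
console.log(faithful); // -> /path?#
```

The second concatenation is the fullPath behaviour described above: the empty delimiters survive the round trip.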
and if fed to URL do not distinguish between the two: URL.hash returns '', and to make it even stranger, passing .hash = '#' produces scheme://host:port/# but calling .hash returns '' nonetheless.
It would be nicer if .hash returned undefined/null if it is unset, or "#" if the trailing hash is present.