SimonSapin closed this 9 years ago
The "URL writing" section describes how you write a component. It wouldn't make sense to refer to a data structure there, since there isn't any yet. It's about the eventual input to the URL parser.
If for the purpose of that section "a URL’s path" is a different concept than in the rest of the spec, it should not link to #concept-url-path.
I guess that would require a whole set of fresh identifiers then... Since none of them are model components... They're all syntax components. Meh.
Another option is to have the path be a single string everywhere. Path components are not actually used outside the parser as far as I know, and could still be obtained by splitting on "/".
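As a rough illustration of the "single string, split on demand" idea, here is a minimal Python sketch. The function name `split_path` is hypothetical, and the treatment of the leading "/" (dropped as a separator rather than kept as an empty first component) is an assumption made for the example, not something the spec text under discussion defines.

```python
def split_path(path):
    """Split a path string into components on "/".

    Assumption (not from the spec): a single leading "/" is treated as a
    separator and dropped, so "/a/b" yields ["a", "b"] rather than
    ["", "a", "b"].
    """
    if path == "":
        return []
    if path.startswith("/"):
        path = path[1:]
    return path.split("/")


print(split_path("/a/b/c"))  # ['a', 'b', 'c']
print(split_path("/a//b/"))  # ['a', '', 'b', '']
```

Empty components survive the split, so a trailing "/" or a doubled "//" remains visible in the resulting list.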
That would not solve this problem. E.g. IPv4 address is a 32-bit integer, but that's not how you write it. If we make port a 16-bit integer, it likewise doesn't represent syntax. And even port being a string you could argue that the syntax thing is different since it can have leading 0s and such.
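The gap between model and syntax can be made concrete with a small Python sketch. The helper names `ipv4_to_string` and `parse_port` are hypothetical and the validation rules are simplified assumptions; the point is only that one integer value corresponds to a particular written form, and that several written forms can map to the same integer.

```python
def ipv4_to_string(address):
    """Write a 32-bit integer as dotted-decimal text."""
    if not 0 <= address <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit unsigned integer")
    # Extract the four octets, most significant first.
    octets = [(address >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


def parse_port(text):
    """Turn written port digits into a 16-bit integer (simplified)."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError("port must be ASCII digits")
    port = int(text)  # leading zeros are lost here
    if port > 0xFFFF:
        raise ValueError("port out of range")
    return port


print(ipv4_to_string(0xC0A80001))             # 192.168.0.1
print(parse_port("0080") == parse_port("80"))  # True
```

Once "0080" has become the integer 80, the leading zeros cannot be recovered, which is why the model value alone does not describe what may be written.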
There is a "Host writing" section that describes how to represent an IPv4 address as a string with some "."s. Should there be a similar section (or just a sentence) for a URL’s path with "/"s?
(Looking a bit more at the spec…) Namely, I think this sentence:
A path must be zero or more URL units, excluding "?".
should mention path components and slashes.
Since fixing this is not happening today, this is what I want to do when I get back to this, hopefully soon:
Consider whether or not Windows drive letters need to be a parse error or part of the URL syntax section. Likely the former? Although that kind of obsoletes Windows from the perspective of the specification...
Yes, let's not do the former, please. Remember that UAs are used ~95% of the time on Windows, even if developers prefer other OSs.
The planned changes outlined in https://github.com/whatwg/url/issues/33#issuecomment-131400647 look great to me. The one other somewhat-related thing I’m still hoping for is normative requirements for what code points are allowed in a domain, as raised at https://www.w3.org/Bugs/Public/show_bug.cgi?id=25334
@domenic I kind of wish we could fade out file URLs entirely. But I guess they still have legitimate use in node.js (or Node.js?) development?
It's a bit tricky too to define the syntax constructs for them since it heavily depends on the base URL, but I'll try to figure something out.
https://www.ietf.org/mail-archive/web/apps-discuss/current/msg14575.html
A proposed updated IETF spec for the 'file:' URI scheme, check it out.
@masinter we did, see https://github.com/w3ctag/spec-reviews/issues/59.
They have legit uses in pretty much any system which deals with both files and URLs, yeah. Getting them documented and nailed down would be very helpful, especially if the URL Standard wants to be more than just the standard for browsers, but instead the standard for anything that interoperates with browsers.
I've decided to address railroad diagrams separately. See #67.
Sounds good.
(Just to name the things in the list, this could be "[…] a list of zero or more <a>path components</a> holding […] A <dfn>path component</dfn> is an ASCII string.")
Here, it looks like a path is a single string that is concatenated with other strings. "a path" here probably should be something like "a path as components separated with /." Also, should there be an initial "/" before the first component?
Same here. Are components separated by "/"? What does it mean for a list of strings to start with "/", is that the value of the first component?
URL units being code points, this sounds like a path is a single string.
… and a list of strings again. (Same in various places in the parser.)
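One possible way to reconcile the two views, sketched in Python: keep the path as a list of components in the model and define its written form by putting "/" before every component. The name `serialize_path` is hypothetical, and the leading-slash rule is an assumption chosen for illustration, not something this thread settles.

```python
def serialize_path(components):
    """Write a list of path components as one string.

    Assumption: every component, including the first, is preceded by "/".
    """
    return "".join("/" + component for component in components)


print(serialize_path(["a", "b", "c"]))  # /a/b/c
print(serialize_path([""]))             # /
print(repr(serialize_path([])))         # ''
```

Under this rule a path string never "starts with a component", and an empty list and a list holding one empty component serialize differently.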