Open TimothyGu opened 3 years ago
I was implicitly assuming the 'first rule takes all', but you are right, that should be made explicit then.
I think there is another one.
Ah yes, //foo could be parsed as (path-root /) (dir ε) (file foo), but it shouldn't.
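For comparison, here is a quick check with Python's urllib.parse.urlsplit. It is one existing parser, not the grammar under discussion, but it shows the intended reading: the leading // introduces an authority, so foo does not end up as a file in the path.

```python
from urllib.parse import urlsplit

parts = urlsplit("//foo")

# The leading "//" starts an authority component, so "foo" lands in netloc
# and the path stays empty.
print(repr(parts.netloc), repr(parts.path))
```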
We could resolve it through the grammar itself, which is what RFC 3986 does with its path-noscheme production. But the simplistic approach (forbidding colons in the first path segment) would forbid abc,def:123 from being parsed as a path, contrary to what browsers and the WHATWG Standard do.
That's quite sharp. Hmm. I need to have a better look at this.
Alternatively, we could just handwave it and say scheme always wins the fight. This has the advantage of keeping the grammar simple.
I think I prefer that approach.
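A rough sketch of what "scheme always wins" could look like as a disambiguation step outside the grammar. The function name split_scheme and the regular expression are assumptions for illustration; the scheme character set (a letter followed by letters, digits, "+", "-" or ".") is the one RFC 3986 uses.

```python
import re

# A scheme prefix: a letter, then letters, digits, "+", "-" or ".", then ":".
SCHEME_PREFIX = re.compile(r"([A-Za-z][A-Za-z0-9+.-]*):")


def split_scheme(text):
    """Return (scheme, rest); scheme is None when there is no valid scheme prefix."""
    match = SCHEME_PREFIX.match(text)
    if match:
        # A valid scheme prefix exists, so it wins over the path reading.
        return match.group(1), text[match.end():]
    # Otherwise the whole input is left for the path (or authority) rules.
    return None, text


for candidate in ["abc:def", "abc,def:123", "//foo"]:
    print(candidate, split_scheme(candidate))
```

With this rule abc,def:123 stays available as a path, because the text before its colon is not a valid scheme, and the grammar itself needs no special first-segment production.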
Meanwhile, there is the job of adjusting the grammar from RFC 3987 to align with the WHATWG URL Standard, which seems so close now. (cc @masinter)
I'm stalling a bit in this area. It is a bit underconstrained design-wise. I find it hard to pick one of several equivalent solutions when there is no clear reason to prefer one over another (if that makes sense), which leads to indecision.
"abc:def" can be parsed in two ways: abc as scheme and def as path, and abc:def as path.
We could resolve it through the grammar itself, which is what RFC 3986 does with its path-noscheme production. But the simplistic approach (forbidding colons in the first path segment) would forbid abc,def:123 from being parsed as a path, contrary to what browsers and the WHATWG Standard do.
Alternatively, we could just handwave it and say scheme always wins the fight. This has the advantage of keeping the grammar simple.
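The ambiguity itself can be illustrated by listing every reading of an input. This is only a sketch: candidate_readings is a hypothetical helper, and it assumes the RFC 3986 scheme character set and that any input may be read as a bare path.

```python
import re

# Scheme characters: a letter, then letters, digits, "+", "-" or ".".
SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def candidate_readings(text):
    """List each (scheme, path) reading; a scheme of None means 'path only'."""
    readings = [(None, text)]  # assumption: the whole input may be read as a path
    head, colon, tail = text.partition(":")
    if colon and SCHEME.fullmatch(head):
        # The text before the first colon is also a valid scheme.
        readings.insert(0, (head, tail))
    return readings


print(candidate_readings("abc:def"))      # two readings, hence the ambiguity
print(candidate_readings("abc,def:123"))  # one reading, "abc,def" is no scheme
```

Only inputs with two readings need a tie-break, which is where either the grammar-level restriction or the "scheme wins" rule would apply.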