Open whesse opened 11 years ago
I think Bill is right.
However, I'll point out that it is a very bad idea to use RFC 2396 to make this point. RFC 2396 is obsoleted by RFC 3986. In other words, RFC 2396 is obsolete and has be replaced by RFC 3986.
We should go ahead and make this change if we are 100% sure that RFC 3986 only mandates a scheme for a absolute URI. The documentation for isAbsolute should describe exactly what is means.
Can we remove isAbsolute? I'm not sure it is useful.
Yes, I think we should remove it. It is just scheme == ''.
And isAbsolute is misleading, because it is unclear for what it should return for the relative URI (called a "relative reference" in the RFC) "/dir/file". Currently, it returns false.
I'm curious about what we think it should return for a "relative reference" starting with a slash - which is a legitimate relative reference that has an "absolute" path, but no scheme.
The resolution of it replaces the base path completely, but not the target scheme. The RFC says: if (R.path starts-with "/") then T.path = remove_dot_segments(R.path); else T.path = merge(Base.path, R.path); T.path = remove_dot_segments(T.path);
Added Area-Library label.
We will add isRelative and update the documentation to refer to the spec.
The problem is still that the isAbsolute checking for fragment is not necessarily what users expect.
Set owner to @sgjesse. Added this to the 1.5 milestone. Removed Type-Defect label. Added Type-Enhancement label.
I dug into this w/ https://code.google.com/p/dart/issues/detail?id=18053
Per RFC, URIs with fragments http://tools.ietf.org/html/rfc3986#section-4.3 are generally not considered absolute.
I believe this IS the case for http, which likely the vast majority use case of Uri.
I'd love to see isRelative added to Uri as an alternative.
We should not change the semantics of isAbsolute before v2.0.
Issue #18053 has been merged into this issue.
Issue #18053 has been merged into this issue.
I think the best solution is to add isRelativePath and isRelativeRef. In the long run, it would be nice to rename isAbsolute to isAbsoluteUri, which is the name Haskell uses for the same thing. In any case, we should document isAbsolute to only accept "absolute URIs", without a fragment.
Relative path and relative reference are unambiguously defined in section 4.2 of RFC 3986, and are exactly what most people want. Relative reference is something that doesn't begin with a scheme, and relative path is something that doesn't begin with a scheme or a slash. Both of these can have fragment identifiers (and query strings).
We could file a new bug for adding these, or just consider this bug resolved if they are added.
Looking at other languages, they all do different things. Some of their "absolute" queries allow a fragment, others don't. I think Haskell may be the closest to the spec - it uses isAbsoluteURI for the semantics of our current isAbsolute.
I would like to see isAbsolute renamed to isAbsoluteURI in 2.0, but I agree that the best solution now would be to add an isRelative or isRelativeRef, which is
Test if string contains a valid URI (an absolute URI with optional fragment identifier).
isURIReference :: String -> BoolSource
Test if string contains a valid URI reference (an absolute or relative URI with optional fragment identifier).
isRelativeReference :: String -> BoolSource
Test if string contains a valid relative URI (a relative URI with optional fragment identifier).
isAbsoluteURI :: String -> BoolSource
Test if string contains a valid absolute URI (an absolute URI without a fragment identifier).
Issue #18053 has been merged into this issue.
So we either retitle this issue or close as WontFix and open up an enhancement request for the changes.
Removed this from the 1.6 milestone. Added Oldschool-Milestone-1.6 label.
Removed Oldschool-Milestone-1.6 label.
A Uri returns true for isAbsolute only if it has a non-empty scheme and it has no fragment identifier.
I think this is a misreading of the spec. There is a grammar for "Absolute URI" that has no fragment identifier, but the grammar for "Relative URI" also has no fragment identifier.
I think isAbsolute should just return true if there is a scheme, and false if there isn't.
The spec calls URIs with a scheme and no fragment "Absolute URIs", URIs with a scheme and a fragment "Absolute URI references", URIs with no scheme and no fragment "Relative URIs", and URIs with no scheme and a fragment "Relative URI references". So strictly speaking, if it has a fragment, it isn't a URI. But we include fragments in our URI class, and it really has nothing to do with whether it is relative or absolute. Here is the section, from rfc 2396: (quotes from rfc 3986 are further down)
URI References
The term "URI-reference" is used here to denote the common usage of a resource identifier. A URI reference may be absolute or relative, and may have additional information attached in the form of a fragment identifier. However, "the URI" that results from such a reference includes only the absolute URI after the fragment identifier (if any) is removed and after any relative URI is resolved to its absolute form. Although it is possible to limit the discussion of URI syntax and semantics to that of the absolute result, most usage of URI is within general URI references, and it is impossible to obtain the URI from such a reference without also parsing the fragment and resolving the relative form.
URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
The syntax for relative URI is a shortened form of that for absolute URI, where some prefix of the URI is missing and certain path components ("." and "..") have a special meaning when, and only when, interpreting a relative path. The relative URI syntax is defined in Section 5.
4.1. Fragment Identifier
When a URI reference is used to perform a retrieval action on the identified resource, the optional fragment identifier, separated from the URI by a crosshatch ("#") character, consists of additional reference information to be interpreted by the user agent after the retrieval action has been successfully completed. As such, it is not part of a URI, but is often used in conjunction with a URI.
In rfc 3986, the term URI actually means absolute URI, with a scheme and a possible fragment, and relative URIs are called "relative references" only:
URI-reference is used to denote the most common usage of a resource identifier.
URI-reference = URI / relative-ref
A URI-reference is either a URI or a relative reference. If the URI-reference's prefix does not match the syntax of a scheme followed by its colon separator, then the URI-reference is a relative reference.
The section 4.1 labeled absolute URI actually defines a grammar for the URI without the fragment, and makes the point that schemes should define a grammar matching this, and not include the fragment as part of the scheme-specific grammar for the URI, because the fragment at the end is not-scheme-specific. But example URIs for that scheme should include cases with a fragment.
Another example, where they separated the issue of absolute form and fragment identifiers, is in "If the base URI is obtained from a URI reference, then that reference must be converted to absolute form and stripped of any fragment component prior to its use as a base URI."
So conversion to absolute form does not include the stripping of the fragment - that must be added.