URI is used extensively in the API, as argument and return value, in the
implementation as state, and in processors and resolvers.
It shouldn't be. The problem is that java.net.URI, the actual type of those
arguments, return values, and state, requires conformance to the URI
specification, RFC 2396. Unfortunately, a large number of (mostly, but not
entirely) private-use URLs violate RFC 2396, and even the earlier RFC 1738.
They are still resolvable, though, if a URL scheme handler has been written for
them.
This came up with respect to the OSGi/Eclipse 'bundleentry' scheme, for which
Eclipse provides a URL scheme handler in core. There is no specification of the
URL/URI syntax, but from inspection it appears to look like this:
bundleentry://bundleid/path/to/resource
This is, arguably, valid, treating the bundleid as an authority. Unfortunately,
due to the nature of the bundleid (nn.cccnnnnnnn, or something like that), it
looks like a path or a malformed IP address. In any event, it causes URI to
throw URISyntaxException, which means we have no way of falling back to
trying to resolve it: if we can't make it into a URI, then we can't do
anything at all with it.
It's perfectly feasible to write a URL scheme handler that can return a
resource for utterly opaque strings that violate the syntax of RFC 1738, RFC
2396, or any other imaginable specification. (It used to be possible to install
such a handler as the default, so even a missing scheme could get resolved ...
stupid scheme handler tricks.)
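
To make that concrete, here is a minimal sketch of such a handler. Everything in it is hypothetical (the 'opaque' scheme, EchoHandler, OpaqueHandlerDemo); it is not the Eclipse bundleentry handler. It relies on the java.net.URL constructor that takes an explicit URLStreamHandler, which is deprecated in recent JDKs but still available. The point is only that URL, unlike URI, hands the string to the handler without enforcing RFC 2396.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: a handler that "resolves" whatever follows the scheme.
final class OpaqueHandlerDemo {

    // Illustrative handler; a real one would look the string up somewhere.
    static final class EchoHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) {
            return new URLConnection(u) {
                @Override
                public void connect() {
                    connected = true;
                }

                @Override
                public InputStream getInputStream() {
                    // Fake resource: echo the URL's external form back.
                    String body = "resource for " + url.toExternalForm();
                    return new ByteArrayInputStream(
                            body.getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    }

    // Opens the given string through EchoHandler and returns the content.
    static String read(String spec) throws IOException {
        // The explicit-handler constructor bypasses the global handler lookup.
        URL url = new URL(null, spec, new EchoHandler());
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

Calling OpaqueHandlerDemo.read("opaque:<<not a uri>>") returns content, even though new URI("opaque:<<not a uri>>") would throw URISyntaxException for the same string.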
If a (simply) malformed URI is the documentURI, bulletproofing kicks in, not to
preserve information, but to discard it. If it's not a URI, then we just throw
it away.
Note that we *don't* do this for namespaces, because (as everyone involved with
XML is aware) the relevant sentence in the Namespaces in XML specification
should read "A namespace is NOT a URI."
So: we need to go through and turn pretty nearly every instance of 'URI' as an
argument, return value, or state into String, which has the advantage of
matching the general character of the referenced bits.
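
A minimal sketch of what that change could look like, assuming a hypothetical holder type named ResourceRef (the name and its methods are illustrative, not part of the existing API): the identifier is stored as the String it arrived as, and a java.net.URI is produced only on request, so a syntax failure no longer discards the original value.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Optional;

// Hypothetical holder: keeps the identifier verbatim, parses only on demand.
final class ResourceRef {
    private final String raw;

    ResourceRef(String raw) {
        if (raw == null) {
            throw new NullPointerException("raw");
        }
        this.raw = raw;
    }

    // Always available, whatever the string looks like.
    String asString() {
        return raw;
    }

    // Present only when the string satisfies java.net.URI's parser.
    Optional<URI> asUri() {
        try {
            return Optional.of(new URI(raw));
        } catch (URISyntaxException e) {
            return Optional.empty();
        }
    }
}
```

With this shape, a resolver or scheme handler can still be handed asString() when asUri() comes back empty, which is the fallback that a URI-typed signature rules out.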
Original issue reported on code.google.com by aale...@gmail.com on 26 Aug 2013 at 7:29