Open joakime opened 4 days ago
Thanks for this feedback @joakime
Could you elaborate on the limitations of using this parser for REST clients? What kind of valid URLs wouldn't be parsed correctly with this parser in the context of REST clients?
Thanks!
Take, for example, a URL with no hier-part, or no authority.
http:///path.png
The UrlParser parses that as a URL whose host is path.png.
Here's what java.net.URI does ...
$ jshell
| Welcome to JShell -- Version 17.0.11
| For an introduction type: /help intro
jshell> var uu = new URI("http:///path.png")
uu ==> http:///path.png
jshell> uu.getHost()
$2 ==> null
jshell> uu.getPath()
$3 ==> "/path.png"
Here's what java.net.URL does ...
$ jshell
| Welcome to JShell -- Version 17.0.11
| For an introduction type: /help intro
jshell> var uu = new URL("http:///path.png")
uu ==> http:/path.png
jshell> uu.getHost()
$2 ==> ""
jshell> uu.getPath()
$3 ==> "/path.png"
Those built-in parsers, along with the existing URL / URI parsers in various Servlet libraries, follow RFC 3986, under which that is a URL with no authority, just a path.
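To make the RFC 3986 reading concrete, here is a minimal sketch that splits the same URL with the regular expression given in RFC 3986, Appendix B. The class name Rfc3986Split and the split helper are illustrative names only, not part of the JDK or Spring.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Rfc3986Split {
    // Regular expression from RFC 3986, Appendix B.
    // Group 2 = scheme, group 4 = authority, group 5 = path.
    private static final Pattern URI_PARTS = Pattern.compile(
        "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    // Hypothetical helper: returns { scheme, authority, path }.
    static String[] split(String uri) {
        Matcher m = URI_PARTS.matcher(uri);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a URI reference: " + uri);
        }
        return new String[] { m.group(2), m.group(4), m.group(5) };
    }

    public static void main(String[] args) {
        String[] parts = split("http:///path.png");
        System.out.println("scheme    = " + parts[0]);
        System.out.println("authority = \"" + parts[1] + "\"");
        System.out.println("path      = " + parts[2]);
    }
}
```

With this regex the authority comes out as the empty string and the path as /path.png, which matches the java.net.URI and java.net.URL results above: path.png is never treated as a host.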
WHATWG is good if you want to follow browser behaviors, but not good outside of that limited scope.
There are more examples than this one, but the short version is that WHATWG isn't a great choice for general internet behavior: non-browser clients, HTTP hardware, security tooling, caching servers, load balancers, etc.
Many decisions in WHATWG are designed to "clean up" bad input from users, typos and whatnot (e.g. eliminate duplicates, normalize away extra slashes, eliminate whitespace, etc.).
Those are great choices for a browser dealing with HTML and JavaScript, but not appropriate outside of a browser.
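For contrast, here is a rough sketch of how java.net.URI handles that kind of input: it does no cleanup on parse, so extra slashes are kept and stray whitespace is rejected. StrictUriDemo and pathOrNull are made-up names for illustration, and example.com is a placeholder host.

```java
import java.net.URI;
import java.net.URISyntaxException;

class StrictUriDemo {
    // Hypothetical helper: returns the path java.net.URI reports,
    // or null if the constructor rejects the input outright.
    static String pathOrNull(String input) {
        try {
            return new URI(input).getPath();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Duplicate slashes are preserved as-is in the parsed path.
        System.out.println(pathOrNull("http://example.com//a//b"));
        // An unencoded space inside the path is rejected, not stripped.
        System.out.println(pathOrNull("http://example.com/a b"));
        // Leading whitespace is rejected as well, not trimmed.
        System.out.println(pathOrNull(" http://example.com/"));
    }
}
```

The first call returns //a//b unchanged and the other two return null because the constructor throws URISyntaxException. A server, proxy, or security tool sees the URL exactly as it was sent, which is the property that gets lost when a parser silently repairs input.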
Affects: Any version with UrlParser
The recently added org.springframework.web.util.UrlParser is not spec compliant outside of the limited scope of modern browsers. The living URL document at WHATWG is incompatible with the IETF URI spec, Java itself, the Servlet spec, and various other non-browser use cases.
Users of the new UrlParser should not use it for non-browser use cases (e.g. REST clients).