spring-projects / spring-framework

Spring Framework
https://spring.io/projects/spring-framework
Apache License 2.0

Add javadoc to org.springframework.web.util.UrlParser to indicate that it should only be used with modern browsers, not anything else #33542

Open joakime opened 4 days ago

joakime commented 4 days ago

Affects: Any with UrlParser

The recently added org.springframework.web.util.UrlParser is not spec compliant outside of the limited scope of modern browsers.

The WHATWG URL Living Standard is incompatible with the IETF URI spec, Java itself, the Servlet spec, and various other non-browser use cases.

Users that want to use the new UrlParser should not be using it for non-browser use cases (eg: REST clients).

bclozel commented 4 days ago

Thanks for this feedback @joakime

Could you elaborate on the limitations of using this parser for REST clients? What kind of valid URLs wouldn't be parsed correctly with this parser in the context of REST clients?

Thanks!

joakime commented 4 days ago

Take, for example, a URL with no hier-part, or no authority.

http:///path.png

The UrlParser sees path.png as the host.

Here's what java.net.URI does ...

$ jshell
|  Welcome to JShell -- Version 17.0.11
|  For an introduction type: /help intro

jshell> var uu = new URI("http:///path.png")
uu ==> http:///path.png

jshell> uu.getHost()
$2 ==> null

jshell> uu.getPath()
$3 ==> "/path.png"

Here's what java.net.URL does ...

$ jshell
|  Welcome to JShell -- Version 17.0.11
|  For an introduction type: /help intro

jshell> var uu = new URL("http:///path.png")
uu ==> http:/path.png

jshell> uu.getHost()
$2 ==> ""

jshell> uu.getPath()
$3 ==> "/path.png"

Those built-in parsers, along with the existing URL / URI parsers in various Servlet libraries, follow the spec in RFC 3986, which parses that as a URL with no authority, just a path.
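To illustrate the RFC 3986 reading without relying on any particular parser, here is a minimal sketch that applies the component-splitting regular expression from Appendix B of RFC 3986 to the same input. The class name Rfc3986Split and its split method are made up for this example; only the regular expression comes from the RFC.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Spring or the JDK.
public class Rfc3986Split {

    // Component-splitting regular expression from RFC 3986, Appendix B.
    // Group 2 = scheme, group 4 = authority, group 5 = path.
    private static final Pattern URI_PARTS = Pattern.compile(
            "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    /** Returns {scheme, authority, path}; a component that is absent is null. */
    public static String[] split(String input) {
        Matcher m = URI_PARTS.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not splittable: " + input);
        }
        return new String[] { m.group(2), m.group(4), m.group(5) };
    }

    public static void main(String[] args) {
        String[] parts = split("http:///path.png");
        System.out.println("scheme=" + parts[0]);             // scheme=http
        System.out.println("authority=\"" + parts[1] + "\""); // authority=""
        System.out.println("path=" + parts[2]);               // path=/path.png
    }
}
```

The authority comes back as an empty string and the whole of /path.png stays in the path, which matches what java.net.URL reports above.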

WHATWG is good if you want to follow browser behavior, but not outside of that limited scope.

There are more examples than this one, but WHATWG isn't a great choice for general internet behavior: non-browser clients, HTTP hardware, security tooling, caching servers, load balancers, etc.

joakime commented 4 days ago

There are many decisions in WHATWG that are designed to "clean up" bad input from users, typos and whatnot (e.g. eliminate duplicates, normalize away extra slashes, eliminate whitespace, etc.).

Those are great choices for a browser dealing with HTML and JavaScript, but not appropriate outside of a browser.
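For contrast, here is a small sketch of how java.net.URI handles two of those cases when no normalization is requested: a duplicate slash in the path is kept as-is, and unescaped whitespace is rejected rather than cleaned up. The class UriStrictness, its method names, and the example.com URLs are invented for this example; only java.net.URI is a real API.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class for illustration only.
public class UriStrictness {

    /** Returns the path exactly as java.net.URI parsed it; normalize() is not called. */
    public static String rawPath(String input) {
        return URI.create(input).getRawPath();
    }

    /** Returns true if java.net.URI accepts the input without any cleanup. */
    public static boolean accepts(String input) {
        try {
            new URI(input);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The duplicate slash survives parsing.
        System.out.println(rawPath("http://example.com/a//b.png")); // /a//b.png

        // Unescaped whitespace is a syntax error, not something to strip.
        System.out.println(accepts("http://example.com/a b.png"));  // false
        System.out.println(accepts(" http://example.com/"));        // false
    }
}
```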