micronaut-projects / micronaut-core

Micronaut Application Framework
http://micronaut.io
Apache License 2.0
6.09k stars 1.07k forks source link

spaces as + instead of %20 because UriBuilder uses java.net.URLEncoder.encode #2014

Open sdelamo opened 5 years ago

sdelamo commented 5 years ago

[HTML spec](https://www.w3.org/TR/html4/interact/forms.html#h-17.13.4) says that for application/x-www-form-urlencoded space characters should be replaced by +.

It seems https://tools.ietf.org/html/rfc3986 recommends to use %20 for space characters.

In io.micronaut.http.uri.DefaultUriBuilder::encode we usejava.net.URLEncoder.encodewhich replaces spaces characters with with +.

Maybe we could use add a method to UriBuilder named mediaType(MediaType mediaType).

The default value of mediaType could be APPLICATION_FORM_URLENCODED_TYPE to avoid a breaking change.

If you pass something such as MediaType.TEXT_PLAIN_TYPE we use %20 instead of +

e.g.

  private String encode(String userInfo) {
        try {
           if (mediaType == MediaType. APPLICATION_FORM_URLENCODED_TYPE) {
               return URLEncoder.encode(userInfo, StandardCharsets.UTF_8.name());
           } else {
                return URLEncoder.encode(userInfo, StandardCharsets.UTF_8.name()).replaceAll("\\+", "%20");
           }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("No available charset: " + e.getMessage());
        }
    }

thoughts?

jameskleeh commented 5 years ago

What problem is this causing?

vmpn commented 6 months ago

There are servers that implement only stricter interpretation of RFC3986 only accepting %20 encoding of space and not +

While workaround is trivial of replacing+ with %20 after getting URI string, would be better if the framework supported it directly.

Thanks

rlconst commented 2 months ago

I spent a hour troubleshooting broken url because UriBuilder does not encode in path() and native URLEncoder.encode produces garbage which failed deep inside io.micronaut.http.uri.UriTemplateMatcher#rejectCharacter because it rejects +. For now I hack it with manual replacement https://stackoverflow.com/a/611117/7991462. It definitely should be handled internally