Balzanka / guava-libraries

Automatically exported from code.google.com/p/guava-libraries
Apache License 2.0

Presumably bad media type for JSON #915

GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
The media type for JSON is defined like this:

    public static final MediaType JSON_UTF_8 = new MediaType(APPLICATION_TYPE, "json")
        .withCharset(UTF_8);

I.e. "application/json; charset=utf-8".

However, although JSON is a text format, it belongs to the "application" type 
group, and the "charset" parameter is (AFAIK) only defined for "text" types.

RFC 4627 says this in "6. IANA Considerations":

   The MIME media type for JSON text is application/json.

   Type name: application

   Subtype name: json

   Required parameters: n/a

   Optional parameters: n/a

   Encoding considerations: 8bit if UTF-8; binary if UTF-16 or UTF-32

      JSON may be represented using UTF-8, UTF-16, or UTF-32.  When JSON
      is written in UTF-8, JSON is 8bit compatible.  When JSON is
      written in UTF-16 or UTF-32, the binary content-transfer-encoding
      must be used.

Ergo, the charset parameter must be dropped from the constant.

The same might apply to "application/javascript" ("text/javascript" exists, but 
is considered obsolete), though I didn't check that.

Original issue reported on code.google.com by j...@nwsnet.de on 1 Mar 2012 at 5:04

GoogleCodeExporter commented 9 years ago
I disagree with your assessment that it _must_ be dropped.  RFC 2046 states 
that "other media types than subtypes of "text" might choose to employ the 
charset parameter as defined here," which indicates that there is no 
restriction on the presence of the charset parameter on application types.  
Additionally, RFC 2045 states that "MIME implementations must ignore any 
parameters whose names they do not recognize."  So, it is not reasonable to 
assume that there is any harm being done by its presence.
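
To illustrate the RFC 2045 point, here is a minimal sketch of a receiver that matches on type/subtype and silently drops any parameter it does not recognize. The class and method names are hypothetical (they are not part of Guava), and the parsing is simplified: it does not handle quoted parameter values that contain semicolons.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Guava or any RFC.
final class LenientContentType {

    /** Returns the lower-cased "type/subtype" part, dropping all parameters. */
    static String typeOf(String header) {
        int semicolon = header.indexOf(';');
        String type = semicolon < 0 ? header : header.substring(0, semicolon);
        return type.trim().toLowerCase(Locale.ROOT);
    }

    /** Returns only the parameters whose names appear in {@code recognized}. */
    static Map<String, String> recognizedParameters(String header, Set<String> recognized) {
        Map<String, String> result = new LinkedHashMap<>();
        String[] parts = header.split(";");
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq < 0) {
                continue; // malformed parameter, skipped in this sketch
            }
            String name = parts[i].substring(0, eq).trim().toLowerCase(Locale.ROOT);
            String value = parts[i].substring(eq + 1).trim();
            if (recognized.contains(name)) {
                result.put(name, value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String header = "application/json; charset=utf-8";
        // A consumer that knows no parameters at all still sees plain JSON.
        System.out.println(typeOf(header));
        System.out.println(recognizedParameters(header, Set.of()));
    }
}
```

A consumer written this way treats "application/json; charset=utf-8" and "application/json" identically, which is why the extra parameter should be harmless to conforming implementations.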

The charset parameter is there because browsers that attempt to sniff the 
charset when it's not present are vulnerable to certain types of exploits.  So, 
we have defaulted to adding the charset to any media type that is likely to be 
served to and interpreted by a browser.  Without any evidence that this is 
actually incompatible with existing code/services, I'm going to leave it alone.

Finally, if you truly do need that media type without the parameter, the 
withoutParameters() method should do the trick.
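
For example, a short sketch of stripping the parameter, assuming Guava is on the classpath (the wrapper class and method names are made up for illustration):

```java
import com.google.common.net.MediaType;

// Hypothetical demo class; only MediaType itself comes from Guava.
final class JsonMediaTypeDemo {

    /** Returns the JSON media type with every parameter, including charset, removed. */
    static MediaType plainJson() {
        return MediaType.JSON_UTF_8.withoutParameters();
    }

    public static void main(String[] args) {
        System.out.println(MediaType.JSON_UTF_8); // with the charset parameter
        System.out.println(plainJson());          // type and subtype only
    }
}
```

MediaType instances are immutable, so withoutParameters() returns a new value and leaves the JSON_UTF_8 constant itself unchanged.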

Original comment by gak@google.com on 2 Mar 2012 at 5:13

GoogleCodeExporter commented 9 years ago
OK, your justification for adding the charset seems real-worldy enough for me 
after I found http://bugs.cometd.org/browse/COMETD-55 (via a similar issue I 
reported for i-jetty at http://code.google.com/p/i-jetty/issues/detail?id=52).

Still, I believe that implementations should be JSON-aware and use the correct 
default charset (UTF-8) rather than falling back to some global default.
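
As a rough sketch of what "JSON-aware" could mean: RFC 4627, section 3, notes that a JSON text starts with two ASCII characters, so the pattern of zero octets in the first four bytes reveals whether it is UTF-8, UTF-16, or UTF-32. The class name below is hypothetical, and treating inputs shorter than four bytes as UTF-8 is a simplifying assumption of mine, not something the RFC prescribes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the RFC 4627 section 3 heuristic.
final class JsonEncodingSniffer {

    /** Guesses the encoding of a JSON text from the zero octets in its first four bytes. */
    static Charset detect(byte[] json) {
        if (json.length < 4) {
            return StandardCharsets.UTF_8; // assumption: too short to tell, use the default
        }
        boolean z0 = json[0] == 0;
        boolean z1 = json[1] == 0;
        boolean z2 = json[2] == 0;
        boolean z3 = json[3] == 0;
        if (z0 && z1 && z2 && !z3) {
            return Charset.forName("UTF-32BE"); // 00 00 00 xx
        }
        if (!z0 && z1 && z2 && z3) {
            return Charset.forName("UTF-32LE"); // xx 00 00 00
        }
        if (z0 && !z1 && z2 && !z3) {
            return StandardCharsets.UTF_16BE;   // 00 xx 00 xx
        }
        if (!z0 && z1 && !z2 && z3) {
            return StandardCharsets.UTF_16LE;   // xx 00 xx 00
        }
        return StandardCharsets.UTF_8;          // xx xx xx xx
    }

    public static void main(String[] args) {
        byte[] body = "{\"a\":1}".getBytes(StandardCharsets.UTF_16LE);
        System.out.println(detect(body));
    }
}
```

With something like this, a server or client would not need a charset parameter (or a global default) to decode an application/json body.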

Original comment by j...@nwsnet.de on 5 Mar 2012 at 9:30

GoogleCodeExporter commented 9 years ago
This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:14

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 3 Nov 2014 at 9:08