jakartaee / jsonb-api

Jakarta JSON Binding
https://eclipse-ee4j.github.io/jsonb-api/
Other
78 stars 39 forks source link

Revise JsonbNumberFormat support #188

Open aguibert opened 4 years ago

aguibert commented 4 years ago

Currently the @JsonbNumberFormat annotation supports any NumberFormat that can be specified by a java.text.NumberFormat. However, all of the NumberFormats that can be specified in Java are not also valid JSON.

This came up in this Yasson issue where a user pointed out commas are used as the decimal separator in some locales, which is not valid JSON.

See RFC 7156 section 6:

6.  Numbers

   The representation of numbers is similar to that used in most
   programming languages.  A number is represented in base 10 using
   decimal digits.  It contains an integer component that may be
   prefixed with an optional minus sign, which may be followed by a
   fraction part and/or an exponent part.  Leading zeros are not
   allowed.

   A fraction part is a decimal point followed by one or more digits.

   An exponent part begins with the letter E in upper or lower case,
   which may be followed by a plus or minus sign.  The E and optional
   sign are followed by one or more digits.

   Numeric values that cannot be represented in the grammar below (such
   as Infinity and NaN) are not permitted.

      number = [ minus ] int [ frac ] [ exp ]

      decimal-point = %x2E       ; .

      digit1-9 = %x31-39         ; 1-9

      e = %x65 / %x45            ; e E

      exp = e [ minus / plus ] 1*DIGIT

      frac = decimal-point 1*DIGIT

      int = zero / ( digit1-9 *DIGIT )

      minus = %x2D               ; -

      plus = %x2B                ; +

      zero = %x30                ; 0

Proposed solution:

Require that implementations impose extra limitations on the patterns that can be specified in a NumberFormat string, namely blocking characters that are not listed in RFC 7159 section 6, such as the comma.

rmannibucau commented 4 years ago

Hi @aguibert,

Why not just requiring to respect the json grammar (something along https://www.json.org/)?

aguibert commented 4 years ago

both documents point to the same allowed grammar. Am I missing something?

rmannibucau commented 4 years ago

Yes, the spec pattern to explicitly say what is supported vs letting java specific syntax enabled through format class dependency (can't depend on the locale here for instance). So my 2cts were to phrase it as respecting the rfc you quoted rather than using a java reference.

m0mus commented 4 years ago

How Jackson and other projects like Gson and Genson and dealing with it?

Simulant87 commented 4 years ago

As the BigDecimal output is stored as a formatted String in JSON and not a Number (without quotes), maybe we don't have an issue here and the behavior is still correct, also confusing as it is depending on the system default locale when not explicitly set.

Also see my comment in the original ticket here https://github.com/eclipse-ee4j/yasson/issues/319#issuecomment-526816745

aguibert commented 4 years ago

@Simulant87 currently it is true that BigDecimal is output as String and numbers are output as non-Strings, but this difference in type based on number value also causes issues. We will be revising this in https://github.com/eclipse-ee4j/jsonb-api/issues/187 so that we always output as non-Strings

Tomas-Kraus commented 4 years ago

@aguibert Latest JSON RFC is https://tools.ietf.org/rfc/rfc8259.txt :) But grammar definition was not changed. ... I personally don't like an idea of storing JSON number as String, even BigDecimal. RFC section 6 (Numbers) does not define any limits for number size/precission - they are optional (e.g. IEEE 754) - so JSON-B spec/API can define number storage of any size/precission as JSON number. -> #187

I would expect that serialization will always produce RFC 8259 compliant JSON number and things like @JsonbNumberFormat domehow do not fit into this. But OK, if there is some strong enough bussiness requirement, things like this may happen. But they must be always explicitly configured for specific document or field.

On the other hand de-serialization may be a bit foolproof and may properly parse even some invalid formats, like , instead of . when parsing number from JSON string.

Tomas-Kraus commented 4 years ago

Also, while I'm currently working on https://github.com/eclipse-ee4j/yasson/issues/335, I was implementing JSON number deserializer there. And as a result, I would like to make some changes in the spec:

  1. update all RFC 7159 references to RFC 8259
  2. 3.3.4 java.lang.Number Serialization of java.lang.Number instances (if their more concrete type is not defined elsewhere in this chapter) to a JSON string MUST retrieve double value from java.lang.Number.doubleValue() method and convert it to a JSON Number as defined in section This is completely wrong. Big enough Long values will always lose some precission! And what about custom Number implementations? Let's imageine that someone got Complex extends Number. Why not leave it on specific type, e.g. String.valueOf(value) for primitive number types, value.toString() on instances? Also user specific implementation (e.g. Complex) would require serialization into JSON string or even JSON object and this paragraph can mention it as possibility using custom serializer/deserializer.
  3. 3.4.1 java.math.BigInteger, BigDecimal I would expect RFC 8259 compliant JSON number format for serialization as primary requirement here so it looks OK.
  4. 3.6 Untyped mapping Consider an option to deserialize integer numbers as Long and decimal numbers as Double. JSON-P parser API have method to distinguish whether current JSON number token is integer number ir not.
  5. 3.16 Big numbers Remove this section from spec. We shall serialize number types as JSON number by default.