Instant precision should be retained in ObjectMapper#readTree (add `JsonNodeFeature` to force use of `BigDecimal` for `JsonNode` reads)

FasterXML / jackson-modules-java8

Set of support modules for Java 8 datatypes (Optionals, date/time) and features (parameter names)

Apache License 2.0

398 stars 116 forks source link

Instant precision should be retained in ObjectMapper#readTree (add `JsonNodeFeature` to force use of `BigDecimal` for `JsonNode` reads) #307

Open jzheaux opened 3 years ago

jzheaux commented 3 years ago

Describe the bug

ObjectMapper#readValue(String) seems to preserve all nine digits of an Instant's decimal, but ObjectMapper#readTree(JsonParser) does not.

Version information

Jackson 2.12.1

To Reproduce

ObjectMapper mapper = new ObjectMapper();
{
    mapper.registerModule(new JavaTimeModule());
}

Instant issuedAt = Instant.ofEpochSecond(1234567890).plusNanos(123456789);
Map<String, Instant> attributes = new HashMap<>();
{
    attributes.put("iat", issuedAt);
}

@Test
void passes() throws Exception {
    String serialized = mapper.writeValueAsString(attributes);
    Map<String, Instant> deserialized = mapper.readValue(serialized, new TypeReference<>() {});
    assertThat(deserialized.get("iat")).isEqualTo(issuedAt);
}

@Test
void fails() throws Exception {
    String serialized = mapper.writeValueAsString(attributes);
    JsonParser parser = mapper.createParser(serialized);
    JsonNode mapNode = mapper.readTree(parser);
    Map<String, Object> deserialized = new LinkedHashMap<>();
    Iterable<Map.Entry<String, JsonNode>> fields = mapNode::fields;
    for (Map.Entry<String, JsonNode> field : fields) {
        deserialized.put(field.getKey(), mapper.readValue(field.getValue().traverse(mapper), Instant.class));
    }
    assertThat(deserialized.get("iat")).isEqualTo(issuedAt);
}

The second test fails with:

Expecting:
 <2009-02-13T23:31:30.123456700Z>
to be equal to:
 <2009-02-13T23:31:30.123456789Z>
but was not.
Expected :2009-02-13T23:31:30.123456789Z
Actual   :2009-02-13T23:31:30.123456700Z

Additional context

The failure case is derived from a custom deserializer in our project.

Ideally, I'd like readTree to retain the Instant's precision in the same way that ObjectMapper#readValue does when it deserializes a Map that contains Instants, though I'd be happy with advice on how to change my approach in my deserializer instead.

I realize also that USE_BIG_DECIMAL_FOR_FLOATS addresses this issue, but I'd prefer to avoid that, since others import our module.

cowtowncoder commented 3 years ago

Right, Instant is not directly related to the issue here; the problem is that JsonNode defaults to using Double which being 64-binary floating-point number can not represent values with same fidelity. There is no other option than either forcing BigDecimal or somehow make Jackson postpone string-to-number conversion; I think there is another issue for trying to make that happen but no clear way for how that could work -- partly since some formats (most binary formats) will have specific backing number type, and string-to-number conversion is needed for textual formats.

But this is neither available now nor absolutely needed for your use case. You should probably define and use an intermediate POJO for values and make property you need to be of type BigDecimal (unless you want to use JsonParser directly). Either way you can avoid conversion to double and related loss of precision.

jzheaux commented 3 years ago

Since the defined objects contain Map<String, Object> by design, it's not an option to develop a POJO as a deserialization hint.

However, it appears that we may be able to copy the unmodifiable Map to a LinkedHashMap, and then allow ObjectMapper#readValue to do its magic. Or, we could simply change our documentation to state that our Jackson module only supports microsecond precision by default. I've created https://github.com/spring-projects/spring-security/issues/9460 to take a closer look.

cowtowncoder commented 3 years ago

Ok. Use of Map<String, Instant> would do the trick too, but if stronger typing upfront is not possible that won't help much. I do think that in general ability to defer decoding of floating-point values for textual formats, when buffering, is valuable. For 2.13 there is a related issue, FasterXML/jackson-databind#2989, which would make this a bit easier to achieve. Binary formats could have simpler functionality in which deferral/coercion is not needed, but most textual formats would simply buffer textual representation and handle decoding on as-requested basis when "reading" from TokenBuffer using numeric accessors.

Logic-32 commented 4 months ago

Just want to chime in and add that, as of 2.17, this is an issue someone I work with had to find the hard way :(

Out of curiosity, has any testing been done to approximate how much of a performance impact USE_BIG_DECIMAL_FOR_FLOATS has? Since ObjectMapper seems to effectively default to having it on, I can't imagine the impact would be significant. I'm inclined to globally turn it on (within my company) and am mostly looking for an excuse to not do so.

Not that it's worth much, but we made another test (basically identical to the above) to validate:

@Test
public void testInstantsAsDouble() throws IOException {
    ObjectMapper mapper = new ObjectMapper();
    mapper.registerModule(new JavaTimeModule());

    String testJson = "{\"instant\": 1196676934.567891234}";

    // Use ObjectMapper directly
    TestClass result1 = mapper.readValue(testJson, TestClass.class);

    // Use JsonNode as an intermediary
    JsonParser p = mapper.getFactory().createParser(testJson);
    TestClass result2 = p.readValueAsTree().traverse(mapper).readValueAs(TestClass.class);

    assertEquals(result1.getInstant(), result2.getInstant()); // Will fail
}

@Test
public void testInstantsAsBigDecimal() throws IOException {
    ObjectMapper mapper = new ObjectMapper();
    mapper.registerModule(new JavaTimeModule());
    mapper.enable(DeserializationFeature.USE_BIG_DECIMAL_FOR_FLOATS);

    String testJson = "{\"instant\": 1196676934.567891234}";

    // Use ObjectMapper directly
    TestClass result1 = mapper.readValue(testJson, TestClass.class);

    // Use JsonNode as an intermediary
    JsonParser p = mapper.getFactory().createParser(testJson);
    TestClass result2 = p.readValueAsTree().traverse(mapper).readValueAs(TestClass.class);

    assertEquals(result1.getInstant(), result2.getInstant()); // Will pass
}

private static class TestClass {
    private Instant instant;

    public Instant getInstant() {
        return instant;
    }

    public void setInstant(Instant instant) {
        this.instant = instant;
    }
}

cowtowncoder commented 4 months ago

Oh, this is in wrong repo; Instant handling belongs to jackson-modules-java8. Even if BigDecimal buffering is in jackson-databind. Will move.

@Logic-32 I am not sure why you think BigDecimal is used by default -- it is not; only that feature will force doing that. But we do try to retain accurate value via buffering so if accessed as BigDecimal, value should be more accurate than with Double (where precision limited for 64-bit).

Logic-32 commented 4 months ago

Sorry. Perhaps "default" was the wrong word. What I mean is:

On the happy-path that works as expected, the code goes through: https://github.com/FasterXML/jackson-core/blob/8b87cc1a96f649a7e7872c5baa8cf97909cabf6b/src/main/java/com/fasterxml/jackson/core/base/ParserBase.java#L863
- This path explicitly uses BigDecimal throughout the whole process and allows for the retention of nanosecond precision.
- On the sad-path, using JsonNode, you go through: https://github.com/FasterXML/jackson-databind/blob/db623a8a85834f59f5b6331435b80150050ef841/src/main/java/com/fasterxml/jackson/databind/node/TreeTraversingParser.java#L290
- This path allegedly uses a BigDecimal too. However, at this point, the default behavior is to have a DoubleNode which loses the nanosecond precision.

So, what I mean is that ParserBase has the correct behavior during parsing without changing any settings. But, the code that parses a JsonNode requries the extra setting to operate as expected.

With that said, I kinda understand why you moved this case to jackson-modules-java8 however, I'm curious what can be done in/around the InstantDeserializer to allow JsonNode to retrain the correct precision without the USE_BIG_DECIMAL_FOR_FLOATS feature?

cowtowncoder commented 4 months ago

Unfortunately I don't see another solution, aside possibly adding separate JsonNodeFeature to allow forcing of BigDecimal just for JsonNode and not generally.

The issue is that although in theory we could change handling of JsonNode to use "deferred" number value (accessible via JsonParser.getNumberValueDeferred()), JsonNode subtypes are strongly typed with eager values -- and changing that could have major compatibility consequences. This means that when constructing a JsonNode based tree, physical number type will be fixed, and there is no way recover (by re-parsing) from original textual number representation into more precise type, if DoubleNode has been created.

So the question is whether processing itself absolutely requires JsonNode as intermediate type. Then again, binding into Object faces similar problem: the only structure that retains deferred type is TokenBuffer used internally for buffering. It is available externally, but is not very convenient for any actual processing, unlike JsonNode.

Logic-32 commented 4 months ago

Unfortunately I don't see another solution, aside possibly adding separate JsonNodeFeature to allow forcing of BigDecimal just for JsonNode and not generally.

I'm not immediately opposed to this idea. In general, I'd be surprised if anyone expected JsonNode to exhibit different parsing results from what ParserBase offers. Meaning, if this is a feature, I'd like to see it enabled by default so the decision about precision can be made as late as possible.

So the question is whether processing itself absolutely requires JsonNode as intermediate type.

In our case, we absolutely do not need JsonNode. In fact, I lobby against using JsonNode in almost all cases. It has a place but people lean on it too hard imo. With that said, I also can't justify the cost to refactor our current, legacy code to not use it :(

cowtowncoder commented 4 months ago

Yeah the idea of adding the new feature would specifically be to allow forcing BigDecimal only for JsonNode, not other processing (for Number or Object, specifically) -- that is, can leave DeserializationFeature.USE_BIG_DECIMAL_FOR_FLOATS) disabled.

Logic-32 commented 4 months ago

I'm not sure if that'll leave some other corner cases around but am not immediately opposed to it :) Afterall, my arguments against JsonNode in general is that it is not an efficient approach to deserialization. May as well at least make it "accurate" ;)

cowtowncoder commented 4 months ago

JsonNode is interesting: for some use case it is pretty performant (when it is the main abstraction used, and not used as just source for converting to something else, i.e. unnecessary transformations). But it is also easy to overuse.

Be that as it may, the intent with possible addition would be to give more granular configurability -- I do not think there is a way to automatically prevent the problem as per my earlier explanation & backwards compatibility constraints.

cowtowncoder commented 1 month ago

No real update, other than mention the possibility of adding a new JsonNodeFeature.

Also, adding a (failing) test based on code here, PR for, would be something useful.

cowtowncoder commented 1 week ago

Since no one else did it, I added test (in bit simplified form), in case someone has time to dig into this in future (but as noted, can't be fixed by this module alone).