FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0
310 stars 133 forks source link

Unable to deserialize Object with unknown `Timestamp` field #251

Closed mgoertzen closed 8 months ago

mgoertzen commented 3 years ago

Repro code:

import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.dataformat.ion.IonObjectMapper;
import com.fasterxml.jackson.dataformat.ion.ionvalue.IonValueModule;

import java.io.IOException;

public class Test {

    private static class Message {
        private final String message;
        private final Integer count;

        @JsonCreator
        public Message(@JsonProperty("message") String message,
                       @JsonProperty("count") Integer count) {
            this.message = message;
            this.count = count;
        }

        public String getMessage() {
            return message;
        }
    }

    public static void main(String[] args) {
        String ion = "{message: \"Hello, world\", timestamp:2021-03-10T01:49:30.242-00:00}";
        IonObjectMapper mapper = IonObjectMapper.builder()
                .addModule(new IonValueModule())
                .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
                .build();

        Message message;
        try {
            message = mapper.readValue(ion, Message.class);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }

        System.out.println(message.getMessage());
    }
}

Results in:

Exception in thread "main" java.lang.RuntimeException: com.fasterxml.jackson.databind.JsonMappingException: com.fasterxml.jackson.databind.util.TokenBuffer cannot be cast to com.fasterxml.jackson.dataformat.ion.IonGenerator
    at Test.main(Test.java:41)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: com.fasterxml.jackson.databind.util.TokenBuffer cannot be cast to com.fasterxml.jackson.dataformat.ion.IonGenerator
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider._wrapAsIOE(DefaultSerializerProvider.java:509)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:482)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:319)
    at com.fasterxml.jackson.databind.ObjectMapper.writeValue(ObjectMapper.java:3126)
    at com.fasterxml.jackson.databind.util.TokenBuffer.writeObject(TokenBuffer.java:938)
    at com.fasterxml.jackson.databind.util.TokenBuffer._copyBufferValue(TokenBuffer.java:1247)
    at com.fasterxml.jackson.databind.util.TokenBuffer.copyCurrentStructure(TokenBuffer.java:1148)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:514)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1390)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:362)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:195)
    at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:322)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4593)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3548)
    at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3516)
    at Test.main(Test.java:38)
Caused by: java.lang.ClassCastException: com.fasterxml.jackson.databind.util.TokenBuffer cannot be cast to com.fasterxml.jackson.dataformat.ion.IonGenerator
    at com.fasterxml.jackson.dataformat.ion.ionvalue.TimestampSerializer.serialize(TimestampSerializer.java:39)
    at com.fasterxml.jackson.dataformat.ion.ionvalue.TimestampSerializer.serialize(TimestampSerializer.java:29)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:480)
    ... 14 more

This looks to be because by default ObjectMapper will copy unknown fields into a TokenBuffer before figuring out how to handle them; it tries to use TimestampSerializer and passes in the TokenBuffer as the JsonGenerator, but since TokenBuffer is not an instance of IonGenerator the cast at line 39 fails.

I see two potential options:

  1. TimestampSerializer checks the type of JsonGenerator and tries to write out a string or int value if it's not an IonGenerator; OR
  2. IonObjectMapper overrides the BeanDeserializerFactory to provide a BeanDeserializer with ._deserializeUsingPropertyBased overrridden, implementing logic to handle unknown fields using an Ion-based TokenBuffer or other such mechanism instead of the Json TokenBuffer.
cowtowncoder commented 3 years ago

Version?

mgoertzen commented 3 years ago

com.fasterxml.jackson.dataformat:jackson-dataformat-ion:2.12.2 com.fasterxml.jackson.core:jackson-databind:2.12.2

from maven central.

cowtowncoder commented 3 years ago

I don't know too much about this module, but I suspect that a work-around to just pass values as "raw" objects through TokenBuffer might work as well.

For 2.13 I plan on having a mechanism for backends to use custom TokenBuffer instances (see https://github.com/FasterXML/jackson-databind/issues/2989) which seems like it might help here. But not yet implemented.

codebusta commented 3 years ago

Has anyone found a workaround for this one?

NickMohan commented 1 year ago

Any updates/workarounds found for this issue?

seadbrane commented 8 months ago

Any update on this? As it is still not working with 2.16.1 and the only workaround we have been able to find is to create a custom serializer.
However, it seems like the bigger issue is that a serializer should never be called - since it is a deserialization operation, and it very inefficient to re-serialize a value that will never be used - since it is an unknown property anyway.

seadbrane commented 8 months ago

Investigated more, and the behavior is inconsistent and dependent on if and where other fields are in the input data. If you add "count" before the timestamp, it works as expected - but not without it or if you add it after the timestamp. This seems especially problematic.

`` import org.junit.Test;

import com.fasterxml.jackson.annotation.JsonCreator; import com.fasterxml.jackson.annotation.JsonProperty; import com.fasterxml.jackson.core.JsonProcessingException; import com.fasterxml.jackson.databind.DeserializationFeature; import com.fasterxml.jackson.databind.JsonMappingException; import com.fasterxml.jackson.dataformat.ion.IonObjectMapper;

public class TestTimestamp {

private static class Message {
    private final String message;
    private final Integer count;

    @JsonCreator
    public Message(@JsonProperty("message") String message,
            @JsonProperty("count") Integer count) {
        this.message = message;
        this.count = count;
    }
}

@Test
public void testCountFirst() throws JsonMappingException, JsonProcessingException {
    readValue("{count: 10, message: \"Hello, world\", timestamp:2021-03-10T01:49:30.242-00:00}");
}

@Test
public void testCountBeforeTimestamp() throws JsonMappingException, JsonProcessingException {
    readValue("{message: \"Hello, world\", count: 10, timestamp:2021-03-10T01:49:30.242-00:00}");
}

@Test(expected = JsonMappingException.class)
public void testNoCount() throws JsonMappingException, JsonProcessingException {
    readValue("{message: \"Hello, world\", timestamp:2021-03-10T01:49:30.242-00:00}");
}

@Test(expected = JsonMappingException.class)
public void testCountLast() throws JsonMappingException, JsonProcessingException {
    readValue("{message: \"Hello, world\", timestamp:2021-03-10T01:49:30.242-00:00, count:10}");
}

@Test(expected = JsonMappingException.class)
public void testOtherUnknownFieldLast() throws JsonMappingException, JsonProcessingException {
    readValue("{message: \"Hello, world\", timestamp:2021-03-10T01:49:30.242-00:00, unknown:10}");
}

@Test(expected = JsonMappingException.class)
public void testOtherUnknownField() throws JsonMappingException, JsonProcessingException {
    readValue("{message: \"Hello, world\", unknown:10, timestamp:2021-03-10T01:49:30.242-00:00}");
}

@Test(expected = JsonMappingException.class)
public void testCountLastAndOtherUnknownField() throws JsonMappingException, JsonProcessingException {
    readValue("{message: \"Hello, world\", unknown:10, timestamp:2021-03-10T01:49:30.242-00:00, count:100}");
}

@Test
public void testCountAndOtherUnknownField() throws JsonMappingException, JsonProcessingException {
    readValue("{message: \"Hello, world\", count:10, timestamp:2021-03-10T01:49:30.242-00:00, unknown:100}");
}

private static void readValue(String ion) throws JsonMappingException, JsonProcessingException {
    IonObjectMapper mapper = IonObjectMapper.builder()
        .addModule(new IonValueModule())
        .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
        .build();

    mapper.readValue(ion, Message.class);
}

} ``

cowtowncoder commented 8 months ago

I probably did not write part of the code relevant here, but I do know why serialization is invoked: because of the need to buffer incoming content: values that cannot yet be used -- which initially is anything that does not get passed via Construct arguments, such as ignored timestamp field -- are buffered in TokenBuffer. For regular tokens (JsonToken.VALUE_xxx) this is simple operation, but I suspect Ion Timestamp values have some special handling that is problematic here. This also explains why order matters: buffering is not needed (and is optimized out, basically), if all Constructor arguments are received first, bound, and Constructor called. That is not Ion specific. It is sufficient to have test cases for breaking case, fwtw.

cowtowncoder commented 8 months ago

Checked failing test based on original report: that should be sufficient to reproduce and hopefully fix the issue.

seadbrane commented 8 months ago

Thanks for the quick fix!

cowtowncoder commented 8 months ago

@seadbrane Thank you for your help! Turns out it was much simpler than I thought, fortunately.

seadbrane commented 8 months ago

That's similar to the workaround we were going to use, although just seemed odd that the change would really go in the serializer.
This was the only difference in the TokenBuffer case - although not sure it is/was needed, but it is what the precursor to Jackson-dataformat-ion had.

if (provider.isEnabled(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS)) { jsonGenerator.writeNumber(timestamp.getMillis()); } else { jsonGenerator.writeString(timestamp.toString()); }

cowtowncoder commented 8 months ago

It is preferable to use writeEmbeddedObject() to avoid actual serialization/deserialization, but writing as String would also work, as above.

cowtowncoder commented 8 months ago

Also: the reason code tries casting to Ion-specific generator is to use native datatype, via method only available on that generation -- but we can retain value as-is, "embedded", for TokenBuffer. On "read" from TokenBuffer it is then exposes as Ion Timestamp so we retain previously decoded value. This is pattern to use for format-specific extensions; not super clean but works quite well in practice.