FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0
310 stars 133 forks source link

IonValueDeserializer does not handle getNullValue correctly for a missing property #317

Closed atokuzAmzn closed 2 years ago

atokuzAmzn commented 2 years ago

Hello,

Following https://github.com/FasterXML/jackson-dataformats-binary/issues/295, we switched to 2.13 from 2.12 to be able to handle null.struct deserialization correctly but we bumped into a different problem in 2.13

We were trying to deserialize a test input into an entity with an optional IonValue field. We received IndexOutOfBoundsException when this property was missing in the input.

This is the stacktrace of the error:

at ExampleEntityTest.deserialization test without optional data(ExampleEntityTest.groovy:219)
Caused by: java.lang.IndexOutOfBoundsException: 0
at com.amazon.ion.impl.lite.IonContainerLite.get_child(IonContainerLite.java:663)
at com.amazon.ion.impl.lite.IonContainerLite.get(IonContainerLite.java:151)
at com.fasterxml.jackson.dataformat.ion.IonParser.getIonValue(IonParser.java:424)
at com.fasterxml.jackson.dataformat.ion.IonParser.getEmbeddedObject(IonParser.java:442)
at com.fasterxml.jackson.dataformat.ion.ionvalue.IonValueDeserializer.getNullValue(IonValueDeserializer.java:61)
at com.fasterxml.jackson.dataformat.ion.ionvalue.IonValueDeserializer.getNullValue(IonValueDeserializer.java:32)
at com.fasterxml.jackson.databind.JsonDeserializer.getAbsentValue(JsonDeserializer.java:350)
at com.fasterxml.jackson.databind.deser.impl.PropertyValueBuffer._findMissing(PropertyValueBuffer.java:203)
at com.fasterxml.jackson.databind.deser.impl.PropertyValueBuffer.getParameters(PropertyValueBuffer.java:158)
at com.fasterxml.jackson.databind.deser.ValueInstantiator.createFromObjectWith(ValueInstantiator.java:288)
at com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:202)
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:518)

The problem occurs when IonValueDeserializer tries to execute the overridden getNullValue for the missing property. In the beginning of this method it tries to get the embedded object from IonParser.

    @Override
    public IonValue getNullValue(DeserializationContext ctxt) throws JsonMappingException {
        try {
            Object embeddedObj = ctxt.getParser().getEmbeddedObject();
    ...

    }

IonParser's getEmbeddedObject method basically checks the token first but since in our case _currToken is END_OBJECT for the missing property so it skips the whole method and fallback to getIonValue().

Now in getIonValue _currToken is assigned to JsonToken.VALUE_EMBEDDED_OBJECT directly and it tries to write reader into an IonList. But since there is nothing to read for a missing property no IonValue is written into IonList.

And since there is no size check, in the next line when it tries to access the first value in the list we get an java.lang.IndexOutOfBoundsException

    @SuppressWarnings("resource")
    private IonValue getIonValue() throws IOException {
        if (_system == null) {
            throw new IllegalStateException("This "+getClass().getSimpleName()+" instance cannot be used for IonValue mapping");
        }
        _currToken = JsonToken.VALUE_EMBEDDED_OBJECT;
        IonList l = _system.newEmptyList();
        IonWriter writer = _system.newWriter(l);
        writer.writeValue(_reader);
        IonValue v = l.get(0);
        v.removeFromContainer();
        return v;
    }

    @Override
    public Object getEmbeddedObject() throws IOException {
        if (_currToken == JsonToken.VALUE_EMBEDDED_OBJECT) {
            switch (_reader.getType()) {
            case TIMESTAMP:
                return _reader.timestampValue();
            case BLOB:
            case CLOB:
                return _reader.newBytes();
            // What about CLOB?
            default:
            }
        }
        return getIonValue();
    }

Can someone confirm if this is a bug or not?

cowtowncoder commented 2 years ago

Can not comment on validity, but it'd be great to have a reproduction (test) to show the issue; explanations are useful but code is more accurate.

atokuzAmzn commented 2 years ago

Here is the test case I have added to DataBindReadTest class which shows the issue

    static class MyBean2 {
        public IonStruct required;
        public IonStruct optional;

        MyBean2(
            @JsonProperty("required") IonStruct required,
            @JsonProperty("optional") IonStruct optional
        ) {
            this.required = required;
            this.optional = optional;
        }
    }

    @Test
    public void testFailWithMissingProperty() throws IOException
    {
        IonSystem ionSystem = IonSystemBuilder.standard().build();
        IonObjectMapper ionObjectMapper = IonObjectMapper.builder(ionSystem)
            .addModule(new IonValueModule())
            .build();

        String input1 = "{required:{}, optional:{}}";
        MyBean2 deserializedBean1 = ionObjectMapper.readValue(input1, MyBean2.class);
        assertEquals(ionSystem.newEmptyStruct(), deserializedBean1.required);
        assertEquals(ionSystem.newEmptyStruct(), deserializedBean1.optional);

        // This deserialization fails with IndexOutOfBoundsException
        String input2 = "{required:{}}";
        MyBean2 deserializedBean2 = ionObjectMapper.readValue(input2, MyBean2.class);
        assertEquals(ionSystem.newEmptyStruct(), deserializedBean2.required);
    }
atokuzAmzn commented 2 years ago

The fix for the problem would be to add safety checks to IonValueDeserializer's overridden getNullValue method:

    @Override
    public IonValue getNullValue(final DeserializationContext ctxt) throws JsonMappingException {
        try {
            final JsonParser parser = ctxt.getParser();
            if (parser.getCurrentToken() == JsonToken.VALUE_NULL) {
                final Object embeddedObj = parser.getEmbeddedObject();
                if (embeddedObj instanceof IonValue) {
                    final IonValue iv = (IonValue) embeddedObj;
                    if (iv.isNullValue()) {
                        return iv;
                    }
                }
            }
            return super.getNullValue(ctxt);
        } catch (IOException e) {
            throw JsonMappingException.from(ctxt, e.toString());
        }
    }

The issue occurs when current token is END_OBJECT. Making sure current token is VALUE_NULL before IonParser is trying to get embedded object would prevent this error.

I am holding back the PR for this issue for now because while I was working on this problem I realized we should add one more check to getNullValue method. But since that use case is completely different from this one I will open a separate Issue for that discussion. And depending on the resolution I can send a PR which fixes both of the issues.

cowtowncoder commented 2 years ago

First of all: thank you for adding a reproduction!

On suggested fix: that code is unfortunately not usable as-is: getNullValue() MUST NOT assume there is active parsing context to check. That will not work reliably and should not be attempted; there might not be a parser, or if there is, it might point to something else when content is being buffered. It might (but might not) point to the null value.

atokuzAmzn commented 2 years ago

But isnt current implementation already assuming we have a parser? https://github.com/FasterXML/jackson-dataformats-binary/blob/2.14/ion/src/main/java/com/fasterxml/jackson/dataformat/ion/ionvalue/IonValueDeserializer.java#L61

Object embeddedObj = ctxt.getParser().getEmbeddedObject();

And actually it goes further and assumes reader has still something to read and calls getEmbeddedObject even when currToken is END_OBJECT.

This is actually the failure scenario we want to prevent, so maybe instead of a positive NULL check we can make a negative END_OBJECT check. This could be a much more reliable way.

Taking into account your comments and thinking along these lines, maybe we could modify the method like this:

    @Override
    public IonValue getNullValue(DeserializationContext ctxt) throws JsonMappingException {
        try {
            final JsonParser parser = ctxt.getParser();
            if (parser != null && parser.getCurrentToken() != JsonToken.END_OBJECT) {
                final Object embeddedObj = parser.getEmbeddedObject();
                if ((embeddedObj instanceof IonValue) && !(embeddedObj instanceof IonNull)) {
                    final IonValue iv = (IonValue) embeddedObj;
                    if (iv.isNullValue()) {
                        return iv;
                    }
                }
            }
            return super.getNullValue(ctxt);
        } catch (IOException e) {
            throw JsonMappingException.from(ctxt, e.toString());
        }
    }
atokuzAmzn commented 2 years ago

I pushed a PR with proposed changes for you to review:https://github.com/FasterXML/jackson-dataformats-binary/pull/319

cowtowncoder commented 2 years ago

@atokuzAmzn Sigh. If it is already making that incorrect assumption then I'm not against keeping it that way. It really should not be done but generally I am ok with "do not make an existing problem bigger" if necessary.

I'll let Ion module owners decide here since I am not the main maintainer at this point (once upon a time decade ago I did write the initial version but all advanced stuff is by Amazon Ion folks).

atokuzAmzn commented 2 years ago

PR for this issue is ready for review:https://github.com/FasterXML/jackson-dataformats-binary/pull/320