FasterXML / jackson-core

Core part of Jackson that defines Streaming API as well as basic shared abstractions
Apache License 2.0
2.25k stars 769 forks source link

Allow TokenFIlter to skip last elements in arrays #882

Closed pgomulka closed 1 year ago

pgomulka commented 1 year ago

When the last element in array is an array or object and that element is skipped, the FilteringParserDelegate will end up in a loop from which it cannot exit. This means that the rest of the input will be skipped too. This results in incorrect JSON.

This behaviour exists since 2.9+ I believe this is due to https://github.com/FasterXML/jackson-core/commit/7db467ddec7c2899038249b55695b7e44c7b5c3e#diff-f6642caef61e0c403f51a6150ecf45263034fca5002782fd02eacd01e53fe549L694 where the if (gotEnd) conditions where removed. I think this should be added as currently the logic is:

 boolean gotEnd = (_headContext == buffRoot);
                    boolean returnEnd = gotEnd && _headContext.isStartHandled();

                    _headContext = _headContext.getParent();
                    _itemFilter = _headContext.getFilter();

                    if (returnEnd) {
                        return t;
                    }

and that means that it can only exit when _headContext.isStartHandled() is true. For skipped elements this is false.

This can be easily reproduced with this testcase

 @Test
    public void testCustomIncludesWithMultipleObjectsInArrayMissLast() throws Exception {
        var factory = new JsonFactory();
        var baseParser = factory.createParser("{\"foo\":[{\"bar\":\"baz\"},{\"bing\":\"boom\"}]}");
        var filteredParser = getFilteredParser(baseParser, filter("foo", "bar"));
        var writer = new StringWriter();
        var generator = factory.createGenerator(writer);
        Assertions.assertTrue(filteredParser.nextToken().isStructStart());
        generator.copyCurrentStructure(filteredParser);
        generator.flush();
        Assertions.assertEquals("{\"foo\":[{\"bar\":\"baz\"}]}", writer.toString());
        //Expected :{"foo":[{"bar":"baz"}]}
        //Actual   :{"foo":[{"bar":"baz"} 
    }

cc @tvernum who coauthored the fix and the testcases

cowtowncoder commented 1 year ago

Sounds reasonable. I'd be happy to merge, but need CLA first: will add a note on PR.

pgomulka commented 1 year ago

@cowtowncoder should we close this issue manually? or will the team close this once the release is done?

cowtowncoder commented 1 year ago

No, I need to close it since it didn't automatically do so. Closing done when fix merged for specific branch.

Thanks!

pgomulka commented 1 year ago

@cowtowncoder how frequent are jackson releases? I am wondering when the 2.14.2 might be released. We want to fix the problem in Elasticsearch as well, upgrading to 2.14.2 would be easiest. If the 2.14.2 is not due to be released soon we will have to find some workaround.

cowtowncoder commented 1 year ago

It's the usual "when it's ready". Looking at:

https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.14.2

it's bit short list.

But.... potentially I could release 2.14.2 in January, esp. since you have clear need and have provided patches. I do try to work with users and developers in such cases. There are couple of possibly related PRs I'd like to get in first, but I guess there is nothing specifically blocking release.

Oh and maybe I should make sure I get Woodstox 6.5.0 released too.