FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0
310 stars 133 forks source link

CBOR loses `Map` entries with specific `long` Map key values (32-bit boundary) #269

Closed Quantum64 closed 3 years ago

Quantum64 commented 3 years ago

The issue only appears to occur with large keys of opposite sign. See the below reproducer.

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.cbor.CBORFactory;

import java.util.HashMap;
import java.util.Map;

public class CBORLongMapReproducer {
    private static final ObjectMapper json = new ObjectMapper();
    private static final ObjectMapper cbor = new ObjectMapper(new CBORFactory());

    public static void main(String[] args) throws Exception {
        Map<Long, String> map = new HashMap<>();
        map.put(4294967296L, "hello");
        map.put(-4294967296L, "world");
        TestData data = new TestData();
        data.setMap(map);

        Map<Long, String> resultCbor = cbor.readValue(cbor.writeValueAsBytes(data), TestData.class).getMap();
        Map<Long, String> resultJson = json.readValue(json.writeValueAsBytes(data), TestData.class).getMap();

        System.out.println("JSON size: " + resultJson.size());
        System.out.println("CBOR size: " + resultCbor.size());
    }

    public static class TestData {
        private Map<Long, String> map;

        public void setMap(Map<Long, String> map) {
            this.map = map;
        }

        public Map<Long, String> getMap() {
            return map;
        }
    }
}

This program produces the following output:

JSON size: 2
CBOR size: 1
cowtowncoder commented 3 years ago

Quick question @Quantum64: which version? (2.12.2 ideally)

Quantum64 commented 3 years ago

It's 2.12.2. The issue only occurs with specific positive/negative key pairs that seem to be relatively rare. I have an algorithm that can generate an arbitrary number of them if you need more. This issue caused a very difficult to diagnose bug in our codebase as you might imagine. :)

cowtowncoder commented 3 years ago

@Quantum64 yes, that sounds like an absolutely nightmare bug from user perspective. Thank you for reporting this: I'll have a look. Without looking I am guessing it'd have to do with auto-detection of smallest unit to use, and asymmetry b/w positive/negative 2-complement numbers (one more negative value than positive). And just speculating, right at boundary of 32-bit unsigned number.

cowtowncoder commented 3 years ago

Oh. And it's Longs as keys... that's interesting. CBOR being different from JSON, Smile etc in allowing native non-String keys (which is a royal PITA to support but I digress :) ).

cowtowncoder commented 3 years ago

I can reproduce this, and looks like this might actually be on generator side (... which is unfortunate). But definitely an edge condition like you say, near by values do not get truncated to ints (leading to first corrupted value, and then mismatch).

cowtowncoder commented 3 years ago

Ok it was two-part problem, actually. Partially there was incomplete fix wrt #30 for long-key case; partly missing handling on generator side.

Fixed for upcoming 2.12.3, but backported in 2.11 branch too: just not sure if and how to release; so far was hoping to avoid releasing anything more from 2.11, but this bug might qualify. That should probably be a micro-patch but that gets bit trickier with multi-project repos.

But first it'd be great if you could validate/verify the fix. I did push

of jackson-dataformat-cbor. Or you could build from 2.11 or 2.12 branch locally.

It'd be great to make sure this covers all cases: I did find couple of other ones beyond 2 values listed and think handling is fixed but better safe than sorry.

Quantum64 commented 3 years ago

I can confirm that we are no longer observing any issues on 2.12.3-SNAPSHOT. Thanks for the quick fix!

cowtowncoder commented 3 years ago

@Quantum64 thank you! I hope to release 2.12.3 within next 2 weeks, depending on whether other issues are found in near future.