msgpack / msgpack-java

MessagePack serializer implementation for Java / msgpack.org[Java]
http://msgpack.org/
Apache License 2.0
1.4k stars 320 forks source link

Support efficient access to MapValue keys and values #808

Closed translatenix closed 3 months ago

translatenix commented 4 months ago

Goals:

  1. Efficiently iterate over a MapValue. Currently, MapValue can be iterated over by using one of the following methods:

    • MapValue.entrySet(), which allocates a Map.Entry object for every entry.
    • MapValue.getKeyValueArray(), which makes a defensive copy of the array.

    Iterating over MapValue.map() is no better than iterating over MapValue.entrySet().

  2. Efficiently access a value whose entry index is known or can likely be guessed correctly. A common use case is constructing a Java object from a MapValue whose pack order is likely known. Currently, there is no way to benefit from this knowledge:

    
    Value get(Map<Value, Value> map, String key) {
      // this is O(n) and produces garbage, could be O(1) and no garbage
      return map.get(ValueFactory.newString(key));
    }
    record LargeRecord(String foo, int bar, ...);
    var map = mapValue.map();
    var large = new LargeRecord(
      get(map, "foo").asStringValue().asString(), 
      get(map, "bar").asIntegerValue().asInt(), 
      ...
    );

Proposal:

To achieve the goals described above, I propose to add the following methods to MapValue:

Value getKey(int entryIndex);
Value getValue(int entryIndex);

Results:

  1. Iterating over MapValue produces no garbage:
    for (var i = 0; i < mapValue.size(); i++) {
      var key = mapValue.getKey(i);
      var value = mapValue.getValue(i);
    }
  2. Accessing a value whose entry index is known or guessed correctly is O(1) instead of O(n);
    Value get(MapValue mapValue, String stringKey, int guessedIndex) {
      var key = mapValue.getKey(guessedIndex);
      return key.isStringValue() && stringKey.equals(key.asStringValue().asString()) ?
        // fast path
        mapValue.getValue(guessedIndex) : 
        // slow path
        mapValue.map().get(ValueFactory.newString(key));
    }
    var large = new LargeRecord(
      get(mapValue, "foo", 0).asStringValue().asString(),
      get(mapValue, "bar", 1).asIntegerValue().asInt(),
      ...
    );

Alternatives:

The first goal can be partially achieved by overriding method java.util.Map.forEach() in class ImmutableMapValueMap. This is PR #809 . The second goal can sometimes be achieved by using Unpacker.unpackMapHeader() etc. instead of Unpacker.unpackValue().asMapValue().

If my proposal is approved, I'd be happy to send a PR.

translatenix commented 3 months ago

Ended up writing my own MessagePack library.