Memory leak when serializing collections

GoogleCodeExporter commented 8 years ago

When serializing a list of 1900 elements (simple beans with a few String, int 
and Date properties, about 1.5 MB in total), memory consumption exceeds 2 GB 
and a Java OutOfMemoryError occurs. Serializing the same object with just 340 
elements works normally.

Disabling references will cause a stack overflow.

Reproducing: 

I'm trying to serialize an Object[2] array whose first element is an instance 
of DocumentList; its list member is an ArrayList holding 1900 instances of the 
Document class (no large data in there; all 1900 elements occupy a bit under 2 
MB in the database). See the attachment for the DocumentList and Document 
classes. The second element of the Object[] is null. 

Exception thrown: java.lang.OutOfMemoryError: Java heap space
    at com.esotericsoftware.kryo.util.IdentityObjectIntMap.resize(IdentityObjectIntMap.java:410)
    at com.esotericsoftware.kryo.util.IdentityObjectIntMap.putStash(IdentityObjectIntMap.java:227)
    at com.esotericsoftware.kryo.util.IdentityObjectIntMap.push(IdentityObjectIntMap.java:221)
    at com.esotericsoftware.kryo.util.IdentityObjectIntMap.put(IdentityObjectIntMap.java:117)
    at com.esotericsoftware.kryo.util.IdentityObjectIntMap.putStash(IdentityObjectIntMap.java:228)
    at com.esotericsoftware.kryo.util.IdentityObjectIntMap.push(IdentityObjectIntMap.java:221)
    at com.esotericsoftware.kryo.util.IdentityObjectIntMap.put(IdentityObjectIntMap.java:117)
    at com.esotericsoftware.kryo.util.MapReferenceResolver.addWrittenObject(MapReferenceResolver.java:23)
    at com.esotericsoftware.kryo.Kryo.writeReferenceOrNull(Kryo.java:598)
    at com.esotericsoftware.kryo.Kryo.writeObjectOrNull(Kryo.java:539)
    at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:570)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
    at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:568)
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:75)
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:18)
    at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:501)
    at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
    at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:568)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.write(DefaultArraySerializers.java:318)
    at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.write(DefaultArraySerializers.java:293)
    at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:486)

Original issue reported on code.google.com by rok.lena...@gmail.com on 10 Apr 2013 at 1:48

Attachments:

classes.txt

GoogleCodeExporter commented 8 years ago

You say disabling references will cause a stack overflow, but from your 
stack trace it looks like you have references enabled.

Does it happen running against trunk?

Can you post an executable code example?

Original comment by nathan.s...@gmail.com on 11 Apr 2013 at 4:32

GoogleCodeExporter commented 8 years ago

When references are enabled, as they are in the displayed stack trace, I get 
OutOfMemoryError. 

Further investigation showed that with references disabled, an exception is 
thrown somewhere else. I then try to serialize that exception, which has a 
reference to itself in its cause, and that is why I got the stack overflow. So 
that was a false alarm.
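
To illustrate why a self-referencing cause overflows the stack when references are off, here is a minimal sketch using only the JDK. SelfReferencing and CycleDemo are hypothetical names made up for this example, not Kryo classes, and the walk only stands in for what a field-by-field serializer does; it is not Kryo's actual code path.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical bean for illustration: its "cause" field points back at itself,
// the same shape as a java.lang.Throwable whose cause was never initialized.
class SelfReferencing {
    String message;
    SelfReferencing cause;

    SelfReferencing(String message) {
        this.message = message;
        this.cause = this; // cycle of length one
    }
}

class CycleDemo {
    // Field-by-field walk with no reference tracking: recurses forever on a cycle.
    static int naiveDepth(SelfReferencing node) {
        if (node == null) return 0;
        return 1 + naiveDepth(node.cause);
    }

    // Same walk, but every object is remembered by identity and visited once.
    static int trackedDepth(SelfReferencing node, Map<Object, Integer> seen) {
        if (node == null || seen.containsKey(node)) return 0;
        seen.put(node, seen.size()); // assign the next reference id
        return 1 + trackedDepth(node.cause, seen);
    }

    public static void main(String[] args) {
        SelfReferencing e = new SelfReferencing("boom");
        System.out.println(trackedDepth(e, new IdentityHashMap<Object, Integer>()));
        try {
            naiveDepth(e);
        } catch (StackOverflowError expected) {
            System.out.println("naive walk overflowed the stack");
        }
    }
}
```

With identity tracking the walk stops as soon as it meets an object it has already seen, which is roughly the job a reference resolver does during serialization.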

So turning references off now works (serialization succeeds). But with 
references on, it still doesn't work. I'll work on providing a short example.

Original comment by rok.lena...@gmail.com on 11 Apr 2013 at 7:53

GoogleCodeExporter commented 8 years ago

I managed to put together an example. The code in this ZIP file produces an 
OutOfMemoryError with Kryo 2.21, BUT only on IBM JVMs. It works on Sun JVMs. I 
hope some of you have access to an IBM JVM. One big difference between Sun and 
IBM is that on IBM JVMs the "new String(String)" constructor isn't a copy 
constructor: it doesn't allocate a new char array, it only makes a new String 
object (as if calling substring(0)). Thus on IBM JVMs calling "new 
String(aHugeString.substring(2,3))" will keep a reference to the huge char 
array of "aHugeString" through the new String (a potential memory leak). If that 
turns out to be the cause of the problem, use 
"".concat(aHugeString.substring(2,3)) to create a copy of the string with a 
short internal array.
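
A minimal sketch of that copy idiom, using only the JDK. CompactCopy and copyOfRange are hypothetical names for this example. Whether the extra copy is needed at all depends on how the JVM in question implements String, so treat this as an illustration of the suggestion above rather than a confirmed fix.

```java
class CompactCopy {
    // Returns the characters in [begin, end) as a string produced by "".concat(...),
    // the idiom suggested above for getting a copy with its own short backing array
    // on JVMs where substring() shares the parent's char array.
    static String copyOfRange(String huge, int begin, int end) {
        return "".concat(huge.substring(begin, end));
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000000; i++) sb.append((char) ('a' + i % 26));
        String aHugeString = sb.toString();

        // Keep only one character; the large string can then be garbage collected
        // once nothing else refers to it.
        String small = copyOfRange(aHugeString, 2, 3);
        System.out.println(small); // prints "c"
    }
}
```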

Hopefully someone here has an IBM JVM handy.

Original comment by rok.lena...@gmail.com on 22 Apr 2013 at 2:12

Attachments:

KryoBugTest.zip

GoogleCodeExporter commented 8 years ago

Here's also an image of the local variables and stack trace at the moment the 
OutOfMemoryError is thrown.

Note the difference between size, stash capacity, capacity and keyTable length.

Original comment by rok.lena...@gmail.com on 22 Apr 2013 at 2:36

Attachments:

Debug.jpg

GoogleCodeExporter commented 8 years ago

Resolved with a work-around: I made my own ReferenceResolver which uses binary 
search over an array (a middle road between the Map and List resolvers).

Original comment by rok.lena...@gmail.com on 21 May 2013 at 1:46

GoogleCodeExporter commented 8 years ago

ObjectMap is used in many projects. I wonder if there is something wrong there 
on an IBM VM or if it is the strings, as you mentioned.

Original comment by nathan.s...@gmail.com on 11 Jun 2013 at 7:55

GoogleCodeExporter commented 8 years ago

The IBM JVM has a number of subtle differences from the Sun/Oracle JVM; most 
notably, the String(String) constructor is not a copy constructor (though that 
can't be the cause in this case). However, I do not know what causes ObjectMap 
to fail. I am pretty sure my code is correct, as it works as soon as I change 
the JVM from IBM to Sun. I have the misfortune of having to work with IBM 
products, though, so switching the VM wasn't an option.

Original comment by rok.lena...@gmail.com on 12 Jun 2013 at 7:45

GoogleCodeExporter commented 8 years ago

Hi Rok, can you please post your workaround? I'm having the same issue you had.

Original comment by mRi...@gmail.com on 4 Oct 2013 at 8:35

GoogleCodeExporter commented 8 years ago

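<code>import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Date;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.util.Util;

public class BiListReferenceResolver implements 
com.esotericsoftware.kryo.ReferenceResolver {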
    protected Kryo kryo;
    protected final ArrayList<Object> readObjects = new ArrayList<Object>();
    private int[] writtenObjectsHashes = new int[10];
    // objects, then values
    private Object[] writtenObjectsAndValues = new Object[20];
    private int size = 0;
    private int primaryArraySize = 0;
    public void setKryo(Kryo kryo) {
        this.kryo = kryo;
    }

    private static int binarySearch(int[] array, int startIndex, int endIndex,
            int value) {
        int low = startIndex, mid = -1, high = endIndex - 1;
        while (low <= high) {
            mid = (low + high) >>> 1;
            if (value > array[mid]) {
                low = mid + 1;
            } else if (value == array[mid]) {
                return mid;
            } else {
                high = mid - 1;
            }
        }
        if (mid < 0) {
            int insertPoint = endIndex;
            for (int index = startIndex; index < endIndex; index++) {
                if (value < array[index]) {
                    insertPoint = index;
                }
            }
            return -insertPoint - 1;
        }
        return -mid - (value < array[mid] ? 1 : 2);
    }

    public int addWrittenObject(Object object) {
        int id = size;
        int hash = System.identityHashCode(object);
        int idx = binarySearch(writtenObjectsHashes, 0, primaryArraySize, hash);
        if (idx < 0) {
            idx = -(idx + 1);
            if (primaryArraySize == writtenObjectsHashes.length) {
                int[] newHashArray = new int[(writtenObjectsHashes.length * 3) / 2];
                System.arraycopy(writtenObjectsHashes, 0, newHashArray, 0, writtenObjectsHashes.length);
                writtenObjectsHashes = newHashArray;
                Object[] newObjectArray = new Object[newHashArray.length*2];
                System.arraycopy(writtenObjectsAndValues, 0, newObjectArray, 0, writtenObjectsAndValues.length);
                writtenObjectsAndValues = newObjectArray;
            }
            for(int i = writtenObjectsHashes.length-1;i > idx ;i--) {
                int j = 2*i;
                writtenObjectsHashes[i] = writtenObjectsHashes[i-1];
                writtenObjectsAndValues[j] = writtenObjectsAndValues[j-2];
                writtenObjectsAndValues[j+1] = writtenObjectsAndValues[j-1];
            }
            writtenObjectsHashes[idx] = hash;
            writtenObjectsAndValues[2*idx] = object;
            writtenObjectsAndValues[2*idx+1] = id;
            primaryArraySize++;
            size++;
            return id;
        } else {
            idx = 2 * idx; // objects and values array has bigger indexes
            if (writtenObjectsAndValues[idx+1] instanceof Integer) {
                // single slot
                if (writtenObjectsAndValues[idx] == object) {
                    return (Integer)writtenObjectsAndValues[idx+1];
                } else {
                    Object[] keys = new Object[4];
                    int[] values = new int[4];
                    keys[0] = writtenObjectsAndValues[idx];
                    values[0] = (Integer)writtenObjectsAndValues[idx+1];
                    keys[1] = object;
                    values[1] = id;
                    writtenObjectsAndValues[idx] = keys;
                    writtenObjectsAndValues[idx+1] = values; 
                    size++;
                    return id;
                }
            } else {
                // multiple entry slot
                Object[] keys = (Object[])writtenObjectsAndValues[idx];
                for(int i = 0;i < keys.length;i++) {
                    if (keys[i] == object) return ((int[])writtenObjectsAndValues[idx+1])[i];
                    if (keys[i] == null) {
                        keys[i] = object;
                        ((int[])writtenObjectsAndValues[idx+1])[i] = id;
                        size++;
                        return id;
                    } 
                }
                // expand both arrays, then store the new key and its id in the first free slot
                Object[] newKeys = new Object[(keys.length * 3) / 2];
                System.arraycopy(keys, 0, newKeys, 0, keys.length);
                newKeys[keys.length] = object;
                int[] newValues = new int[(keys.length * 3) / 2];
                System.arraycopy((int[])writtenObjectsAndValues[idx+1], 0, newValues, 0, keys.length);
                newValues[keys.length] = id; // without this the new object would map to id 0
                writtenObjectsAndValues[idx] = newKeys;
                writtenObjectsAndValues[idx+1] = newValues;
                size++;
                return id;
            }

        }
    }

    public int getWrittenId(Object object) {
        int hash = System.identityHashCode(object);
        int idx = binarySearch(writtenObjectsHashes, 0, primaryArraySize, hash);
        if (idx < 0) {
            return -1;
        } else {
            idx = 2 * idx; // objects and values array has bigger indexes
            if (writtenObjectsAndValues[idx+1] instanceof Integer) {
                // single slot
                if (writtenObjectsAndValues[idx] == object) {
                    return (Integer)writtenObjectsAndValues[idx+1];
                } else {
                    return -1;
                }
            } else {
                // multiple entry slot
                Object[] keys = (Object[])writtenObjectsAndValues[idx];
                for(int i = 0;i < keys.length;i++) {
                    if (keys[i] == object) return ((int[])writtenObjectsAndValues[idx+1])[i];
                    if (keys[i] == null) return -1;
                }
                return -1;
            }
        }

    }

    @SuppressWarnings("rawtypes")
    public int nextReadId(Class type) {
        int id = readObjects.size();
        readObjects.add(null);
        return id;
    }

    public void setReadObject(int id, Object object) {
        readObjects.set(id, object);
    }

    @SuppressWarnings("rawtypes")
    public Object getReadObject(Class type, int id) {
        return readObjects.get(id);
    }

    public void reset() {
        readObjects.clear();
        size = 0;
        primaryArraySize = 0;
        writtenObjectsAndValues = new Object[20];
        writtenObjectsHashes = new int[10];
    }

    /** Returns false for all primitive wrappers. */
    @SuppressWarnings("rawtypes")
    public boolean useReferences(Class type) {
        return !Util.isWrapperClass(type) && !type.equals(String.class) && !type.equals(Date.class) && !type.equals(BigDecimal.class) && !type.equals(BigInteger.class);
    }

    public void addReadObject(int id, Object object) {
        while (id >= readObjects.size()) readObjects.add(null);
        readObjects.set(id, object);
    }
}
</code>

Then register it on your Kryo instance (k here):
k.setReferenceResolver(new BiListReferenceResolver());

Original comment by rok.lena...@gmail.com on 4 Oct 2013 at 10:01

GoogleCodeExporter commented 8 years ago

Scratch that, that resolver is slow as hell. I made a new one that just uses a 
HashMap.

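import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.util.Util;

public class ReferenceResolver implements 
com.esotericsoftware.kryo.ReferenceResolver {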
    protected Kryo kryo;
    protected Map<Value, Integer> map = new HashMap<Value, Integer>();
    protected final ArrayList<Object> readObjects = new ArrayList<Object>();
    private static class Value {
        private Object val;
        private int hash;
        public Value(Object val) {
            this.val = val;
            this.hash = System.identityHashCode(val);
        }

        @Override
        public int hashCode() {
            return hash;
        }
        @Override
        public boolean equals(Object obj) {
            // identity comparison; the instanceof guard avoids a ClassCastException
            return obj instanceof Value && val == ((Value) obj).val;
        }

    }
    public void setKryo(Kryo kryo) {
        this.kryo = kryo;
    }

    public int addWrittenObject(Object object) {
        Value v = new Value(object);
        Integer i = map.get(v);
        if (i == null) {
            int ret = map.size();
            map.put(v, ret);
            return ret; 
        } else {
            return i;
        }
    }

    public int getWrittenId(Object object) {
        Value v = new Value(object);
        Integer i = map.get(v);
        if (i == null) {
            return -1; 
        } else {
            return i;
        }       
    }

    @SuppressWarnings("rawtypes")
    public int nextReadId(Class type) {
        int id = readObjects.size();
        readObjects.add(null);
        return id;
    }

    public void setReadObject(int id, Object object) {
        readObjects.set(id, object);
    }

    @SuppressWarnings("rawtypes")
    public Object getReadObject(Class type, int id) {
        return readObjects.get(id);
    }

    public void reset() {
        readObjects.clear();
        map.clear();
    }

    /** Returns false for all primitive wrappers. */
    @SuppressWarnings("rawtypes")
    public boolean useReferences(Class type) {
        return !Util.isWrapperClass(type) && !type.equals(String.class) && !type.equals(Date.class) && !type.equals(BigDecimal.class) && !type.equals(BigInteger.class);
    }

    public void addReadObject(int id, Object object) {
        while (id >= readObjects.size()) readObjects.add(null);
        readObjects.set(id, object);
    }
}

Should be around 8 times faster.
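
For comparison, the write-side bookkeeping above can also be sketched with the JDK's java.util.IdentityHashMap, which compares keys with == and so needs no wrapper class. IdentityIdTable is a hypothetical helper name, not part of Kryo, and I have not measured how it behaves on an IBM JVM; it only mirrors the getWrittenId/addWrittenObject/reset logic shown above.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical helper (not a Kryo class): tracks written objects by identity.
class IdentityIdTable {
    private final Map<Object, Integer> ids = new IdentityHashMap<Object, Integer>();

    // Returns the id already assigned to this exact instance, or -1 if unseen.
    int getWrittenId(Object object) {
        Integer id = ids.get(object);
        return id == null ? -1 : id;
    }

    // Assigns the next sequential id to a new instance; returns the existing id otherwise.
    int addWrittenObject(Object object) {
        Integer existing = ids.get(object);
        if (existing != null) return existing;
        int id = ids.size();
        ids.put(object, id);
        return id;
    }

    // Forgets all instances, as the resolver's reset() does between object graphs.
    void reset() {
        ids.clear();
    }
}
```

Two equal but distinct objects get different ids here, while the same instance always maps back to its first id, which is the behaviour the Value wrapper provides above.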

Original comment by rok.lena...@gmail.com on 4 Oct 2013 at 11:38

GoogleCodeExporter commented 8 years ago

Thanks a lot man. Works like a charm!

Original comment by mRi...@gmail.com on 4 Oct 2013 at 12:49

Memory leak when serializing collections #108