ellecer / cqengine

Automatically exported from code.google.com/p/cqengine

IndexedCollection is not Serializable #12

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Try to serialize IndexedCollection

What is the expected output? What do you see instead?
Expected IndexedCollection to be serializable. Instead, it does not implement Serializable.

What version of the product are you using? On what operating system?
1.0.3 on Mac OS X

Please provide any additional information below.
Support for serialization would be great. The user could then set up some 
indexes, serialize the collection, and later retrieve it with the indexes 
already in place.

Original issue reported on code.google.com by mitja.kr...@outfit7.com on 5 Feb 2013 at 2:47

GoogleCodeExporter commented 9 years ago
Thanks mitja for the request. I can see how this could be useful.

It's probably the case that most of the in-memory indexes could be rebuilt in 
memory faster than they could be deserialized from disk. So only the objects in 
the collection would need to be serialized/deserialized, and the indexes would 
then be re-added. It sounds like a case for adding a readObject() method.
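
As a rough illustration of the readObject() idea, here is a minimal sketch using plain Java serialization. SelfIndexingNames and its first-letter index are hypothetical stand-ins, not CQEngine classes; the point is only that the index field is transient, so just the objects are written, and the index is rebuilt when the object is read back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an indexed collection; NOT a CQEngine class.
class SelfIndexingNames implements Serializable {
    private static final long serialVersionUID = 1L;

    // Only the stored objects are written to the stream.
    private final List<String> objects = new ArrayList<String>();

    // The index is transient: skipped on write, rebuilt on read.
    private transient Map<Character, List<String>> firstLetterIndex =
            new HashMap<Character, List<String>>();

    void add(String name) {
        objects.add(name);
        indexOne(name);
    }

    // Returns a copy of the names whose first character is c.
    List<String> startingWith(char c) {
        List<String> hits = firstLetterIndex.get(c);
        return hits == null ? new ArrayList<String>() : new ArrayList<String>(hits);
    }

    private void indexOne(String name) {
        if (name.isEmpty()) {
            return;
        }
        char key = name.charAt(0);
        List<String> bucket = firstLetterIndex.get(key);
        if (bucket == null) {
            bucket = new ArrayList<String>();
            firstLetterIndex.put(key, bucket);
        }
        bucket.add(name);
    }

    // Invoked by Java serialization when an instance is deserialized.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // restores 'objects'
        // The transient field is null at this point, so rebuild the index in memory.
        firstLetterIndex = new HashMap<Character, List<String>>();
        for (String name : objects) {
            indexOne(name);
        }
    }

    // Serializes to an in-memory buffer and reads the copy back.
    static SelfIndexingNames roundTrip(SelfIndexingNames original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
                oos.writeObject(original);
            }
            try (ObjectInputStream ois =
                         new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return (SelfIndexingNames) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this shape the caller would not have to re-add indexes by hand after deserialization, at the cost of the collection needing to know how to rebuild them.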

In the meantime, say with version 1.0.3, you could use the 
IndexedCollectionSerializer class below to serialize an indexed collection. 
The main catch is that after deserializing, you *need to re-add the indexes!* 
See the SerializerDemo class below for an example.

I'll think about better serialization support for the next release. Thanks!

--------------------------------------------------------------------------------
package com.googlecode.cqengine;
import com.googlecode.cqengine.index.radix.RadixTreeIndex;
import java.io.File;
public class SerializerDemo {

    public static void main(String[] args) {
        // *************** Build some collection... ***************
        IndexedCollection<Foo> myCollection = CQEngine.newInstance();
        addIndexesToMyCollection(myCollection);

        // Add some objects...
        myCollection.add(new Foo("bar"));
        myCollection.add(new Foo("baz"));

        // *************** Serialize the collection... ***************
        IndexedCollectionSerializer.serialize(myCollection, new File("foo.dat"));

        // *************** Deserialize the collection... ***************
        IndexedCollection<Foo> myDeserializedCollection = IndexedCollectionSerializer.deserialize(new File("foo.dat"));
        // Need to add indexes again to the deserialized collection!!...
        addIndexesToMyCollection(myDeserializedCollection);

        // ************ myDeserializedCollection should now have the same state as myCollection *******
    }

    static void addIndexesToMyCollection(IndexedCollection<Foo> indexedCollection) {
        indexedCollection.addIndex(RadixTreeIndex.onAttribute(Foo.NAME));
    }
}
--------------------------------------------------------------------------------
package com.googlecode.cqengine;
import com.googlecode.cqengine.attribute.Attribute;
import com.googlecode.cqengine.attribute.ReflectiveAttribute;
import java.io.Serializable;
public class Foo implements Serializable {
    public final String name;

    Foo(String name) {
        this.name = name;
    }

    public static final Attribute<Foo, String> NAME = ReflectiveAttribute.forField(Foo.class, String.class, "name");
}
--------------------------------------------------------------------------------
package com.googlecode.cqengine;
import java.io.*;
import java.util.ArrayList;
import java.util.List;
public class IndexedCollectionSerializer {

    public static <O> void serialize(IndexedCollection<O> indexedCollection, File destination) {
        OutputStream os = null;
        try {
            os = new BufferedOutputStream(new FileOutputStream(destination));
            List<O> objectsList = new ArrayList<O>(indexedCollection);
            ObjectOutputStream oos = new ObjectOutputStream(os);
            oos.writeObject(objectsList);
            oos.flush();
        }
        catch (Exception e) {
            throw new IllegalStateException(e);
        }
        finally {
            if (os != null) {
                try { os.close(); } catch (Exception ignore) {}
            }
        }
    }

    public static <O> IndexedCollection<O> deserialize(File source) {
        ObjectInputStream ois = null;
        try {
            ois = new ObjectInputStream(new BufferedInputStream(new FileInputStream(source)));
            @SuppressWarnings({"unchecked", "UnnecessaryLocalVariable"})
            List<O> objectsList = (List<O>) ois.readObject();
            return CQEngine.copyFrom(objectsList);
        }
        catch (Exception e) {
            throw new IllegalStateException(e);
        }
        finally {
            if (ois != null) {
                try { ois.close(); } catch (Exception ignore) {}
            }
        }
    }
}
--------------------------------------------------------------------------------

Original comment by ni...@npgall.com on 6 Feb 2013 at 6:55

GoogleCodeExporter commented 9 years ago
Niall,

First of all, thank you for the very fast response. Unfortunately, this is not 
my use case. I would like to store the IndexedCollection in a Memcache engine 
(which requires it to implement Serializable), not in a file. Since you said 
that serializing the indexes would take too much time, it would be helpful to 
avoid the "CQEngine.copyFrom" step, so that the IndexedCollection could be 
stored directly without copying and retrieved again without copying, even if 
the indexes must be added again.
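
For illustration only, the file-based workaround can be adapted to produce a byte array, which is the kind of value a cache client typically stores. This is a minimal sketch: CollectionBytes, toBytes and fromBytes are hypothetical names, not CQEngine or Memcache API, and the sketch still copies the objects into a list, so it does not remove the copying step discussed above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper; not part of CQEngine or any Memcache client.
class CollectionBytes {

    // Copies the objects into a list and serializes that list to a byte array.
    static <O extends Serializable> byte[] toBytes(Collection<O> objects) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(new ArrayList<O>(objects));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        // The stream was closed (and therefore flushed) by try-with-resources.
        return buffer.toByteArray();
    }

    // Reads the list of objects back; indexes would still have to be re-added afterwards.
    static <O extends Serializable> List<O> fromBytes(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            @SuppressWarnings("unchecked")
            List<O> objects = (List<O>) ois.readObject();
            return objects;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The returned list could then be passed to CQEngine.copyFrom, as in the deserialize method posted earlier, before re-adding the indexes.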

Otherwise, I have to say: great project!

Original comment by mitja.kr...@outfit7.com on 6 Feb 2013 at 7:18

GoogleCodeExporter commented 9 years ago
Usually memcache is used to store a single serialized object per key, whereas 
in this case you would store an entire collection against a single key. Will 
it be retrieved on application startup, or are you planning to retrieve it for 
each request?

Memcache is basically "remote RAM", which is indeed usually faster than local 
disk but slower than local RAM. It might be worth looking at a distributed 
cache which supports local RAM with distributed eviction, instead of going 
across the network every time. Also take a look at Kryo as an alternative to 
Java serialization. It is much faster and does not require classes to implement 
the Serializable interface. I've not tested it with IndexedCollection, but I've 
had good results from it in the past.

Nonetheless, even if Kryo works with IndexedCollection right now, there are 
still a few optimizations to CQEngine which could improve serialization. I will 
add support to serialize the indexed collection without copyFrom in the next 
release.

Original comment by ni...@npgall.com on 8 Feb 2013 at 12:54