andrewbissada / kryo

Automatically exported from code.google.com/p/kryo
BSD 3-Clause "New" or "Revised" License

Serialization of Class.class fails #113

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

I am evaluating your library. So far it works nicely; I just have one 
small problem.

I have one class (let us call it "MyClass") that contains the following field:

    private final Class<?> contentType;

I have problems when serializing that field. If I create a serializer such as:

    public static class MyClassSerializer extends Serializer<MyClass> {
        @Override
        public void write(final Kryo kryo, final Output output, final MyClass object) {
            ...
            kryo.writeObject(output, object.contentType);
            ...
        }

        @Override
        public MyClass read(final Kryo kryo, final Input input, final Class<MyClass> type) {
            ...
            kryo.readObject(input, Class.class);
            ...
        }
    }

it does not work properly, because java.lang.Integer is deserialized as the 
primitive "int". The workaround that I used to make it work is the following:

    public static class MyClassSerializer extends Serializer<MyClass> {
        @Override
        public void write(final Kryo kryo, final Output output, final MyClass object) {
            ...
            output.writeString(object.contentType.getName());
            ...
        }

        @Override
        public MyClass read(final Kryo kryo, final Input input, final Class<MyClass> type) {
            ...
            Class tmpContentType;
            try {
                tmpContentType = Class.forName(input.readString());
            } catch (final ClassNotFoundException e) {
                throw new RuntimeException(e);
            }
            ...
        }
    }

It works pretty well, although it is not so elegant. So I wrote a custom 
serializer for Class.class:

    public final static Serializer<Class<?>> SIMPLE_CLASS_SERIALIZER = new Serializer<Class<?>>() {
        @Override
        public void write(final Kryo kryo, final Output output, final Class<?> object) {
            output.writeString(output.toString());
        }

        @Override
        public Class<?> read(final Kryo kryo, final Input input, final Class<Class<?>> type) {
            try {
                return Class.forName(input.readString());
            } catch (final Exception e) {
                throw new RuntimeException(e);
            }
        }
    };

but the following code does not work as expected:

    public static class MyClassSerializer extends Serializer<MyClass> {
        @Override
        public void write(final Kryo kryo, final Output output, final MyClass object) {
            ...
            kryo.writeObject(output, object.contentType, SIMPLE_CLASS_SERIALIZER);
            ...
        }

        @Override
        public MyClass read(final Kryo kryo, final Input input, final Class<MyClass> type) {
            ...
            kryo.readObject(input, Class.class, SIMPLE_CLASS_SERIALIZER);
            ...
        }
    }

Indeed I get the following exception:

        java.lang.ClassNotFoundException: com.esotericsoftware.kryo.io.Output@5b10347e

Obviously I never put that class in my "contentType" field. Note also that the 
previous workaround works well.
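
A note on the exception above: the name in the message, `com.esotericsoftware.kryo.io.Output@5b10347e`, has the `ClassName@hash` shape that `Object.toString()` produces by default. This suggests that the `write` method of `SIMPLE_CLASS_SERIALIZER` passes `output.toString()` to `writeString` where `object.getName()` was probably intended. The following is a minimal sketch in plain Java, without Kryo, of why such a string cannot be resolved by `Class.forName`. `FakeOutput` and `NameResolution` are hypothetical names used only for illustration.

```java
/** Hypothetical stand-in for a class that keeps Object's default toString(). */
final class FakeOutput {
}

final class NameResolution {
    /** Returns the class for the given name, or null if Class.forName cannot find it. */
    static Class<?> resolveOrNull(final String name) {
        try {
            return Class.forName(name);
        } catch (final ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(final String[] args) {
        // Default toString() yields "<class name>@<hex hash>", which is not a class name.
        final String fromToString = new FakeOutput().toString();
        // getName() yields the binary name that Class.forName expects.
        final String fromGetName = Integer.class.getName();

        System.out.println(fromToString + " -> " + resolveOrNull(fromToString));
        System.out.println(fromGetName + " -> " + resolveOrNull(fromGetName));
        // Names of primitive classes cannot be resolved by Class.forName either.
        System.out.println(int.class.getName() + " -> " + resolveOrNull(int.class.getName()));
    }
}
```

If this reading is right, replacing `output.toString()` with `object.getName()` in `write` should remove the exception. A name-based serializer of this kind would still not round-trip primitive classes such as `int.class`, since `Class.forName` does not resolve their names.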

What should I do?

Thanks

Alessandro

What version of Kryo are you using?

2.21

Original issue reported on code.google.com by alessand...@gmail.com on 22 May 2013 at 9:47

GoogleCodeExporter commented 9 years ago
Hi Alessandro,

Thanks for your bug report. Before I can look deeper into it, could you please 
answer the following questions:
1) Do you register MyClassSerializer with your Kryo instance? If not, have you 
tried to do it?
2) Do you have a self-contained test case that reproduces this problem?

Thanks,
  Leo

Original comment by romixlev on 23 Jul 2013 at 9:07

GoogleCodeExporter commented 9 years ago
Here is a test case:

    import com.esotericsoftware.kryo.Kryo;
    import com.esotericsoftware.kryo.Serializer;
    import com.esotericsoftware.kryo.io.Input;
    import com.esotericsoftware.kryo.io.Output;

    public class KryoTest {

        final Class<?> clazz;

        static class KryoTestSerializerWrong extends Serializer<KryoTest> {
            @Override
            public void write(final Kryo kryo, final Output output, final KryoTest object) {
                kryo.writeObject(output, object.clazz);
            }

            @Override
            public KryoTest read(final Kryo kryo, final Input input, final Class<KryoTest> type) {
                final Class<?> c = kryo.readObject(input, Class.class);
                return new KryoTest(c);
            }
        }

        static class KryoTestSerializerCorrect extends Serializer<KryoTest> {
            @Override
            public void write(final Kryo kryo, final Output output, final KryoTest object) {
                output.writeString(object.clazz.getName());
            }

            @Override
            public KryoTest read(final Kryo kryo, final Input input, final Class<KryoTest> type) {
                final Class<?> c;
                try {
                    c = Class.forName(input.readString());
                } catch (final ClassNotFoundException e) {
                    throw new RuntimeException(e);
                }
                return new KryoTest(c);
            }
        }

        KryoTest(final Class<?> clazz) {
            this.clazz = clazz;
        }

        public static void main(final String[] args) {
            final Kryo kryo = new Kryo();
            final Output out = new Output(1024);

            final KryoTest test1 = new KryoTest(String.class);
            final KryoTest test2 = new KryoTest(Integer.class);
            final KryoTest test3 = new KryoTest(Double.class);

            final KryoTestSerializerWrong serializerWrong = new KryoTestSerializerWrong();
            kryo.writeObject(out, test1, serializerWrong);
            kryo.writeObject(out, test2, serializerWrong);
            kryo.writeObject(out, test3, serializerWrong);

            final KryoTestSerializerCorrect serializerCorrect = new KryoTestSerializerCorrect();
            kryo.writeObject(out, test1, serializerCorrect);
            kryo.writeObject(out, test2, serializerCorrect);
            kryo.writeObject(out, test3, serializerCorrect);

            final Input in = new Input(out.getBuffer());
            KryoTest res;

            res = kryo.readObject(in, KryoTest.class, serializerWrong);
            System.out.println(res.clazz);
            res = kryo.readObject(in, KryoTest.class, serializerWrong);
            System.out.println(res.clazz);
            res = kryo.readObject(in, KryoTest.class, serializerWrong);
            System.out.println(res.clazz);
            res = kryo.readObject(in, KryoTest.class, serializerCorrect);
            System.out.println(res.clazz);
            res = kryo.readObject(in, KryoTest.class, serializerCorrect);
            System.out.println(res.clazz);
            res = kryo.readObject(in, KryoTest.class, serializerCorrect);
            System.out.println(res.clazz);
        }
    }

The output that you get is:

    class java.lang.String
    int
    double
    class java.lang.String
    class java.lang.Integer
    class java.lang.Double

whereas it should have been:

    class java.lang.String
    class java.lang.Integer
    class java.lang.Double
    class java.lang.String
    class java.lang.Integer
    class java.lang.Double

Original comment by alessan...@bay31.com on 23 Jul 2013 at 9:33

GoogleCodeExporter commented 9 years ago
Thanks for the test-case. I can confirm that it is a bug. 

It seems that the current implementation of DefaultSerializers.ClassSerializer 
is buggy, and this is what you are hitting.
The problem is that it currently writes the type that is mapped to the Class 
you are trying to write, not the Class itself. So, for Integer.class it detects 
that int.class is mapped to it (this is a default pre-registered mapping) and 
writes out the registration id for int.class. As a result, when you read back 
the serialized representation, Kryo sees the class id of int and deserializes 
it as int.class.
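
The mechanism described above can be illustrated with a toy model in plain Java. This is not Kryo's code: the `ToyRegistry` class, its ids, and the wrapper-to-primitive lookup are assumptions made only to show how writing the id of the mapped primitive loses the distinction between `Integer.class` and `int.class`.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Kryo): a registry that conflates wrapper classes with their primitives. */
final class ToyRegistry {
    private final Map<Class<?>, Integer> classToId = new HashMap<>();
    private final Map<Integer, Class<?>> idToClass = new HashMap<>();
    private final Map<Class<?>, Class<?>> wrapperToPrimitive = new HashMap<>();

    ToyRegistry() {
        // Hypothetical ids; only the primitives are registered, as in the description.
        register(String.class, 1);
        register(int.class, 2);
        register(double.class, 3);
        wrapperToPrimitive.put(Integer.class, int.class);
        wrapperToPrimitive.put(Double.class, double.class);
    }

    private void register(final Class<?> type, final int id) {
        classToId.put(type, id);
        idToClass.put(id, type);
    }

    /** Buggy write: a wrapper is replaced by its mapped primitive before the id lookup. */
    int writeClass(final Class<?> type) {
        final Class<?> mapped = wrapperToPrimitive.getOrDefault(type, type);
        // Only registered classes are handled in this sketch.
        return classToId.get(mapped);
    }

    /** The reader only sees the id, so it cannot tell that a wrapper was written. */
    Class<?> readClass(final int id) {
        return idToClass.get(id);
    }

    public static void main(final String[] args) {
        final ToyRegistry registry = new ToyRegistry();
        System.out.println(registry.readClass(registry.writeClass(String.class)));
        System.out.println(registry.readClass(registry.writeClass(Integer.class)));
        System.out.println(registry.readClass(registry.writeClass(Double.class)));
    }
}
```

In this model `Integer.class` and `int.class` produce the same id, so the read side has no information left to choose between them, which matches the `int` and `double` lines in the reported output.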

I think the proper fix would include something similar to what you do in 
KryoTestSerializerCorrect, i.e. it would need to write class names. In 
particular, it would be required if you want to write classes that are not 
registered in Kryo.

Alternatively, if all the classes that are possible values of clazz are 
registered in Kryo, the serializer could simply write their registered id as an 
integer. A deserializer would do the opposite. 

@Nate: What do you think about the proposed solutions? Which one should we 
take? Or do you see any other options?

Original comment by romixlev on 23 Jul 2013 at 11:49

GoogleCodeExporter commented 9 years ago
Thanks for your kind reply. Since I potentially have classes that are not 
registered, I am currently using the solution implemented in 
KryoTestSerializerCorrect. Obviously, it uses far more bytes than simply 
writing the ID of the registered class, but in my case I need a general 
approach, and I only care about speed, not space.

If you are able to find a more elegant solution, it will certainly be 
appreciated.

Thanks again,

Alessandro

Original comment by alessan...@bay31.com on 23 Jul 2013 at 11:56

GoogleCodeExporter commented 9 years ago
Hi Alessandro,

Just a proposal for your use case: in your custom serializer you keep a mapping 
from classes to non-zero integer ids that you assign dynamically. You also 
keep an array id2class (or a map), where id2class[assigned-id] == mapped class.

In the serializer, the first time you see a class that is not in the map, you 
write:
0, classname, newly assigned unique id
If you see a class that is already in the map, you write:
assigned id

And for deserialization:
If you read 0, you then read a class name and its assigned id and register 
them in the map (if not there yet).
If you read a non-zero value, it is an assigned id, so you just perform a 
lookup in id2class.
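
The scheme above can be sketched in plain Java as follows. This is a minimal sketch under stated assumptions, not a Kryo serializer: it uses `DataOutputStream` and `DataInputStream` instead of Kryo's `Output` and `Input`, and the names `ClassIdWriter`, `ClassIdReader`, and `ClassIdDemo` are hypothetical. Because it resolves names with `Class.forName`, it does not handle primitive classes such as `int.class`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Write side: assigns a fresh non-zero id the first time a class is seen. */
final class ClassIdWriter {
    private final Map<Class<?>, Integer> classToId = new HashMap<>();
    private int nextId = 1; // 0 is reserved as the "new class follows" marker

    void write(final DataOutputStream out, final Class<?> type) throws IOException {
        final Integer known = classToId.get(type);
        if (known != null) {
            out.writeInt(known); // already seen: the id alone is enough
            return;
        }
        final int id = nextId++;
        classToId.put(type, id);
        out.writeInt(0);              // marker
        out.writeUTF(type.getName()); // class name
        out.writeInt(id);             // newly assigned id
    }
}

/** Read side: mirrors the writer and rebuilds the id-to-class table as it goes. */
final class ClassIdReader {
    private final Map<Integer, Class<?>> idToClass = new HashMap<>();

    Class<?> read(final DataInputStream in) throws IOException {
        final int marker = in.readInt();
        if (marker != 0) {
            return idToClass.get(marker); // plain lookup for known ids
        }
        final String name = in.readUTF();
        final int id = in.readInt();
        try {
            final Class<?> type = Class.forName(name);
            idToClass.put(id, type);
            return type;
        } catch (final ClassNotFoundException e) {
            throw new IllegalStateException("Unknown class name: " + name, e);
        }
    }
}

final class ClassIdDemo {
    /** Writes all classes to a buffer, then reads them back in the same order. */
    static List<Class<?>> roundTrip(final List<Class<?>> types) {
        try {
            final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            final DataOutputStream out = new DataOutputStream(bytes);
            final ClassIdWriter writer = new ClassIdWriter();
            for (final Class<?> type : types) {
                writer.write(out, type);
            }
            out.flush();

            final DataInputStream in =
                    new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            final ClassIdReader reader = new ClassIdReader();
            final List<Class<?>> result = new ArrayList<>();
            for (int i = 0; i < types.size(); i++) {
                result.add(reader.read(in));
            }
            return result;
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The writer and the reader must process the same stream from its beginning, since each id is only meaningful after the record that introduced it has been read.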

What do you think? Would it help you?

-Leo

Original comment by romixlev on 23 Jul 2013 at 1:03

GoogleCodeExporter commented 9 years ago
Writing a class name is a bit nasty. I think it could work if the 
ClassSerializer detects the class is a primitive and writes something to 
differentiate between primitive and primitive wrapper. E.g., for int.class it 
always serializes the registered ID for Integer.class, then writes an 
additional byte that is 1 when the primitive should be used instead.
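
As an illustration of this idea, here is a minimal sketch in plain Java. It is not the code that went into Kryo: the `PrimitiveFlagCodec` class, its ids, and the choice to write the flag byte after every class (not only after wrappers and primitives) are assumptions made to keep the example short, and only three classes are registered.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

/** Toy codec: a primitive is stored as its wrapper's id plus a flag byte set to 1. */
final class PrimitiveFlagCodec {
    private static final Map<Class<?>, Class<?>> PRIMITIVE_TO_WRAPPER = new HashMap<>();
    private static final Map<Class<?>, Class<?>> WRAPPER_TO_PRIMITIVE = new HashMap<>();
    private static final Map<Class<?>, Integer> CLASS_TO_ID = new HashMap<>();
    private static final Map<Integer, Class<?>> ID_TO_CLASS = new HashMap<>();

    static {
        pair(int.class, Integer.class);
        pair(double.class, Double.class);
        // Hypothetical ids; only wrappers are registered, never the primitives.
        register(String.class, 1);
        register(Integer.class, 2);
        register(Double.class, 3);
    }

    private static void pair(final Class<?> primitive, final Class<?> wrapper) {
        PRIMITIVE_TO_WRAPPER.put(primitive, wrapper);
        WRAPPER_TO_PRIMITIVE.put(wrapper, primitive);
    }

    private static void register(final Class<?> type, final int id) {
        CLASS_TO_ID.put(type, id);
        ID_TO_CLASS.put(id, type);
    }

    /** Writes the wrapper's id for primitives, followed by 1; other classes get 0. */
    static void write(final DataOutputStream out, final Class<?> type) throws IOException {
        final boolean primitive = type.isPrimitive();
        final Class<?> stored = primitive ? PRIMITIVE_TO_WRAPPER.get(type) : type;
        // Only the classes registered above are supported in this sketch.
        out.writeInt(CLASS_TO_ID.get(stored));
        out.writeByte(primitive ? 1 : 0);
    }

    /** Reads the id, then uses the flag byte to choose between wrapper and primitive. */
    static Class<?> read(final DataInputStream in) throws IOException {
        final Class<?> stored = ID_TO_CLASS.get(in.readInt());
        final boolean primitive = in.readByte() == 1;
        return primitive ? WRAPPER_TO_PRIMITIVE.get(stored) : stored;
    }

    /** Convenience helper: writes one class to a buffer and reads it back. */
    static Class<?> roundTrip(final Class<?> type) {
        try {
            final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            final DataOutputStream out = new DataOutputStream(bytes);
            write(out, type);
            out.flush();
            return read(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the flag in place, `PrimitiveFlagCodec.roundTrip(int.class)` and `PrimitiveFlagCodec.roundTrip(Integer.class)` share an id on the wire but come back as different classes.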

Original comment by nathan.s...@gmail.com on 5 Aug 2013 at 12:28

GoogleCodeExporter commented 9 years ago
I implemented your proposal, Nate. It is now in trunk together with a unit test 
which checks serialization of all primitive classes and all wrapper classes of 
primitive classes.

It also checks that non-primitive, non-wrapper classes and non-registered 
classes (e.g. implicitly registered classes) are properly serialized. 

@allesandro.colantonio: Does it solve your problems? Can we close this issue?

Original comment by romixlev on 22 Aug 2013 at 9:06

GoogleCodeExporter commented 9 years ago
Hi Leo,

I cannot verify your code right now since I am on vacation. I'll do it in a few 
days. But if your implementation works with the test case that I proposed in my 
previous post, then you can definitely close the issue!

Thanks a lot for your support!

Alessandro

Original comment by alessand...@gmail.com on 22 Aug 2013 at 9:54

GoogleCodeExporter commented 9 years ago

Original comment by romixlev on 26 Aug 2013 at 3:16

GoogleCodeExporter commented 9 years ago
Hi,

How do I explicitly register a class in Kryo? Please suggest a method for 
doing this.

Please help me.

Why do I need to explicitly register a class in Kryo?

Original comment by ak3...@gmail.com on 23 Apr 2015 at 4:39

GoogleCodeExporter commented 9 years ago
I have the issue below, and my friends suggested registering the class explicitly.

"For the past 2 days, I’ve been running the same Hive queries. ~80% of the time 
they fail with this KryoException, thrown from the Hive code. 20% of the time 
the same job runs successfully. No code differences between runs.

Yesterday, this same Hive query failed with the KryoException all afternoon. 
Then around 8:00 PM it ran fine. This morning (4/23), I ran the same 
code again, processing 2 rows of data, and the job failed."

Below are the logs :

2015-04-23 10:35:57,893 INFO [main] org.apache.hadoop.hive.ql.exec.Utilities: Deserializing MapWork via kryo
2015-04-23 10:35:58,290 ERROR [main] org.apache.hadoop.hive.ql.exec.Utilities: Failed to load plan: hdfs://nameservice1/tmp/hive-dks0344135/hive_2015-04-23_10-34-25_211_2333815521043174815-4/-mr-10016/b7fe67d9-5471-4e66-bdf4-6280f840f5ec/map.xml
org.apache.hive.com.esotericsoftware.kryo.KryoException: Encountered unregistered class ID: -848874534
Serialization trace:
startTimes (org.apache.hadoop.hive.ql.log.PerfLogger)
perfLogger (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.UnionOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:119)

Original comment by ak3...@gmail.com on 23 Apr 2015 at 4:59

GoogleCodeExporter commented 9 years ago
How do I check the Kryo version? Please help me with this.

Original comment by ak3...@gmail.com on 23 Apr 2015 at 5:32