Thanks for the report.
HashSet is a Collection, so it normally uses CollectionSerializer. When you do
your compression, you register HashSet with a FieldSerializer. HashSet stores
its data in transient fields, so the data is not serialized. This explains why
your HashSets are empty.
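The transient fields can be checked with plain reflection. The following is a minimal sketch, not Kryo code: the class and method names are made up for illustration, and the exact field names inside HashSet are a JDK implementation detail that may differ between JDKs.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

// Hypothetical helper, not part of Kryo.
class TransientFieldCheck {
    /** Returns the names of the non-static transient fields declared by the given class. */
    static List<String> transientFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (Modifier.isTransient(mods) && !Modifier.isStatic(mods)) {
                names.add(field.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // A field-by-field serializer that skips transient fields
        // would not write any of the fields listed here.
        System.out.println(transientFields(HashSet.class));
    }
}
```

If the list covers everything that holds the set's elements, a field-based serializer has nothing left to write, which matches the empty HashSets you saw.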
When HashSet is registered with CollectionSerializer (which also happens if it
is registered without specifying a serializer), it works properly, but the
serialized bytes are very large compared to Java's built-in serialization. The
reason is that your test puts many of the same objects in the graph more than
once. By default Kryo doesn't handle references. Normally this is done using
ReferenceFieldSerializer, but this mechanism can only handle types that would
otherwise use FieldSerializer. It doesn't handle primitive wrapper references,
Collections, etc. If you take the references out of your test, you will see an
efficient output size.
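To illustrate why repeated objects inflate the output when references are not tracked, here is a rough sketch of the general technique (write an instance once, then write a small id for each repeat). It is not how ReferenceFieldSerializer is implemented. The class name, the marker byte, and the 4-byte id are assumptions made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Kryo code.
class ReferenceSketch {
    /** Writes every string in full, even if the same instance was already written. */
    static byte[] writeWithoutReferences(List<String> values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.size());
            for (String value : values) out.writeUTF(value);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    /** Writes each instance once; a repeat becomes a marker byte plus a 4-byte id. */
    static byte[] writeWithReferences(List<String> values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Identity comparison: only the very same instance counts as a repeat.
        Map<Object, Integer> seen = new IdentityHashMap<>();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.size());
            for (String value : values) {
                Integer id = seen.get(value);
                if (id == null) {
                    seen.put(value, seen.size());
                    out.writeBoolean(false); // first occurrence: full data follows
                    out.writeUTF(value);
                } else {
                    out.writeBoolean(true);  // repeat: only the id follows
                    out.writeInt(id);
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String shared = "a fairly long value that appears many times in the graph";
        List<String> graph = new ArrayList<>();
        for (int i = 0; i < 1000; i++) graph.add(shared);
        System.out.println("without references: " + writeWithoutReferences(graph).length);
        System.out.println("with references:    " + writeWithReferences(graph).length);
    }
}
```

The marker byte makes objects that appear only once slightly larger, which is one reason reference tracking is usually optional.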
Kryo needs better references support. I'll update this issue when this is
implemented.
Further optimizations could be done for this test case by more efficient
handling of Doubles. Right now they are always 8 bytes.
Original comment by nathan.s...@gmail.com
on 11 May 2010 at 6:06
Original comment by nathan.s...@gmail.com
on 11 May 2010 at 6:07
Thanks for the quick reply Nathan!
Your answer explains the problem with using compression. However, I am still
seeing another problem: when enabling compression in the attached smaller
program I get:
Exception in thread "main" java.lang.IllegalArgumentException
at java.nio.Buffer.limit(Buffer.java:267)
at com.esotericsoftware.kryo.Compressor.readObjectData(Compressor.java:92)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:474)
at com.esotericsoftware.kryo.serialize.CollectionSerializer.readObjectData(CollectionSerializer.java:113)
at com.esotericsoftware.kryo.Compressor.readObjectData(Compressor.java:102)
at com.esotericsoftware.kryo.Serializer.readObject(Serializer.java:58)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:493)
at Test1.main(Test1.java:48)
But when no compression is used it runs OK.
------------------
In general I think Kryo looks like a very promising component. The things that
would need to be fixed to make it useful for me (with my current use-case)
would be:
1. Non-recursive reference handling (the fact that Java Serialization results
in stack overflow on deep structures was my original reason to look for an
alternative)
2. Detection and handling of self referencing structures
3. The optimizations and missing features you mentioned in the reply.
By the way, are you interested in bringing more developers on board to help
out with improving Kryo? I will have a good look at the code base, and if I
manage to understand it I may be able to help out...
/JavaFanboy
Original comment by javafan...@gmail.com
on 12 May 2010 at 6:16
Your exception occurs because DeflateCompressor uses a temporary buffer which
is not large enough. Please see the two-argument DeflateCompressor constructor.
The error message is very poor. I will provide a better message.
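For context on the bare IllegalArgumentException at the top of the stack trace: java.nio.Buffer.limit throws it when the requested limit is negative or larger than the buffer's capacity. The sketch below only demonstrates that JDK behavior. The class name and the sizes are made up, and it does not reproduce what the compressor does internally.

```java
import java.nio.ByteBuffer;

// Hypothetical demo of the JDK behavior, not Kryo code.
class BufferLimitDemo {
    /** Returns true if a buffer of the given capacity rejects the given limit. */
    static boolean limitRejected(int capacity, int limit) {
        ByteBuffer buffer = ByteBuffer.allocate(capacity);
        try {
            buffer.limit(limit);
            return false;
        } catch (IllegalArgumentException ex) {
            // Thrown without a helpful message on older JDKs,
            // which is why the trace is hard to interpret.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(limitRejected(2048, 4096)); // limit beyond capacity
        System.out.println(limitRejected(2048, 1024)); // limit within capacity
    }
}
```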
The deflate algorithm can process in chunks, so does not require a large
buffer. The DeflateCompressor does not process in chunks (for easy
implementation) and should be rewritten to be more efficient. There is already
a comment at the top of DeflateCompressor noting this, but I haven't gotten
around to it. :)
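As a sketch of what chunked processing could look like, the following uses java.util.zip.Deflater and Inflater with a small fixed-size temporary buffer, looping until the stream is finished. It is not the DeflateCompressor implementation and not a proposed patch. The class name and the chunkSize parameter are assumptions for this example.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch, not Kryo code.
class ChunkedDeflate {
    /** Compresses the input, draining the deflater through a small temporary buffer. */
    static byte[] compress(byte[] input, int chunkSize) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[chunkSize]; // chunkSize must be positive
        try {
            while (!deflater.finished()) {
                int count = deflater.deflate(chunk);
                out.write(chunk, 0, count);
            }
        } finally {
            deflater.end();
        }
        return out.toByteArray();
    }

    /** Decompresses the input the same way, one small chunk at a time. */
    static byte[] decompress(byte[] compressed, int chunkSize) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[chunkSize];
        try {
            while (!inflater.finished()) {
                int count = inflater.inflate(chunk);
                if (count == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported deflate data");
                }
                out.write(chunk, 0, count);
            }
        } catch (DataFormatException ex) {
            throw new IllegalArgumentException("corrupt deflate data", ex);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = new byte[100_000]; // all zeros, so it compresses well
        byte[] packed = compress(data, 64);
        byte[] unpacked = decompress(packed, 64);
        System.out.println(data.length + " -> " + packed.length + " -> " + unpacked.length);
    }
}
```

With this shape the temporary buffer size only affects how many loop iterations are needed, not whether the data fits.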
There is one issue with your test case that you may not be aware of. Kryo
supports compressing objects that are not the root of the graph. You have an
ArrayList of ArrayLists. You register DeflateCompressor for all ArrayLists,
which means that the bytes for each ArrayList in the graph will be compressed
individually, and then the ArrayList that is the root object will cause all the
bytes to get compressed again. It would be more efficient to only compress the
root object. If you really have the need to compress the root object and not
objects of the same type in the middle of an object graph, maybe the Compressor
class needs a setting to support this.
Sorry to respond to each of your issues by saying, "yeah, Kryo needs to be
improved there"! Your ability to find where Kryo could be augmented is
uncanny. ;)
---------
1. Kryo serializers currently use stack-based recursion because it is easy and
efficient to implement. It is rarely a problem, and generally increasing the
stack size with -Xss is sufficient. If this workaround is unacceptable, e.g. in
a multithreaded environment, Kryo serializers are pluggable. A new one could be
written that uses heap-based recursion. If you take a crack at this, you may
want to model it after FieldSerializer.
2. You may be able to get by using ReferenceFieldSerializer until reference
support for all objects is implemented.
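To make point 1 a bit more concrete, here is a rough sketch of heap-based traversal: the pending nodes live in an explicit ArrayDeque on the heap instead of on the call stack, so depth is limited by memory rather than by -Xss. This is not a Kryo serializer. The Node class, the output format, and all names are invented for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch, not Kryo code.
class HeapTraversal {
    /** A made-up tree node used only for this example. */
    static final class Node {
        final int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    /** Writes the tree in pre-order using an explicit stack instead of method recursion. */
    static byte[] write(Node root) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        Deque<Node> pending = new ArrayDeque<>();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeBoolean(root != null);
            if (root != null) pending.push(root);
            while (!pending.isEmpty()) {
                Node node = pending.pop();
                out.writeInt(node.value);
                // Two flags tell a reader which children follow.
                out.writeBoolean(node.left != null);
                out.writeBoolean(node.right != null);
                // Push right first so the left subtree is written first.
                if (node.right != null) pending.push(node.right);
                if (node.left != null) pending.push(node.left);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    /** Builds a chain of the given depth (at least 1) that would be risky to recurse over. */
    static Node deepChain(int depth) {
        Node root = new Node(0);
        Node current = root;
        for (int i = 1; i < depth; i++) {
            current.left = new Node(i);
            current = current.left;
        }
        return root;
    }

    public static void main(String[] args) {
        byte[] data = write(deepChain(1_000_000));
        System.out.println("wrote " + data.length + " bytes without deep call stacks");
    }
}
```

A real serializer would need the same treatment on the read side, and a general object graph needs more bookkeeping than a tree, but the stack-to-heap move is the same idea.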
I am all for community contributions! Just post a patch and I will merge it
into the core if it is acceptable. I tend to work on Kryo in spurts.
Unfortunately right now I have many other pressing projects, so it may be a
week or two before I can provide it my full attention.
Original comment by n4ted...@yahoo.com
on 12 May 2010 at 7:02
Thanks again for the VERY quick answer Nathan - are you a night owl or like me
located in Europe perhaps?
I will play around with my tests a bit more to see if I can make them work.
As for finding all the areas that need improvement, I suppose the explanation
is that I have done a lot of "special purpose" serialization work over the
years and know what is hard or complicated to do, so those are the things I
look for solutions to in a serialization component that I evaluate. It is then
no miracle that some of the things do not work (yet!), and I only talk about
the tests that DO NOT work :-)
If I find enough time and inspiration I will try to understand how the Kryo
code works, and then I will get back to you with a hint on what I will try to
improve (so we can avoid duplicate efforts).
/JavaFanboy
Original comment by javafan...@gmail.com
on 12 May 2010 at 7:17
I'm a night owl. Possibly more of a vampire.
I hope you'll find Kryo's source pretty straightforward. The Serializer
interface defines some basic methods to read/write objects. Classes
implementing this can be used on their own to do serialization. The Kryo class
acts as a repository, so the various serializers can be used for graphs. There
isn't much more to it.
One architectural choice that may be an issue, since you mentioned extremely
deep graphs, is the use of ByteBuffer rather than streams. There is currently a
thread on the discussion group about this.
Original comment by n4ted...@yahoo.com
on 12 May 2010 at 7:22
Well, buffers may not necessarily be the wrong way to go. Deep structures do
not NEED to produce huge results (if the issues we have talked about with
references are solved).
/JavaFanboy
Original comment by javafan...@gmail.com
on 12 May 2010 at 7:28
I made some additions / changes that allow collections to be serialized only
once and additional references to be saved as a reference. I followed the same
line of implementation as is already used for
FieldSerializer/ReferenceFieldSerializer and created a
ReferenceCollectionSerializer. I refactored out the References class to a
separate file and added a new protected method (as in FieldSerializer) that is
used by ReferenceCollectionSerializer.
Feel free to make whatever changes you feel appropriate to the code to make it
fit in better!
/JavaFanBoy
Original comment by javafan...@gmail.com
on 13 May 2010 at 1:01
Original comment by nathan.s...@gmail.com
on 10 Oct 2010 at 2:51
v2 has proper support for references.
Original comment by nathan.s...@gmail.com
on 17 Apr 2012 at 10:21
Original issue reported on code.google.com by
javafan...@gmail.com
on 11 May 2010 at 6:44