gaob13 / kryo

Automatically exported from code.google.com/p/kryo
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

Kryo makes pointless allocations when serializing private primitive fields #130

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Serialize an object with private boolean fields

What is the expected output? What do you see instead?
There should be no new instances of Boolean on the heap - the two cached 
instances should always be reused. Instead, Kryo allocates new instances of 
Boolean.

What version of the Kryo are you using?
This occurs on the latest version in source control as well as 2.21.

Please provide any additional information below.

If you use Field.getBoolean() on a primitive boolean field and cast to Boolean, 
the returned object will be one of the two cached Boolean objects already in 
the heap. However, if you use Field.get() and cast to Boolean, the returned 
object will actually be a brand new Boolean instance. This behavior is 
demonstrated in the following gist (the test passes): 
https://gist.github.com/jordanlewis/6413827

Kryo falls back on Field.get() in FieldSerializer.java when it can't use ASM or 
Unsafe, such as in the case of private fields. Ideally, it would instead use 
Field.getBoolean (or getLong, etc) for fields of primitive type. Instead, it 
ends up creating a lot of pointless garbage because of this reflection quirk.

I wrote a patch that fixes ObjectField to detect the type of the field before 
serializing, and to use getBoolean/getLong/etc when appropriate. ObjectFields 
now take the type of the field that they represent in their constructor to 
facilitate the correct behavior.

Original issue reported on code.google.com by jordanth...@gmail.com on 2 Sep 2013 at 3:19

Attachments:

GoogleCodeExporter commented 8 years ago
Thanks for reporting it. Your observation is correct and your proposed solution 
makes sense.
I committed changes into trunk that improve the situation. It is similar to 
what you propose, but avoids dispatching on type at each get operation, because 
it is very costly on the hot-path.

BTW, could you give us a hint in which environments you cannot use ASM or 
Unsafe? Is it on Android?

> Kryo falls back on Field.get() in FieldSerializer.java when it can't use ASM 
or Unsafe, such as in the 
> case of private fields. 

Hmm. IMHO, Unsafe does not have any problems with private fields. Have you 
tried it? If it does fail, could you provide a test-case?

Please let us know if my fix solves the problem you reported, so that we can 
close this issue.

-Leo

Original comment by romixlev on 3 Sep 2013 at 8:59

GoogleCodeExporter commented 8 years ago
Thanks Leo! The fix looks good.

I'm not on Android - just regular old Sun JVM. I might be incorrect about 
Unsafe not working on private fields. You would be more qualified to understand 
this point. What I noticed is just that serializing a class (using default Kryo 
settings) with private boolean fields causes this behavior - so I used a 
debugger and found out that FieldSerializer was being used on that private 
field, which from what I can tell means that neither ASM nor Unsafe were being 
used.

Original comment by jordanth...@gmail.com on 3 Sep 2013 at 1:54

GoogleCodeExporter commented 8 years ago
When I try to reproduce on 2.21, the field always uses the reflection-based 
serializer. On trunk, the Unsafe serializer is used. There must have been an 
improvement between 2.21 and trunk that causes even private boolean fields like 
I'm describing to use the Unsafe field serializer.

Oh well! When will you cut the next release? It would be great to get this 
improvement into my application. By the way, I noticed this issue when 97 
million instances of Boolean were floating around on my heap!

Original comment by jordanth...@gmail.com on 3 Sep 2013 at 2:20

GoogleCodeExporter commented 8 years ago
Yes, Unsafe-based approach will be officially introduced in the 2.22 release. 
And it first appeared in the 2.22-SNAPSHOT; IIRC.

Regarding the plans for a next release (2.22):
 Since this release introduces so many new features, especially Unsafe-based FieldSerializer and Unsafe-based streams, we are very cautious about testing it properly. The more feedback we get from user, the more confident we are.

We also fixed a lot of bugs recently, but there are still a few issues to be 
solved.

P.S. If you need more performance, you may want to try UnsafeInput/UnsafeOutput 
classes from the current snapshot instead of the usual Input/Output. Depending 
on your classes, you may experience an order of magnitude speed-up.

Original comment by romixlev on 3 Sep 2013 at 3:03

GoogleCodeExporter commented 8 years ago

Original comment by romixlev on 4 Sep 2013 at 12:08