Nov11 / kryo

Automatically exported from code.google.com/p/kryo
BSD 3-Clause "New" or "Revised" License

Avoid two System.arraycopy when serializing+deserializing StringBuilder #72

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
DefaultSerializers.StringBuilderSerializer performs a toString() conversion when 
writing and uses the copy-from-String constructor when reading.

Internally, that implies one System.arraycopy invocation on each side.

I propose the following:
1. Change the Output.writeString signature to accept a CharSequence parameter; 
no other change is necessary inside the method.
2. Change the StringBuilderSerializer.write method to this:
        public void write (Kryo kryo, Output output, StringBuilder object) {
            output.writeString(object);
        }
3. Overload, or add new methods equivalent to, Input.getString and 
Input.readUtf8 to allow generating a StringBuilder directly from the char[] 
buffer instead of a String.
4. Change the StringBuilderSerializer.read method in an equivalent way to avoid 
going through an intermediate String instance.
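
To illustrate steps 1 to 4, here is a minimal standalone sketch that does not use Kryo at all. The class name CharSequenceUtf8Sketch and its write/read methods are hypothetical, and the byte layout (each UTF-16 char encoded on its own in 1 to 3 bytes, no length prefix) is an assumption made for the example, not a claim about Kryo's wire format. It only shows that a CharSequence can be encoded through charAt without calling toString(), and that the bytes can be decoded by appending straight into a StringBuilder without an intermediate String.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch, not Kryo code: encode from a CharSequence and decode
// into a StringBuilder without creating an intermediate String on either side.
class CharSequenceUtf8Sketch {
    /** Encodes each UTF-16 char on its own as 1 to 3 bytes, read via charAt. */
    static byte[] write(CharSequence value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(value.length());
        for (int i = 0, n = value.length(); i < n; i++) {
            int c = value.charAt(i);
            if (c <= 0x007F) {
                out.write(c);
            } else if (c <= 0x07FF) {
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else {
                // Surrogates are encoded individually here (an assumption of this
                // sketch), so supplementary characters are not standard UTF-8.
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }

    /** Decodes bytes produced by write(), appending chars directly to a StringBuilder. */
    static StringBuilder read(byte[] bytes) {
        StringBuilder builder = new StringBuilder(bytes.length);
        int i = 0;
        while (i < bytes.length) {
            int b = bytes[i++] & 0xFF;
            if (b < 0x80) {
                builder.append((char) b);
            } else if ((b & 0xE0) == 0xC0) {
                builder.append((char) (((b & 0x1F) << 6) | (bytes[i++] & 0x3F)));
            } else {
                int b2 = bytes[i++] & 0x3F;
                int b3 = bytes[i++] & 0x3F;
                builder.append((char) (((b & 0x0F) << 12) | (b2 << 6) | b3));
            }
        }
        return builder;
    }

    public static void main(String[] args) {
        StringBuilder original = new StringBuilder("h\u00e9llo \u20ac");
        StringBuilder decoded = read(write(original));
        // Compare contents; StringBuilder itself does not override equals().
        System.out.println(decoded.toString().contentEquals(original));
    }
}
```

A real implementation would also need a length prefix or terminator so the reader knows where the string ends; that is left out here to keep the sketch short.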

By the way, I propose taking the same approach with 
DefaultArraySerializers.CharArraySerializer in order to store and read its 
contents efficiently as UTF-8.

P.S.: I think the same shouldn't be done with StringBufferSerializer, because 
StringBuffer's charAt method is synchronized.

P.S.2: Sorry for my English... :-(

Original issue reported on code.google.com by serverpe...@gmail.com on 18 Jun 2012 at 12:49

GoogleCodeExporter commented 8 years ago
This issue was closed by revision r299.

Original comment by nathan.s...@gmail.com on 18 Jun 2012 at 2:37

GoogleCodeExporter commented 8 years ago
Output#writeString(String) cannot be changed to 
Output#writeString(CharSequence) because the ASCII path needs String#getBytes. 
I added Output#writeString(CharSequence) though, which always writes UTF8. This 
is slightly faster than writeString(). The value can be read using 
Input#readString().

To avoid a new String when reading, I added Input#readStringBuilder(). This is 
slightly faster than readString() for UTF8. When reading ASCII written by 
writeString() or writeAscii(), readStringBuilder() is slightly slower than 
readString(). This isn't much of an issue though, because the most likely case 
is to write a StringBuilder and read a StringBuilder. It is slower because the 
readString() ASCII path is very fast, going directly from bytes to a String. 
readStringBuilder() can't use bytes, so it has to convert to char[]. I 
benchmarked converting to char[] versus using readAscii() and the difference 
was negligible, so I went with the solution that was less code.
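
As a rough illustration of the two ASCII read paths described above, here is a standalone sketch that does not use Kryo. The class AsciiReadSketch and both method names are hypothetical, and the code is an assumption about the general shape of each path, not the code from r299. The String path hands the byte range to a String constructor, while the StringBuilder path first widens each byte into a char[] and then appends it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Kryo code: two ways to turn ASCII bytes into text.
class AsciiReadSketch {
    /** Direct path: the String constructor decodes the byte range itself. */
    static String readAsciiString(byte[] buffer, int offset, int length) {
        return new String(buffer, offset, length, StandardCharsets.US_ASCII);
    }

    /** StringBuilder path: widen each byte into a char[] first, then append the chars. */
    static StringBuilder readAsciiBuilder(byte[] buffer, int offset, int length) {
        char[] chars = new char[length];
        for (int i = 0; i < length; i++) {
            // Input is assumed to be 7-bit ASCII, so masking loses nothing.
            chars[i] = (char) (buffer[offset + i] & 0x7F);
        }
        return new StringBuilder(length).append(chars, 0, length);
    }

    public static void main(String[] args) {
        byte[] buffer = "xxkryoxx".getBytes(StandardCharsets.US_ASCII);
        String viaString = readAsciiString(buffer, 2, 4);
        StringBuilder viaBuilder = readAsciiBuilder(buffer, 2, 4);
        // Both paths should yield the same text for ASCII input.
        System.out.println(viaString.contentEquals(viaBuilder));
    }
}
```

The extra char[] step in the second method is the conversion cost the comment refers to; this sketch makes no claim about how large that cost is in practice.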

Original comment by nathan.s...@gmail.com on 18 Jun 2012 at 2:37

GoogleCodeExporter commented 8 years ago
Great!

Original comment by serverpe...@gmail.com on 18 Jun 2012 at 5:40