Support ByteBuffer in CodedInputStream and CodedOutputStream to avoid redundant copies for direct ByteBuffer.

zf8848 / protobuf

Automatically exported from code.google.com/p/protobuf

Other

0 stars 0 forks source link

Support ByteBuffer in CodedInputStream and CodedOutputStream to avoid redundant copies for direct ByteBuffer. #429

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago

The generated java message classes today contains overloaded static 
parseFrom(...) method thats a byte[],ByteString as parameter etc...
and the corresponding mergeFrom in the Builder class.

I think it would useful to support javas ByteBuffer or a byte[] and a 
startOffset and endOffset, the underlying implementation would just read the 
array of ByteBuffer to limit, in the case of ByteBuffer, and endOffset in the 
other case. The reason is the ByteBuffer could be reused over and over by 
resetting and setting the limit.
hopefully this change is not a big thing.

so the proposed change would be adding parseFrom(ByteBuffer data); and 
parseFrom(byte[] data,int startOffset, int endOffset), and the corresponding 
mergeFrom methods in the Builder class.

Original issue reported on code.google.com by yijinzho...@gmail.com on 31 Oct 2012 at 8:37

GoogleCodeExporter commented 9 years ago

Parsing methods like parseFrom(byte[] data, int off, int len) is already 
available. See:
http://code.google.com/p/protobuf/source/browse/trunk/java/src/main/java/com/goo
gle/protobuf/Parser.java#127

Original comment by xiaof...@google.com on 1 Feb 2013 at 5:55

Changed state: Invalid

GoogleCodeExporter commented 9 years ago

Reopened this issue and changed the title to be more specific. The interface of 
messages and builders is already fixed and shouldn't be changed easily. Adding 
ByteBuffer support in CodedInputStream and CodedOutputStream should be good 
enough to support direct ByteBuffers efficiently.

Original comment by xiaof...@google.com on 2 Feb 2013 at 9:42

Changed title: Support ByteBuffer in CodedInputStream and CodedOutputStream to avoid redundant copies for direct ByteBuffer.
Changed state: New
Added labels: Type-Enhancement, Priority-Low
Removed labels: Type-Defect, Priority-Medium

GoogleCodeExporter commented 9 years ago

ah my version of protobuf didnt contain the Parser.java (got mine just before 
that class was released externally)
About adding ByteBuffer support in CodedInputStream and CodedOutputStream,since 
ByteBuffers are often used in the context of NIO, both NIO and NIO2 uses 
ByteBuffer to do asynchronous read and write operations, shouldnt the support 
for ByteBuffer be somewhere else like in the new Parser.java class rather than 
the synchronous XXXStream Classes?

Original comment by yijinzho...@gmail.com on 4 Feb 2013 at 7:36

GoogleCodeExporter commented 9 years ago

Original comment by jie...@google.com on 17 Jul 2014 at 6:15

Changed state: Fixed
Added labels: FixedIn-2.6.0

GoogleCodeExporter commented 9 years ago

Original comment by jie...@google.com on 17 Jul 2014 at 7:16

GoogleCodeExporter commented 9 years ago

[deleted comment]

GoogleCodeExporter commented 9 years ago

This is not fixed. CodedInputStream makes a copy of the entire buffer if using 
native buffers, which could be expensive in cases. The title indicates an issue 
of "redundant copies" and creating an entire copy of a ByteBuffer seems like 
this.

I suppose it could be managed by wrapping the ByteBuffer as an InputStream like 
the CodedOutputStream does.

Original comment by harni...@gmail.com on 15 Aug 2014 at 9:37

GoogleCodeExporter commented 9 years ago

We tried to implement a CodedInputStream which reads directly from a direct 
ByteBuffer but it turns out to make the parsing performance worse (i.e., copy 
the whole content to byte[] then parsing from byte[] is better than parsing 
from ByteBuffer directly). The fact is, ByteBuffer.get() is several times 
slower then byte[i] and as in protobuf we do a lot of varint decoding (which 
reads byte by byte), making a copy to byte[] gives us the better performance in 
almost all cases.

As a result we are not pursing avoiding this seemingly "redundant copy" in 
CodedInputStream. If you have a particular proto which only contains several 
large bytes field, wrapping the ByteBuffer in an InputStream might work better 
for you. But for general use, the current implementation should be the right 
approach.

Original comment by xiaof...@google.com on 15 Aug 2014 at 10:56