ruiaylin / goprotobuf

Automatically exported from code.google.com/p/goprotobuf
Other
0 stars 0 forks source link

Incorrect serialization of fields with tags > (2^28 - 1) -- 32-bit signed integer overflow #19

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Define a Protocol Buffers message with a required field that has tag > (2^28 
- 1), e.g. 484589701
2. Initialize and serialize (marshall) the message in Go
3. Try de-serializing the message in Go/C++/Python/Java... and observe the error

What is the expected output? What do you see instead?

Instead of being successfully de-serialized, a "missing required field" error 
will be generated due to the bug in Go serializer that generated an incorrect 
tag for the field. The generated tag representation is 1 byte-long instead of 
being 5 bytes long. A field with incorrectly generated tag could be also 
confused with another valid field that happens to have the same 1-byte tag 
representation.

What version of the product are you using? On what operating system?

tip, 64-bit Linux

A patch that fixes the problem is in the attachment.

Original issue reported on code.google.com by alav...@piqi.org on 28 Sep 2011 at 6:08

Attachments:

GoogleCodeExporter commented 9 years ago
Quite the data structure you have there. I rarely see more than a few dozen 
fields in a struct and here you are pushing a billion. 32-bit integers will be 
a problem if you decide to go for a trillion. Where will it stop?

Thinking needed.

Original comment by r@golang.org on 28 Sep 2011 at 3:57

GoogleCodeExporter commented 9 years ago
It's definitely an extreme example. Tag numbers are permitted to be up to 2^29 
- 1 (c.f. http://code.google.com/apis/protocolbuffers/docs/proto.html#simple), 
so it seems like a simple bug on our part. I'll take a look.

Original comment by dsymo...@golang.org on 28 Sep 2011 at 4:07

GoogleCodeExporter commented 9 years ago
The patch should fix it (thanks for that).  I am curious why the field number 
is so crazy, though.

Original comment by robp...@gmail.com on 28 Sep 2011 at 4:12

GoogleCodeExporter commented 9 years ago
Rob, as dsymonds pointed out, it is a matter of the implementation following 
the spec, which is explicit about the valid integer range for the tags.

Speaking of practical matters, I agree that having such big tag numbers 
assigned by hand is not reasonable. 

However, is my case, field tags are sometimes generated automatically by 
hashing field names into 29-bit integers. I was playing with Go and Goprotobuf 
library and discovered that it fails to handle certain cases.

The reason for using hash-based tags is the ability to automatically assign 
them in anticipation of arbitrary future extensions to the original 
specification. The idea is that extensions will be made by various parties 
independently without the need to coordinate between each other and decide on 
the tag code ranges.

I use this method for Piqi self-specification in my research project 
(http://piqi.org).

Original comment by alav...@piqi.org on 28 Sep 2011 at 4:30

GoogleCodeExporter commented 9 years ago
Thanks for the info.  For your info, small tag numbers are much more efficient.

Original comment by r@golang.org on 28 Sep 2011 at 5:02

GoogleCodeExporter commented 9 years ago
This issue was closed by revision add41b3cc667.

Original comment by dsymo...@golang.org on 28 Sep 2011 at 5:56