jyt109 / protobuf-java-format

Automatically exported from code.google.com/p/protobuf-java-format
BSD 3-Clause "New" or "Revised" License

Unknown fields violate JSON syntax #47

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Create a proto message and add unknown fields with wire type fixed32 or 
fixed64 to it (using UnknownFieldSet.Builder)
2. Serialize the proto to JSON using JsonFormat.printToString()
3. Inspect the output string

What is the expected output? What do you see instead?

The output will contain:

"9000": [0x00002328], "9001": [0x0000000000002329]

In this case, there is an unknown fixed32 field with tag 9000 and value 9000 (= 
0x2328), and an unknown fixed64 field with tag 9001 and value 9001 (= 0x2329).
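The hex strings above can be reproduced with plain String.format. The sketch below is not taken from protobuf-java-format: the class and method names are hypothetical, and the "0x%08x" / "0x%016x" patterns are an assumption chosen only because they match the output shown.

```java
import java.util.Locale;

// Hypothetical helper, not part of protobuf-java-format.
final class HexFixedDemo {
    // Assumed pattern: "0x" followed by 8 zero-padded hex digits.
    static String fixed32ToHex(int value) {
        return String.format(Locale.ROOT, "0x%08x", value);
    }

    // Assumed pattern: "0x" followed by 16 zero-padded hex digits.
    static String fixed64ToHex(long value) {
        return String.format(Locale.ROOT, "0x%016x", value);
    }

    public static void main(String[] args) {
        System.out.println(fixed32ToHex(9000));  // 0x00002328
        System.out.println(fixed64ToHex(9001L)); // 0x0000000000002329
    }
}
```

Neither string is a legal JSON number, which is why a strict parser rejects the output.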

However, this output is not valid JSON. Numbers in JSON can only be represented 
in decimal form.

If we attempt to parse this output using another JSON library (such as GSON), 
it may fail outright due to the bad syntax, but even if it is lenient enough to 
perform the conversion, there may not be a way to determine whether the value 
in the input stream was "0x" + 8 digits or "0x" + 16 digits. In such a case, 
there would be no way to determine which wire type was originally used to store 
the value (fixed32 or fixed64), which means we can't reconstruct the proto in a 
way that would allow us to reserialize it in binary form without loss.
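The loss of the wire type can be demonstrated with the JDK alone. In this sketch, Long.decode stands in for a hypothetical lenient parser that accepts hexadecimal literals (it is not a claim about how GSON behaves internally), and the class name is made up for illustration.

```java
// Illustration only: Long.decode plays the role of a lenient number parser.
final class HexWidthLoss {
    static long lenientParse(String literal) {
        return Long.decode(literal); // accepts a "0x" prefix and leading zeros
    }

    public static void main(String[] args) {
        long from32 = lenientParse("0x00002328");         // written as fixed32
        long from64 = lenientParse("0x0000000000002328"); // written as fixed64
        System.out.println(from32 == from64); // true
    }
}
```

Once the text has been turned into a number, both inputs are simply 9000 and the digit count that distinguished fixed32 from fixed64 is gone.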

There are important use cases for representing unknown fields losslessly. For 
example, a server may hand an object to a client that only knows about an older 
version of the proto. Then the client may update some fields it knows about, 
and hand the object back to the server. Fields of the proto that are unknown to 
the client should not be disturbed. The ability to handle this kind of scenario 
seamlessly, without the need to synchronously update all consumers of a proto 
definition, is a key advantage of protocol buffers.

In effect, unknown fields survive serialization losslessly only when the output 
is read back by another instance of protobuf-java-format. This negates the value 
of using a standard format like JSON. Instead, the values should be encoded so 
that only standard JSON constructs are used and the wire type is preserved, for 
example:

"9000":[{"fixed32":9000},{"fixed64":9001}]
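A minimal sketch of how such an encoding could be produced follows. It uses only the JDK; the class, the method, and the choice to pass the values as two lists are hypothetical and not part of protobuf-java-format. Printing the values as unsigned decimals is also an assumption, since the proposal does not say how values with the high bit set should appear.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical encoder for the proposed format, not library code.
final class TaggedUnknownFieldJson {
    static String encode(int fieldNumber, List<Integer> fixed32Values, List<Long> fixed64Values) {
        List<String> entries = new ArrayList<>();
        for (int v : fixed32Values) {
            // Each value carries its wire type as the JSON object key.
            entries.add("{\"fixed32\":" + Integer.toUnsignedString(v) + "}");
        }
        for (long v : fixed64Values) {
            entries.add("{\"fixed64\":" + Long.toUnsignedString(v) + "}");
        }
        return "\"" + fieldNumber + "\":[" + String.join(",", entries) + "]";
    }

    public static void main(String[] args) {
        // Prints: "9000":[{"fixed32":9000},{"fixed64":9001}]
        System.out.println(encode(9000, Arrays.asList(9000), Arrays.asList(9001L)));
    }
}
```

Because the wire type is an ordinary JSON key, any standard parser can recover it without knowing about protobuf-java-format.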

What version of the product are you using? On what operating system?

1.2.

Original issue reported on code.google.com by r...@squareup.com on 1 Aug 2013 at 11:54

GoogleCodeExporter commented 9 years ago
Another option to represent the fields:

"9000":[[1,2,3,4],[1,2,3,4,5,6,7,8]]

A length-4 array of byte values would encode a fixed32 and a length-8 array 
would encode a fixed64.
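The byte-array idea could look roughly like this. The sketch assumes little-endian byte order (the order the protobuf wire format uses for fixed-width values) and writes each byte as an unsigned number from 0 to 255; the comment above specifies neither, and all names here are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical codec for the byte-array representation, not library code.
final class ByteArrayFixedCodec {
    static int[] encodeFixed32(int value) {
        return toUnsigned(ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(value).array());
    }

    static int[] encodeFixed64(long value) {
        return toUnsigned(ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putLong(value).array());
    }

    // The array length alone identifies the wire type.
    static String wireTypeOf(int[] bytes) {
        if (bytes.length == 4) return "fixed32";
        if (bytes.length == 8) return "fixed64";
        throw new IllegalArgumentException("expected 4 or 8 bytes, got " + bytes.length);
    }

    // Reassembles the little-endian bytes into an unsigned value held in a long.
    static long decode(int[] bytes) {
        wireTypeOf(bytes); // validates the length
        long result = 0;
        for (int i = 0; i < bytes.length; i++) {
            result |= ((long) (bytes[i] & 0xFF)) << (8 * i);
        }
        return result;
    }

    private static int[] toUnsigned(byte[] raw) {
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFF; // JSON has no byte type, so emit 0..255
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encodeFixed32(9000)));  // [40, 35, 0, 0]
        System.out.println(wireTypeOf(encodeFixed64(9001L)));      // fixed64
    }
}
```

Compared with the keyed-object form, this is more compact but relies on the reader knowing the length convention and the byte order.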

Original comment by r...@squareup.com on 2 Aug 2013 at 2:05