Open GoogleCodeExporter opened 9 years ago
For reference, please see this cld thread:
http://groups.google.com/group/closure-library-discuss/browse_thread/thread/3e9bb70bbd5889fd
Original comment by ahochh...@samegoal.com
on 15 Sep 2011 at 5:50
Original comment by pall...@google.com
on 13 Oct 2011 at 8:04
Peter, you mentioned you wanted to fix this in the thread. Are you still
interested in fixing it?
Original comment by chrishe...@google.com
on 9 May 2012 at 11:10
I'm a bit worried about fragmentation. Changing the JS implementation alone is
not enough: the Python, C++ and Java parsers also have to be updated to understand
0 and 1.
Saving 3 or 4 bytes on the wire doesn't compensate for the hassle.
If you need a more compact wire format, PbLiteSerializer is the best solution.
Original comment by pall...@google.com
on 10 May 2012 at 9:13
Hi Peter, Thanks for your comments.
I agree that changing the JS alone is not enough (for most use cases). If
Google were to open source their plugins/customized protoc I would happily
provide patches for Py/C++/Java. Without access to that code, I would add
support to my open source C++ (de)serialization plugins
(http://code.google.com/p/protobuf-plugin-closure/).
One way to not fragment the formats in the long term:
1) Update all of the deserializers (JS/Py/C++/Java) to support both boolean encoding formats (on the fly -- without application level config necessary)
2) Wait N months until all code using ObjectSerializer to deserialize in all languages has been re-compiled and deployed
3) Update all of the serializers (JS/Py/C++/Java) to use the numeric boolean encoding
4) Wait N months until all code using ObjectSerializer to serialize in all languages has been re-compiled and deployed
5) Update all of the deserializers to only understand the numeric boolean encoding
Steps 4 and 5 are optional depending on how clean a deployment is desired.
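As a rough illustration of step 1, a deserializer could accept both encodings for a boolean field without any application-level configuration. This is only a sketch: decodeBooleanField is a hypothetical helper, not an existing Closure Library function.

```javascript
// Hypothetical helper (not part of Closure Library): decode a boolean field
// value that may arrive in either the legacy encoding (true/false) or the
// proposed numeric encoding (1/0).
function decodeBooleanField(value) {
  if (value === true || value === 1) {
    return true;
  }
  if (value === false || value === 0) {
    return false;
  }
  // Anything else is rejected rather than silently coerced.
  throw new TypeError('Unexpected boolean encoding: ' + JSON.stringify(value));
}

// Both wire formats decode to the same in-memory value.
console.log(decodeBooleanField(true), decodeBooleanField(1));
console.log(decodeBooleanField(false), decodeBooleanField(0));
```

Because old and new payloads decode identically, serializers can be switched over later (step 3) without coordinating each deployment.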
Alternatively, the change could be done in a single step by updating any
serializers to add an off-by-default option to use the numeric encoding and
updating any deserializers to understand both formats. Then individual projects
could opt in to the numeric boolean encoding as they saw fit (once they knew
that all components of their system supported the numeric format). If you are
willing to consider any of these deployment plans, I will update this change
per your direction.
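The single-step plan might look roughly like the following. The class name SketchObjectSerializer and the option name numericBooleans are assumptions made for illustration; the real ObjectSerializer works on proto2 messages and field descriptors, whereas this sketch uses plain objects keyed by tag number.

```javascript
// Illustrative only: a toy serializer over plain {tag: value} objects.
class SketchObjectSerializer {
  constructor(options = {}) {
    // Off by default, so existing deployments keep emitting true/false.
    this.numericBooleans = options.numericBooleans === true;
  }

  // Encode booleans as 1/0 only when the project has opted in.
  serialize(fields) {
    const out = {};
    for (const [tag, value] of Object.entries(fields)) {
      const isBool = typeof value === 'boolean';
      out[tag] = isBool && this.numericBooleans ? (value ? 1 : 0) : value;
    }
    return out;
  }

  // Always understand both formats. `booleanTags` is a Set of tag strings
  // declared as bool; in this sketch any other value decodes to false.
  deserialize(data, booleanTags) {
    const out = {};
    for (const [tag, value] of Object.entries(data)) {
      out[tag] = booleanTags.has(tag) ? value === true || value === 1 : value;
    }
    return out;
  }
}

const legacy = new SketchObjectSerializer();
const compact = new SketchObjectSerializer({numericBooleans: true});
console.log(JSON.stringify(legacy.serialize({1: true, 2: 'x'})));
console.log(JSON.stringify(compact.serialize({1: true, 2: 'x'})));
```

A project would flip the option only once every reader in its system accepts both formats.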
In terms of using ObjectSerializer vs. PbLiteSerializer, I think the most
important factor to consider is whether your messages contain a large number of
sparsely populated fields. For example, {100:true} is more efficient than [,,,,
... ,,,,1]. However, the cost savings of numeric boolean encoding could apply
to both formats and depending on your use case could be a material savings.
Granted, this is a contrived case, but I think the core of the argument still
holds. Additionally, applications can always work around this limitation by
using an int encoding where they really want booleans, but that really isn't
ideal either.
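To put rough numbers on this, the snippet below compares string lengths for the contrived tag-100 case. It assumes JSON-style text on the wire and writes the positional form as the elided array literal shown above; actual PbLiteSerializer output may differ in detail.

```javascript
// Object-style encoding: only populated tags appear, keyed by tag number.
const objectTrue = JSON.stringify({100: true});
const objectOne = JSON.stringify({100: 1});
const objectFalse = JSON.stringify({100: false});
const objectZero = JSON.stringify({100: 0});

// Positional encoding written as an elided array literal: 100 commas put
// the value at index 100, so every unset slot still costs one byte.
const positional = '[' + ','.repeat(100) + '1]';

console.log('object, true :', objectTrue.length);
console.log('object, 1    :', objectOne.length);
console.log('positional   :', positional.length);

// Per-field savings from the numeric boolean encoding.
console.log('saved for true :', objectTrue.length - objectOne.length);
console.log('saved for false:', objectFalse.length - objectZero.length);
```

The per-field savings match the "3 or 4 bytes" mentioned earlier in the thread: three for true, four for false.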
From my perspective, it seems like a series of decisions that add 3-4 bytes per
field could stack up to have a material impact (at least for some use cases).
(Given how PbLiteSerializer is written I think someone else at Google might
agree with this viewpoint.) Since this is the library level, it is a chance to
get that savings for all applications without them needing to worry about the
wire encoding.
At the end of the day, I can always just subclass ObjectSerializer and add
support to my (de)serialization plugin so this isn't a show stopper for me. Feel
free to mark as "Won't Fix" if the cost does not justify the deployment expense
in your opinion.
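The subclassing workaround could be sketched along these lines. BaseSerializerStandIn is a stand-in written for this example, not goog.proto2.ObjectSerializer itself, and the per-value hook methods and the string field types are assumptions about how such an override would be shaped.

```javascript
// Stand-in base class (hypothetical): passes every value through unchanged.
class BaseSerializerStandIn {
  getSerializedValue(fieldType, value) {
    return value;
  }
  getDeserializedValue(fieldType, value) {
    return value;
  }
}

// Subclass that overrides only the boolean handling.
class NumericBoolSerializer extends BaseSerializerStandIn {
  getSerializedValue(fieldType, value) {
    if (fieldType === 'bool') {
      return value ? 1 : 0;
    }
    return super.getSerializedValue(fieldType, value);
  }

  getDeserializedValue(fieldType, value) {
    if (fieldType === 'bool') {
      // Accept the legacy encoding as well as the numeric one.
      return value === true || value === 1;
    }
    return super.getDeserializedValue(fieldType, value);
  }
}

const serializer = new NumericBoolSerializer();
console.log(serializer.getSerializedValue('bool', true));
console.log(serializer.getDeserializedValue('bool', 0));
```

This keeps the change local to one application, at the cost of every peer needing a matching (de)serializer.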
Thanks!
-Andy
Original comment by ahochh...@samegoal.com
on 10 May 2012 at 4:46
Original issue reported on code.google.com by
ahochh...@samegoal.com
on 15 Sep 2011 at 2:05