Closed antongrbin closed 1 year ago
The proto2 behavior definitely seems like a bug. As you mentioned, proto2 doesn't officially have a JSON spec defined and so pbandk doesn't officially support JSON with proto2. But we do try to match the behavior of the official protobuf libraries where possible. We would gladly accept a PR with a fix.
The proto3 behavior I would argue is compliant with the spec. According to https://developers.google.com/protocol-buffers/docs/proto3#json:
If a field has the default value in the protocol buffer, it will be omitted in the JSON-encoded data by default to save space. An implementation may provide options to emit fields with default values in the JSON-encoded output.
pbandk does have the JsonConfig.outputDefaultValues
option available if you want the output to be {"value":"FOO"}
.
Thank you for the quick turnaround time :)
Happy to hear that we can move forward with the fix for proto2
.
If you agree, please review https://github.com/streem/pbandk/pull/234 which only adds unit tests that show the current behavior. This will make our follow-up fix PR clearer.
--
I looked into implementing the fix for proto2
and it looks like most effort will go into determining whether the message is proto2
or proto3
with the information that JsonMessageEncoder
has in runtime. Did I miss anything here?
A simpler approach is to say that field presence
is what's important, not proto2
vs proto3
. In that approach we would change the behavior of all optional enum
fields with default values, regardless of whether they are in proto2
or proto3
.
Digging deeper, I came across Application note: Field presence which suggests that our fix here might not be enum-specific. From that doc:
* No presence discipline:
* Default values are not serialized.
* Explicit presence discipline:
* Explicitly set values are always serialized, including default values.
This suggests that we should serialize default values of all optional fields (not only enums).
--
Having all of the above, we can apply the fix with 3 scopes:
1) fix serialization of default values for proto2
enums
2) fix serialization of default values for all optional enums
3) fix serialization of default values for all optional fields
My hunch would be to move forward with 3) since it makes the code cleaner and is more aligned with other implementations. I can provide more details here if needed :)
Looking forward to your input!
fix serialization of default values for all optional fields
I agree that this would be the preferred route.
In general, it's better to avoid writing code that makes decisions based on proto2 vs proto3 (when possible) and instead write code that makes decisions based on the field's presence discipline (as described in the doc you linked to).
This suggests that we should serialize default values of all optional fields (not only enums).
Yes agreed. Just keep in mind the distinction between "no presence" (the default behavior for non-repeated proto3 fields) and "explicit presence" (used in proto2 and proto3 for fields that include the optional
keyword).
Excellent. Thank you for the guidance!
@garyp will you be the reviewer for the changes? This will be 2 small PRs, first one with a unit test and second one with a fix.
The first one is here: https://github.com/streem/pbandk/pull/234
Yep, I'll review them. I should have time later today to take a look at the PRs.
Thank you for the review on the unit test PR.
The fix PR is here: https://github.com/streem/pbandk/pull/238
@antongrbin Thank you for the thorough fix and tests!
@garyp is there anything blocking the 0.14.2 release? Milestone looks all closed: https://github.com/streem/pbandk/milestone/12
Thank you for your time :)
@antongrbin It's released now!
The testing message is defined as follows:
We observe the following serialization with pbandk:
MessageWithEnum()
{"value":null}
{}
MessageWithEnum(value=FOO)
{}
{"value":"FOO"}
MessageWithEnum(value=BAR)
{"value":"BAR"}
The bug is reproduced in a unit test here: https://github.com/streem/pbandk/pull/234
I know that proto2 doesn't really have a JSON spec defined, but other protobuf implementations (java, python, swift) would all behave as expected in this example.
In proto3, we observed the following serialization behavior:
TestAllTypesProto3()
{}
TestAllTypesProto3(optionalNestedEnum = FOO)
{}
{"value":"FOO"}
TestAllTypesProto3(optionalNestedEnum = BAR)
{"value":"BAR"}
1) Do you agree that the current behavior is unexpected? 2) Would you be open for us contributing a fix for it?
Thank you for your time :)