google / gson

A Java serialization/deserialization library to convert Java Objects into JSON and back
Apache License 2.0
23.13k stars 4.27k forks source link

Read `json_name` from protobuf field annotations for serialization and deserialization #2700

Closed jabagawee closed 2 weeks ago

jabagawee commented 3 weeks ago

Background

We would like to have custom names for keys in JSON strings that do not match the names of the proto fields. Such a feature was added to the GSON library in 2015 via #710, which allowed developers to register custom field options in the ProtoTypeAdapter to read annotations for custom field names from the proto definition. Unfortunately some users use the json_name annotation described at https://protobuf.dev/programming-guides/proto3/#json, and it does not appear in the field options list in the proto field descriptor. Instead, it is set as its own first-class field in the field descriptor.

Design decisions

Backwards compatibility

One important criteria in making changes to GSON is backwards compatibility, because any changes we make must not break existing users. The first thing I want to establish is that the functionality of reading the json_name field should absolutely not be enabled by default. Doing so has the potential to break many users, some of whom would not even realize that such a thing was happening. As such, I believe that we should add a new boolean flag named something like shouldUseJsonNameFieldOption to the ProtoTypeAdapter and its Builder in order to configure whether or not the json_name field option should be read or not.

Interaction with other existing features

Assuming that we add such a shouldUseJsonNameFieldOption boolean, the next question is what should happen if it is turned on (in other words, set to true) in the presence of other features like field name serialization/deserialization formats and custom serialized name extensions.

Field name serialization and deserialization formats

ProtoTypeAdapter allows the user to set the expected "case format" of the names of both the proto and the JSON fields. By default, it expects proto field names to be written in lower_underscore and JSON field names to be written in lowerCamelCase. During serialization and deserialization, it converts between the "from" and "to" case formats. I propose that we should ignore the field name serialization format configuration when reading the json_name field option for three reasons:

  1. The json_name annotation is an explicit decision written by the user.
  2. This matches existing behavior from the Protobuf library's JsonFormat class in Java and implementations in other languages.
  3. This matches the behavior displayed when ProtoTypeAdapter is configured to read other custom serialized field names via its Builder's addSerializedNameExtension method.

Custom serialized name extensions

Due to the aforementioned PR #710, the ProtoTypeAdapter class already features a way to register a custom field option such that annotations are used to determine JSON field names during serialization and deserialization. If the user registers such a custom field option and also enables reading the json_name field, we need to determine what the ProtoTypeAdapter will do when it serializes and deserializes a field that features both annotations. I believe that we have several options here, though I personally advocate for one particular approach noted below.

Don't allow them both to be set at the ProtoTypeAdapter level

One option would be to throw an error when a ProtoTypeAdapter is built with both options configured. This may solve our issue by disallowing invalid states, but it may be excessively restrictive for users. A protobuf message definition can be effectively arbitrarily complex, and it's even possible for no single team to have ownership of every field all the way down the message definition hierarchy, so it's very possible that a protobuf may need both options to be set.

One counter-argument to this example would be that users should register a separate ProtoTypeAdapter for different subclasses of Message.class with different needs.

Don't allow them both to be set on a particular field

Instead of throwing an error at the point where a ProtoTypeAdapter is built, we could simply allow it and then throw an error when encountering a field that has both the specially registered field option and a json_name option as annotations on itself.

Allow both to be set but prefer the one that was set earlier in the ProtoTypeAdapter.Builder

This option leaves control to the user about what to do when both options are enabled in the ProtoTypeAdapter and both annotations are present on a particular proto field. Similar to how custom field options are chosen in the order that they are added in the Builder, we could track the order in which a call to setShouldUseJsonNameFieldOption was made relative to calls to addSerializedNameExtension and then read the annotation that corresponds to the earliest call for a given field. I am not a huge fan of this option because:

Allow both to be set but prefer one or the other

The last reasonable option would be to allow both options to be enabled in the ProtoTypeAdapter. When encountering a field for serialization or deserialization that features both annotations, we should prefer either:

Getbook1 commented 2 weeks ago

No no no json is my personal its not for public use

On Sat, Jun 22, 2024, 1:33 PM Éamonn McManus @.***> wrote:

Closed #2700 https://github.com/google/gson/issues/2700 as completed via #2701 https://github.com/google/gson/pull/2701.

— Reply to this email directly, view it on GitHub https://github.com/google/gson/issues/2700#event-13251679242, or unsubscribe https://github.com/notifications/unsubscribe-auth/BGSVGMPXGVLW7O4XI3UCVXTZIXNR7AVCNFSM6AAAAABJJPOGWOVHI2DSMVQWIX3LMV45UABCJFZXG5LFIV3GK3TUJZXXI2LGNFRWC5DJN5XDWMJTGI2TCNRXHEZDIMQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>