golang / protobuf

Go support for Google's protocol buffers
BSD 3-Clause "New" or "Revised" License
9.64k stars 1.58k forks source link

encoding/protojson: EmitUnpopulated doesn't emits unpopulated proto3 optional scalar fields #1525

Closed toga4 closed 1 year ago

toga4 commented 1 year ago

What version of protobuf and what language are you using?

What did you do?

marshal proto to json

protobuf:

syntax = "proto3";

package com.example.foo;

option go_package = "example.com/main";

message Foo {
  optional string foo = 1;
  string bar = 2;
}

generate go file:

protoc foo.proto --go_out . --go_opt paths=source_relative

marshal:

package main

import (
    "fmt"

    "google.golang.org/protobuf/encoding/protojson"
)

func main() {
    obj := &Foo{Foo: nil}
    marshaller := &protojson.MarshalOptions{EmitUnpopulated: true}
    b, _ := marshaller.Marshal(obj)
    fmt.Printf("%s\n", b)
}

What did you expect to see?

The type of optional field foo is *string and it's zero value is nil. If EmitUnpopulated is true then we expect it emits null value.

{"foo":null,"bar":""}

What did you see instead?

{"bar":""}
puellanivis commented 1 year ago

Hm, it does seem like there is a contradiction here, where proto2 scalar fields are defined to emit null, while ̀proto3 optional scalar fields are not. Either way could make sense, it could be copying the behavior of proto2, but since not appearing is defined as equivalent to null, in proto3, there isn’t really a reason why it should have to be emitted. But then, message fields are emitted with null as well, so…

I cannot find any definitive answer as to what the correct behavior is in the standard, and there does not appear to be any open or closed issues pertaining to this issue in the main protobuf issues board. (The only one appearing from this search is a desire to add the ability to emit unpopulated fields into PHP.)

This seems like it is at the very least a bug in documentation. We should spell out explicitly the behavior of proto3 optional scalar fields in the documentation. However, whether this is a bug in behavior, I don’t know that this project could unilaterally declare, but I think we might want to consider it to be.

The ideal path forward is that you could open an issue on the main protobuf board (a link given above), and ask for clarification of what the behavior here should be. After all, if it is unequivocally said that we should be emitting them as null, then that’s a firm definitive answer from the people managing the standard. Meanwhile, if someone wants to pre-emptively start a CR for the new API, I don’t think it it would be immediately shot down.

dsnet commented 1 year ago

I believe the problem under the hood is that proto3 optional is implemented under the hood as a oneof with a single type. IIUC, from the perspective of protobuf reflection, this is still a oneof field, not an optional field.

The EmitUnpopulated option is a carry over from v1 and doesn't have a direct correlation with the other language implementations. I believe the closest equivalent is the always_print_primitive_fields in C++ and the includingDefaultValueFields option in Java. We should verify what those two options do with regard to proto3 optional fields.

puellanivis commented 1 year ago

¿I’m pretty sure, as mentioned, the optional fields are implemented as pointer to type?

The type of optional field foo is *string and it's zero value is nil.

Pasting in the relevant lines of the .pb.go file should be definitive?

toga4 commented 1 year ago

Yes, the optional fields are implemented as pointers to types. And yes, this is the point I missed, it is annotated as a oneof field as mentioned.

type Foo struct {
    state         protoimpl.MessageState
    sizeCache     protoimpl.SizeCache
    unknownFields protoimpl.UnknownFields

    Foo *string `protobuf:"bytes,1,opt,name=foo,proto3,oneof" json:"foo,omitempty"`
    Bar string  `protobuf:"bytes,2,opt,name=bar,proto3" json:"bar,omitempty"`
}

The EmitUnpopulated is clearly mentioned it doesn't emit unpopulated oneof fields, so it seems expected behavior, but...um... feels a little odd to me. https://github.com/protocolbuffers/protobuf-go/blob/bc1253ad37431ee26876db47cd8207cdec81993c/encoding/protojson/encode.go#L68-L82

dsnet commented 1 year ago

In terms of the generated API, they are indeed *T, but my point is that from the perspective of protobuf reflection, they are a oneof. The protojson package is implemented in terms of protobuf reflection, not what the generated API looks like. Granted, the implementation of protojson can explicitly check the protoreflect.Oneof.IsSynthetic flag to see if this is a proto3 optional field. Whether it should do this or not is IMO dependent on whether the C++ and Java implementations do the same thing.

toga4 commented 1 year ago

I confirmed C++ and Java implementations. As it turns out, C++ and Java have the same behavior as current protojson. So protojson's behaviour is it should be. Thanks for discussing this, now have a better understanding of protobuf ;) I'll close this.


C++

#include <google/protobuf/util/json_util.h>
#include "foo.pb.h"

int main()
{
  std::string json_string;
  com::example::foo::Foo foo;

  google::protobuf::util::JsonPrintOptions options;
  options.always_print_primitive_fields = true;
  MessageToJsonString(foo, &json_string, options);

  std::cout << json_string << std::endl;
}

Executed this and got the following results.

% c++ -std=c++14 json.cc foo.pb.cc `pkg-config --cflags --libs protobuf` && ./a.out
{"bar":""}

Java

package com.example.foo;

import com.example.foo.FooProtos.Foo;
import com.google.protobuf.util.JsonFormat;

public class App {
    public static void main(String[] args) throws Exception {
        Foo foo = Foo.newBuilder().build();
        String out = JsonFormat.printer().includingDefaultValueFields().print(foo);
        System.out.println(out);
    }
}

Executed this and got the following results.

{
  "bar": ""
}
dsnet commented 1 year ago

@toga4, I recommend raising this discrepancy between proto2 and proto3 optional up at https://github.com/protocolbuffers/protobuf.

Just because C++ and Java act this way doesn't necessarily mean it's intended behavior. If the protobuf team decides that the behavior should be altered, then this module will follow suit.

toga4 commented 1 year ago

Okay, I'll try it.