jhump / protoreflect

Reflection (Rich Descriptors) for Go Protocol Buffers
Apache License 2.0
1.35k stars 172 forks source link

error handling protobuf option (grpc.gateway.protoc_gen_swagger.options.openapiv2_schema) #549

Closed baganokodo2022 closed 1 year ago

baganokodo2022 commented 1 year ago

What version of protobuf and what language are you using? Version: v1.5.2

What did you do? In one project, a simple protobuf message was created with a message-level option like this,

syntax = "proto3";

package org.order.common;

import "google/protobuf/descriptor.proto";
import "protoc-gen-swagger/options/openapiv2.proto";

message Order {
  option (grpc.gateway.protoc_gen_swagger.options.openapiv2_schema) = {
    json_schema: {
      title: "SimpleMessage"
      description: "A simple message."
      required: ["id"]
    }
  };

  string id = 1; 
}

The protoc-generated Go code was imported into a different project, which invokes a protobuf serializer in confluent-kafka-go. The serializer internally restores the protobuf text definition through reflection, then registers the canonicalized proto definition onto the Confluent Schema Registry. This real-time, dynamic protobuf schema registration is very desired to ensure schema-evolution backward compatibility in data pipelines.

In confluent-kafka-go, it makes use of jhump/protoreflect to restore the protobuf text schema through FileDescriptor. During this process, I found this line in golang/protobuf was invoked when the mentioned option in handled. As a result, the generated protobuf look like this,

syntax = "proto3";

package org.order.common;

import "google/protobuf/descriptor.proto";
import "protoc-gen-swagger/options/openapiv2.proto";

message Order {
  option (grpc.gateway.protoc_gen_swagger.options.openapiv2_schema) = {
    json_schema: <
      title: "SimpleMessage"
      description: "A simple message."
      required: ["id"]
    >
  };
  string id = 1; 
}

Clearly, the angle brackets <> break the syntax of protobuf 3.0. Screenshot 2023-02-16 at 2 43 35 PM

What did you expect to see? The text_encode.go should correctly restore the protobuf schema as,

    json_schema: {
      title: "SimpleMessage"
      description: "A simple message."
      required: ["id"]
    }

What did you see instead? The text_encode.go generated the protobuf option with a broken syntax,

    json_schema: <
      title: "SimpleMessage"
      description: "A simple message."
      required: ["id"]
    >

The same issue was reported to confluent-kafka-go and golang/protobuf, https://github.com/confluentinc/confluent-kafka-go/issues/946 https://github.com/golang/protobuf/issues/1523

Responses from golang/protobuf articulated that,

Note that github.com/golang/protobuf/proto is deprecated, and the implementation of text serialization in this package is frozen. The google.golang.org/protobuf/encoding/prototext package is the recommended implementation of the text (de)serialization.

The prototext package does not use </> brackets at this time.

Can some one take a look at this blocking issue for us?

Many thanks

Xinyu Liu

jhump commented 1 year ago

@baganokodo2022, the angle brackets are allowed in the Protobuf language. If you try compiling the source with protoc or buf, you'll find that they both accept it. This issue is more likely to be in whatever is processing those sources.

FWIW, there are changes to the protoprint package in the latest pre-release: v1.15.0-rc1. The latest version uses curly braces instead of angle brackets (not because angle brackets are invalid but because curly braces are more common and more consistent with the rest of the Protobuf language).

baganokodo2022 commented 1 year ago

I ran a quick test and verified that the release of v1.15.0-rc1 has fixed the aforementioned issue, in the meantime, it seems introduced a different issue. The same proto schema was repeatedly registered in Confluent Schema Registry, and generated 24 versions with the exact same content. Digging deeper, I found the cause of repeated registrations is because a list of options are shuffled randomly in each schema version registered. For other proto schemas referencing the schema with the option list, they had the version bumped each time as well.

jhump commented 1 year ago

@baganokodo2022, it looks like v1.10.0 introduced a bug that would cause elements with multiple options to print options in random order, unless other sort configuration was defined on the printer struct (e.g. SortElements: true or CustomSortFunc was defined) or the descriptor had source code info for the options (in which case, they would be ordered in the way they appear in the original source, according to the source code info).

That has been fixed as of #550 and is available in v1.15.0-rc2.

baganokodo2022 commented 1 year ago

thank you so much for the quick response @jhump