Closed vbarua closed 4 months ago
buf is complaining about:
proto/substrait/algebra.proto:895:9:Field "2" on message "UserDefined" moved from outside to inside a oneof.
I ran a couple of quick checks on this.
First, decoding a binary encoded version of the message against the old and new protobufs:
substrait git:(main) echo 8a023612340a2e747970652e676f6f676c65617069732e636f6d2f676f6f676c652e70726f746f6275662e496e74363456616c75651202082a | xxd -r -p | protoc --proto_path=proto --decode substrait.Expression.Literal proto/substrait/algebra.proto
user_defined {
value {
type_url: "type.googleapis.com/google.protobuf.Int64Value"
value: "\010*"
}
}
substrait git:(vbarua/user-defined-type-thonks) echo 8a023612340a2e747970652e676f6f676c65617069732e636f6d2f676f6f676c652e70726f746f6275662e496e74363456616c75651202082a | xxd -r -p | protoc --proto_path=proto --decode substrait.Expression.Literal proto/substrait/algebra.proto
user_defined {
value {
type_url: "type.googleapis.com/google.protobuf.Int64Value"
value: "\010*"
}
}
The decoded message is the same across both versions. Checking the JSON, both the old and new protobufs generate the same output:
{
"userDefined": {
"typeReference": 0,
"value": {
"@type": "type.googleapis.com/google.protobuf.Int64Value",
"value": "42"
},
"typeParameters": []
},
"nullable": false,
"typeVariationReference": 0
}
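The identical decode is what I'd expect, since the wire format only carries field numbers and wire types and has no notion of oneof membership. A minimal sketch (plain Python, no protobuf library; `read_varint` and `read_fields` are helper names I made up for illustration) that walks the same bytes by hand:

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_fields(buf):
    """Split a message into (field_number, wire_type, payload) triples.

    Only varint (0) and length-delimited (2) wire types are handled,
    which is all the sample message uses.
    """
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((number, wire_type, value))
    return fields


# Same hex payload as in the protoc commands above.
HEX = (
    "8a023612340a2e747970652e676f6f676c65617069732e636f6d2f"
    "676f6f676c652e70726f746f6275662e496e74363456616c7565"
    "1202082a"
)

literal = read_fields(bytes.fromhex(HEX))
user_defined = read_fields(literal[0][2])   # payload of user_defined
any_msg = read_fields(user_defined[0][2])   # payload of UserDefined field 2
int64 = read_fields(any_msg[1][2])          # payload of Any.value

print(literal[0][:2])        # (33, 2)
print(user_defined[0][:2])   # (2, 2)
print(any_msg[0][2].decode())  # type.googleapis.com/google.protobuf.Int64Value
print(int64)                 # [(1, 0, 42)]
```

Field 2 of `UserDefined` shows up as a plain length-delimited field either way; nothing in the bytes says whether it lives inside a oneof.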
The rule being triggered is FIELD_SAME_ONEOF, which the buf docs describe as follows:
This checks that no field moves into or out of a oneof or changes the oneof it's a part of. Doing so is almost always a generated source code breaking change. Technically there are exceptions with regard to wire compatibility, but the rules are complex enough that it's safer to never change a field's presence inside or outside a given oneof.
In this case, following the link shows that we have something similar to one of the exception cases:
Move fields into or out of a oneof: You may lose some of your information (some fields will be cleared) after the message is serialized and parsed. However, you can safely move a single field into a new oneof and may be able to move multiple fields if it is known that only one is ever set.
Effectively, the issue they are guarding against is that once a field is moved into a oneof, setting it clears the other fields in that oneof. In this case, as there is only one existing field (value) and we are adding a new field (struct), there should be no existing code that tries to set both.
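To make the hazard concrete, here is a toy model of oneof semantics in plain Python. It is not the protobuf library; `OneofModel` and `parse_into_oneof` are hypothetical names, and the `a`/`b` fields are invented purely to show the failure case. It assumes the documented behavior that when several members of a oneof appear on the wire, the last one seen wins.

```python
class OneofModel:
    """Toy model of a protobuf oneof: setting one member clears the rest."""

    def __init__(self, *members):
        self._members = set(members)
        self._which = None
        self._value = None

    def set(self, name, value):
        if name not in self._members:
            raise KeyError(name)
        # Overwrites whichever member was set before.
        self._which, self._value = name, value

    def get(self, name):
        if name not in self._members:
            raise KeyError(name)
        return self._value if self._which == name else None

    def which(self):
        return self._which


def parse_into_oneof(oneof, wire_fields):
    """Apply fields in wire order; the last oneof member seen wins."""
    for name, value in wire_fields:
        oneof.set(name, value)
    return oneof


# Hazard case: two pre-existing fields that an old writer set together
# are later moved into one oneof. The first one is silently dropped.
hazard = parse_into_oneof(OneofModel("a", "b"), [("a", 1), ("b", 2)])
print(hazard.get("a"), hazard.get("b"))  # None 2

# This change: old writers could only ever emit `value`, because
# `struct` did not exist yet, so nothing can be cleared.
safe = parse_into_oneof(OneofModel("value", "struct"), [("value", b"\x08*")])
print(safe.which(), safe.get("value"))  # value b'\x08*'
```

The second case is the one that applies here: data written against the old schema can never contain both members.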
IMO we can ignore this for this PR.
Context: https://substrait.slack.com/archives/C02D7CTQXHD/p1707328717290129
Inlining for posterity