Closed gudvinr closed 3 years ago
Note that for scalar message fields, once a message is parsed there's no way of telling whether a field was explicitly set to the default value (for example whether a boolean was set to false) or just not set at all: you should bear this in mind when defining your message types. For example, don't have a boolean that switches on some behaviour when set to false if you don't want that behaviour to also happen by default. Also note that if a scalar message field is set to its default, the value will not be serialized on the wire. [emphasis added] https://developers.google.com/protocol-buffers/docs/proto3#default
Since this is expected behavior and required by the standard, there is not really any chance that we would consider changing this behavior. It is always important to remember that a scalar message field does not have presence detection any more than any other scalar does. You cannot rely upon any semantic difference between nil
and &pb.Point{}
, even though in Go they are two separate things. In protobuf, the two are semantically identical.
@puellanivis message is not a scalar type. Wire type of emdedded message is essentially same as string, byte array or any other repeated value and encoding reference confirms that:
embedded messages are treated in exactly the same way as strings (wire type = 2)
And page that you linked says:
For message fields, the field is not set. Its exact value is language-dependent. The default value for repeated fields is empty (generally an empty list in the appropriate language).
Also keep in mind, that page you linked says about parsing messages when field not found on the wire at all and not about encoding:
When a message is parsed, if the encoded message does not contain a particular singular element
While messages are not a “scalar value type”, any field that is not a repeated
field is a “scalar message field”. So, the clause provided concerning “The default value for repeated fields…” does not apply in this case, as the message field is not repeated
.
I do not know what to do but explain it again: the proto3 standard specifically defines that “not-present” and “default/zero value” are semantically identical. This applies for message
s as much as it does for any other non-repeated field.
I understand that in Golang nil
and &pb.Point{}
are semantically different, and distinguishable. But from the perspective of protobuf, they are still identical. The x
field of a (*pb.Point)(nil)
message is still 0
and why there is the GetX()
receiver method.
Since “a message does not appear” and “a message is empty” are semantically identical, and the former is a shorter encoding, the shorter encoding is done in the wire format.
the proto3 standard specifically defines that “not-present” and “default/zero value” are semantically identical
They are semantically identical for scalar types as per documentation.
Scalar types, as per documentation are "one of the following types". And there's no "message" type in this table.
This particular page says:
For message fields, the field is not set. Its exact value is language-dependent.
And this section of documentation is about decoding values from wire when field in question is omitted. This is not applicable to the issue.
But from the perspective of protobuf, they are still identical.
No, you can distinguish between 1A 02 0A 00
which is PointEvent{Point: {}}
and 1A 00
which is PointEvent{Point: nil}
. Which also means that "empty message", in fact, takes event more bytes on the wire.
Also, from documentation on Go generated code:
Message fields can be set to nil, which means that the field is unset, effectively clearing the field. This is not equivalent to setting the value to an "empty" instance of the message struct.
There's no difference here between proto2
and proto3
.
in Golang nil and &pb.Point{} are semantically different
It is the same in generated code for different languages:
has_field
methodhasField
methodI'm unable to follow the example in the initial post. If the problem is with proto.Marshal
then you should be able to reduce the reproduction to a simple pair of proto.Marshal
and proto.Unmarshal
calls without channels are other other unrelated logic distracting from and obscuring what is expected.
While messages are not a “scalar value type”, any field that is not a repeated field is a “scalar message field”.
FYI, repeated fields, map fields, and messages are not scalar. All other field types are scalar.
Both proto2
and proto3
implement presence for message-valued fields, preserving a distinction between an empty message (&pb.M{}
) and an absent one ((*pb.M)(nil)
) across marshal/unmarshal. (Neither implements presence for repeated or map fields; there is no distinction between an empty list/map and one which is not present.)
I'm unable to follow the example in the initial post. If the problem is with
proto.Marshal
then you should be able to reduce the reproduction to a simple pair ofproto.Marshal
andproto.Unmarshal
calls without channels are other other unrelated logic distracting from and obscuring what is expected.
Honestly, I thought it would be better than answering obvious questions like "are you sure there's no r/w operations without mutexes", "are you sure client packets indeed contain empty message and not missing field" and so on but ok.
Basically, it reads protobuf package from tcp socket and calls proto.Unmarshal
var cmd pd.PointCommand
proto.Unmarshal(data, &cmd)
Then nested field checked for nilness and assigned to new protobuf-generated struct:
evt := pd.PointEvent{
Src: cmd.Src,
}
Reference to this object wrapped in proto.Message
then distributed through channels to receiving end and marshaled using proto.Marshal
from multiple goroutines like this:
// var ch <-chan proto.Message
msg := <- ch
proto.Marshal(msg)
In the end, sometimes client gets 1A 00
instead of 1A 02 0A 00
under unknown circumstances.
This still doesn't help us. What is the value of cmd.Src
? You issue seems to be about the PointEvent.Src
field being omitted sometimes, but that's entirely dependent on what's being stored into that field. If it's nil
, then the fact that the field is omitted is entirely expected.
What is the value of cmd.Src?
cmd.Src
has type *Point
. Point
is a struct generated from protobuf schema. It may not be nil
since, as I pointed out, nested field checked for nilness before sending to receivers. So, it may contain any combination of two int32
values.
In controlled environment, when cmd.Src == Point{}
it never ends up being nil on the wire. But in production we caught errors when nested field can be encoded as nil
despite preliminary checks and without modifying operations in-between.
Again, I'm going to ask that you provide a self-contained reproduction that we can actually run.
Again, I'm going to ask that you provide a self-contained reproduction that we can actually run.
I can't provide self-contained reproduction since I can't guarantee reproduction by myself because an issue seem to be concurrency-related and rarely occurs.
Only thing I know for sure that there is PointEvent.Src
field pointing to PointCommand.Src
which is empty but not nil
. proto.Marshal
then called from multiple goroutines using &PointEvent{...}
as proto.Message
value.
If you just call proto.Marshal
on &PointEvent{Src: &Point{}}
it will write 1A 02 0A 00
as you would indeed expect.
PointEvent.Src
is nil, then proto.Marshal
will marshal it in such a way that proto.Unmarshal
will unmarshal a PointEvent
as a nil PointEvent.Src
.PointEvent.Src
is non-nil, then proto.Marshal
will marshal it in such a way that proto.Unmarshal
will unmarshal a PointEvent
as a non-nil PointEvent.Src
.If you are not observing that behavior, then there is nothing we can do without a reproduction.
If
PointEvent.Src
is non-nil, thenproto.Marshal
will marshal it in such a way thatproto.Unmarshal
will unmarshal aPointEvent
as a non-nilPointEvent.Src
.
In my case Marshal
and Unmarshal
called for different objects of different types so yes, I am observing different behaviour:
proto.Unmarshal
called for object whose type was generated from message PointCommand
which has Point src = 2;
proto.Marshal
called on object, whose type was generated from message PointEvent
which has Point src = 3;
PointEvent.Src == PointCommand.Src
) and PointCommand.Src != nil
.I don't think that it's important though.
Is there some trickery inside proto.Marshal
that involves usage of unexported fields which might modify struct fields or somehow alter behaviour of marshaling process?
Is it guaranteed that proto.Marshal
is goroutine-safe when you call it multiple time in parallel on the reference to same object?
I don't mean to be rude, I just really haven't dug really deep into internals of marshaler but it is kind of important to know.
In my case Marshal and Unmarshal called for different objects of different types so yes, I am observing different behaviour:
That is some sketchy behavior. I recommend never relying on being able to marshal type T
and unmarshal it into a type R
. Doing so is asking for compatibility issues left and right. That said, the fundamental problem in this situation is that the field numbers are different between PointComment
and PointEvent
:
PointCommand.Point
has a field number of 2PointEvent.Point
has a field number of 3I wrote a more minimal version, but it’s still outputing an empty message rather than an absent one. (It appeared that it was empty, because it was just a single message encoding. Which happened to be the sub-message not the top-level message, which is implicit.)
OK, so catching up, I find that my code is still relevant, because I get 1A 00
from just this code:
package main
import (
"fmt"
pb "github.com/puellanivis/empty-pb-message/proto"
"google.golang.org/protobuf/proto"
)
func main() {
src := &pb.Point{
}
evt := &pb.PointEvent{
Src: src,
}
data, err := proto.Marshal(evt)
if err != nil {
fmt.Println("!!!", err)
return
}
fmt.Printf("% X\n", data)
}
So, the 1A 02 0A 00
is probably the unexpected value?
So, the
1A 02 0A 00
is probably the unexpected value?
Not at all, since src
is not nil, it should be 1A 02 0A 00
and not 1A 00
as per documentation.
Not at all, since src is not nil, it should be 1A 02 0A 00 and not 1A 00 as per documentation.
If PointEvent.Src = nil
then it prints nothing. 1A 00
is: field 3, zero length
. Which makes sense because it has no non-default values.
While messages are not a “scalar value type”, any field that is not a repeated field is a “scalar message field”.
FYI, repeated fields, map fields, and messages are not scalar. All other field types are scalar.
Both
proto2
andproto3
implement presence for message-valued fields, preserving a distinction between an empty message (&pb.M{}
) and an absent one ((*pb.M)(nil)
) across marshal/unmarshal. (Neither implements presence for repeated or map fields; there is no distinction between an empty list/map and one which is not present.)
I appear to have gotten confused, because rendering just a single embedded Point
message in a PointEvent
message gave me a single zero-length message. I didn’t recognize that this was the embedded Point
not the surrounding PointEvent
. 🤦 I would like to blame not having my coffee, but I think it was well into the afternoon by that point. 😆
Closing as inactionable. Mismatching types when serializing and deserializing is going to run into strange effects sooner or later. Furthermore, the effects of this is more driven by the protobuf wire format than it is by a specific implementation. As such there's isn't anything specifically actionable on our end.
Mismatching types when serializing and deserializing is going to run into strange effects sooner or later.
I think there's some misunderstanding here. You mentioned earlier that
I recommend never relying on being able to marshal type
T
and unmarshal it into a typeR
.
This implies that conversion between T
and R
happens through wire format but it isn't the case. I am not unmarshaling into T
from binary representation of R
.
Conversion happens after unmarshaling from type T
by copying fields to R
using generated code and then object of type R
marshaled into its binary representation.
Basically, this:
var T PointCommand
proto.Unmarshal(/* data */, &cmd)
R := pd.PointEvent{Src: cmd.Src}
proto.Marshal(&R)
Is it guaranteed that proto.Marshal is goroutine-safe when you call it multiple time in parallel on the reference to same object?
Yes, so long as that object is not modified during marshaling.
Is there some trickery inside proto.Marshal that involves usage of unexported fields which might modify struct fields or somehow alter behaviour of marshaling process?
Marshaling can modify internal state for messages containing extensions, but it does so in a concurrency-safe fashoin.
If you think there's a problem here, we really need a test case that we can run that demonstrates the issue.
What version of protobuf and what language are you using? package:
google.golang.org/protobuf v1.25.0
protoc-gen-go:v1.25.0
libprotoc:3.6.1
What did you do?
II have a system which, among other things, passes messages to multiple receivers. There's simplified version of it, which should be sufficient to show how it works with protobuf messages:
schema.proto
```protobuf syntax = "proto3"; package package; option go_package = "proto.definitions/generated/package"; message Point { sint32 x = 1; sint32 y = 2; } message PointCommand { Point src = 2; } message PointEvent { Point src = 3; } ```code.go
```go package main import ( "fmt" "io" "net" "google.golang.org/protobuf/proto" pd "proto.definitions/generated/package" ) var watchers map[uint64]chan<- proto.Message type Session struct { writer io.Writer } func (sess *Session) Listen(conn net.Conn) { sess.writer = conn for { // data []byte : read from conn var cmd pd.PointCommand if err := proto.Unmarshal(data, &cmd); err != nil { panic(err) } if cmd.Src == nil { panic(fmt.Errorf("nil src")) } evt := pd.PointEvent{ Src: cmd.Src, } for _, w := range watchers { w <- &evt } } } func (sess *Session) Watch(ch <-chan proto.Message) { for msg := range ch { data, err := proto.Marshal(msg) if err != nil { panic(err) } // send data to sess.writer } } func ListenAndServe(addr string) error { l, _ := net.Listen("tcp", addr) defer l.Close() for { conn, _ := l.Accept() sess := new(Session) { var id uint64 // generate unique id ch := make(chan proto.Message) watchers[id] = ch go sess.Watch(ch) } go sess.Listen(conn) } } func main() { go ListenAndServe(":xxxx") <-make(chan struct{}) } ```While most of the wrapping code wiped out, the key part is still there: message passed without any modifications after creation and then marshaled separately for different receivers. However, wire data may not have
src
field despite explicit checking.User code of course was the main suspect and I ran application with race detector without any error output. In controlled environment, when
src
set toPoint
with default values,proto.Marshal
produces expected serialization result with zero-length field value.What did you expect to see?
Empty (but not
nil
) nested structs should always be marshaled as empty ones.What did you see instead?
Sometimes marshaled data written to
Writer
has no serializedsrc
field even though it can't benil
.