Open darioielardi opened 5 years ago
same issue
You can use the hasFoo getters to know if a field is set or not.
I agree that the pre-initialized default values are kind of strange. I do not know the historic origins of those. I'm just afraid it is too late to change this behavior without breaking too much code.
Hi, any update on this issue? This may cause huge amount of memory usage - which is not used in any case
To clarify a bit. The fields are not initialized if they are not present.
Instead you get a default value (created lazily) when you get an unset field.
Thank you for your quick response!
Using the information you provided, I have changed :
dynamic _$getND(int index) {
var value = _values[index];
if (value != null) return value;
return _getDefault(_nonExtensionInfoByIndex(index));
}
to:
dynamic _$getND(int index) {
var value = _values[index];
return value;
}
This solves the above issue. Instead of generating a ghost object on get, you return null.
Do you see any downsides for the above suggested change?
@sigurdm Appreciate your option on the above suggestion
I think the right thing to do here is to use has...
methods before reading the fields. In the original example, you should be using hasAddress
before calling the address
getter.
The proto3 spec says this about the default values of message fields:
For message fields, the field is not set. Its exact value is language-dependent. See the generated code guide for details.
So I think it's not necessarily a bug in the Dart implementation that a message field is initialized with some default when it's missing in an encoding as long as you can check if it was provided in the message. Here's a test using the proto definition in the bug report:
void main() {
// Empty message
{
User user = User.fromBuffer(<int>[]);
print(user.hasAddress()); // false
}
// address: {}
{
User user = User.fromBuffer(<int>[0x0a, 0x00]);
print(user.hasAddress()); // true
}
}
I will have to think more about @yanivshaked's suggestion, but I think it won't work. The problem is _$getND
is used in files generated for proto2 message as well, but the suggested change only makes sense in proto3 message fields. So it will break code generated for proto2 messages.
In any case, we probably won't be able to change the current getters for message fields as it will break a lot of code. We could however provide an alternative that returns null
for unset message fields in proto3 messages. (I will have to check what the ideal behavior here should be in proto2)
I think returning null
plays very nicely with Dart's many facilities for working with nullable types (?.
, ??
, promotion on testing etc), but this does require rethinking of the whole API. It is not something do be done without a comprehensive design that is subject to review. Fiddling with some methods to get an effect is not the way forward.
Regarding _$getND
, it is an internal implementation detail.
There is no reason that proto2 and proto3 need to use the same method, or that stored field values are accessed via an index, or that null
internally indicates a missing value. These are all implementation details that can change while providing the same API.
I just noticed this part in the proto3 spec default values section, right after the part I quoted in my previous comment:
Note that for scalar message fields, once a message is parsed there's no way of telling whether a field was explicitly set to the default value (for example whether a boolean was set to false) or just not set at all: you should bear this in mind when defining your message types
It seems like the spec assumes that existence of a field is indicated in the value of the field in the deserialized message and there are no hasX
methods like the ones we provide in protobuf.dart. With hasX
methods it's possible to tell whether a field was explicitly set or not, even when the field type is a scalar.
So when the spec says "For message fields, the field is not set. Its exact value is language-dependent" it doesn't mean (though they're not explicit about this) that we can use an arbitrary value for the message as the default, rather it means it should be indicated as null
(e.g. in Java, Dart, ...), or Option
/Maybe
in languages with sum types and no null (e.g. Rust, OCaml, Haskell, ...), or nil
in languages like Scheme, Lua, etc. the exact value is language dependent, but it's not a valid message value.
I still think what we have in protobuf.dart library should be good enough and this issue should not block anyone (just use has...
method for your field), but if my reading of the spec, as described above, is correct, then returning null
for missing fields with a message type should be more consistent with protobuf implementations in other languages and is more in line with what the spec says.
Maybe something to consider for the next major version bump.
Just came across this design document about field presence in proto3 which is saying the same thing I wrote in my previous comment above:
When serializing, fields with no presence are not serialized if they contain their default value.
- ...
- For messages, the default is the language-specific null value.
Update: b/217596113 reports the same issue.
I checked how protobuf implementation in other langauges do this.
Summary:
Kotlin is added to both because it provides two different getter: one that returns empty message, one that returns null
.
Returns a read-only empty message.
https://developers.google.com/protocol-buffers/docs/reference/cpp-generated#embeddedmessage
const Bar& foo() const
: Returns the current value of the field. If the field is not set, returns a Bar with none of its fields set (possiblyBar::default_instance()
).
Returns nil
for unset message fields, for both required
and optional
.
https://developers.google.com/protocol-buffers/docs/reference/go-generated
// The generated code is the same result if required instead of optional ... Message fields can be set to nil, which means that the field is unset, effectively clearing the field. This is not equivalent to setting the value to an "empty" instance of the message struct.
Returns an empty message.
https://developers.google.com/protocol-buffers/docs/reference/java-generated
If the field is not set,
getFoo()
will return a Foo instance with none of its fields set (possibly the instance returned byFoo.getDefaultInstance()
).
Returns empty message, but also provides an accessor that returns null
if the field is not present.
https://developers.google.com/protocol-buffers/docs/reference/kotlin-generated
If the field is not set, returns the default value ... Note that there is no special handling of submessages ... We also provide an extra field accessor
FooOrNull
for all optional message fields that returns the value of the field if it is present and null otherwise.
As far as I can see it's not documented.
However there's something interesting in Python implementation: if a message field is not set, you can still set a field of the field (i.e. msg.nestedMsg.field = 1
when nestedMsg
is not set), and it automatically sets the message field:
https://developers.google.com/protocol-buffers/docs/reference/python-generated#embedded_message
Message types work slightly differently. You cannot assign a value to an embedded message field. Instead, assigning a value to any field within the child message implies setting the message field in the parent
I considered a similar semantics for Dart implementation, but I think it's a bad idea. Consider:
Msg myMsg = Msg.create();
NestedMsg nestedMsg = myMsg.nestedMsg;
...
myMsg.nestedMsg.field = 123;
With the semantics described in Python API (quoted above), after the last line above, will nestedMsg.field
return 123
, or 0
? It depends:
nestedMsg
was previously unset, and nestedMsg
returns empty message (rather than None
), then it will return 0
, as the second line will return the default message, and last line will update the field with a new message.nestedMsg
was previously set, then it will return 123
as last line updates the same message nestedMsg
references to.It's not difficult to see how much headache this kind of semantics will cause if unset message fields are not None
.
Even when unset message fields are None
it seems like a bit too much magic to me.
Returns nil
.
https://developers.google.com/protocol-buffers/docs/reference/ruby-generated#embedded_message
For submessages, unset fields will return nil, so you can always tell if the message was explicitly set or not
Unclear. It says you can assign null
to clear a message field, but doesn't say what happens if you read such a field.
https://developers.google.com/protocol-buffers/docs/reference/csharp-generated#singular
Message fields can be set to null values, which is effectively clearing the field. This is not equivalent to setting the value to an "empty" instance of the message type
Returns empty message
https://developers.google.com/protocol-buffers/docs/reference/objective-c-generated#singular3
The default "empty" value for a message is an instance of the default message. To clear a message value it should be set to nil. Accessing a cleared message will return an instance of the default message and the hasFoo method will return false.
Returns nil
https://developers.google.com/protocol-buffers/docs/reference/php-generated#embedded_message
A field with a message type defaults to nil, and is not automatically created when the field is accessed
C# they're refusing to implement nullable types because they say that it would cause languages that don't have nullable types to have different behavior.
optional should be nullable. They should be one to one correspondance. This is breaking languages with nullable types and hobbling their compile time safety of protobuf derived classes and null is absolutely a valid value for all fields because it literally means "no entry" which is entirely valid and optional defines if that is a possible case.
Everyone working on protobuf needs to get together and agree: optional = nullable and get all implementations in line and consistent ASAP.
agreed, this is weird. switched from json to protobufs and now dealing with this problem.
you could also add a method called GetUser() or GetX() just like protobuf does for golang. This should return null or object/primitive
Given, for example, the following proto definitions:
the current behavior, when decoding an incoming User message, is to initialize the Address generated class even if the field is unset, filling it with empty ( zero ) values, which should not happen.
This is crucial for a variety of situations, for example using value wrappers, which are Protobuf Well-Known Types and represents the Google solutions to distinct a scalar field zero value from its default value. This is actually not possible with the current dart implementation, because the wrapper generated class is instantiated anyway and the inner value becomes a zero value.
Also, I think this is what was referred in #245.