andrewhickman / prost-reflect

A protobuf library extending prost with reflection support and dynamic messages.
https://crates.io/crates/prost-reflect
Apache License 2.0
84 stars 19 forks source link

Getting vs. taking unset fields - inconsistent behavior #61

Open 124C41p opened 1 year ago

124C41p commented 1 year ago

When calling get_field_by_number on an unset scalar field, the default value is returned. When calling take_field_by_number, nothing is returned instead. (Tested on current version 0.11.4) More precisely:

let desc = DescriptorPool::global()
    .get_message_by_name("google.protobuf.BoolValue")
    .unwrap();
let msg = DynamicMessage::new(desc);
let value = msg.get_field_by_number(1).unwrap().into_owned();
println!("{value:?}")

This snippet prints Bool(false).

let desc = DescriptorPool::global()
    .get_message_by_name("google.protobuf.BoolValue")
    .unwrap();
let mut msg = DynamicMessage::new(desc);
let value = msg.take_field_by_number(1).unwrap();
println!("{value:?}")

This snippet panics.

124C41p commented 1 year ago

The second example also panics if the field is explicitly set to the default value. That is, the following snippet panics as well:

let desc = DescriptorPool::global()
    .get_message_by_name("google.protobuf.BoolValue")
    .unwrap();
let mut msg = DynamicMessage::new(desc);
msg.set_field_by_number(1, Value::Bool(false));
let value = msg.take_field_by_number(1).unwrap();
println!("{value:?}")
andrewhickman commented 1 year ago

I've always been slightly uneasy about the way the get_field methods return the default value if a field is unset. Its obviously useful for scalar fields in proto3, but for oneof fields or message types, it doesn't seem correct because it doesn't distinguish between the field being unset, and set with the default value.

I'd be open to having a take_field_by_number_or_default method which does what you want, and I'd like to have provide variants of get_field as well in future. The difference should probably be called out more explicitly in the docs. I'd be interested to hear others' thoughts on the API.

Its possible to emulate the behaviour of get_field using Value::default_value_for_field

let desc = DescriptorPool::global()
    .get_message_by_name("google.protobuf.BoolValue")
    .unwrap();
let field = desc.get_field(1).unwrap();

let mut msg = DynamicMessage::new(desc);
msg.set_field(&field, Value::Bool(false));
// No panic
let value = msg.take_field_by_number(1).unwrap_or_else(|| Value::default_value_for_field(&field));
println!("{value:?}")
124C41p commented 1 year ago

I see. It just felt counterintuitive to me that functions with very similar names have completely different semantics (when it comes to default values). It also is counterintuitive that asking for a value which I just set might result in None.

[...] it doesn't seem correct because it doesn't distinguish between the field being unset, and set with the default value. [...] I'd be interested to hear others' thoughts on the API.

I thought about this for a while now. I like the idea to have get/take_*_or_default functions in the API for convenience. But I think even more valuable would be get/take variants which exactly reflect the wire format. I.e. ideally take_field_by_number(i) should return Some(false) if it was serialized before in that way (although usually default values are omitted), and None otherwise. It would also be nice to have two variants of set functions like set_field_skip_default and set_field_preserve_default. In this way you give full knowledge and control over the wire format to the user. It might also reduce confusion.

(I might actually need these functions to imitate the semantics of python_betterproto with this crate accurately. But I'm not sure about this - just slowly getting there...)

andrewhickman commented 1 year ago

One idea I was toying with was an "entry-style" API for fields, e.g.

message.get_field(&descriptor).take_or_default() // always returns a value
message.get_field(&descriptor).take() // returns option
message.get_field(&descriptor).value_or_default() // always returns a value
message.get_field(&descriptor).value() // returns option
message_get_field_by_number(1).unwrap().value_or_default()
message_get_field_by_name("foo").unwrap().clear()

It would reduce some of the combinatorial explosion of different methods, and there's less chance for silent breakage like with changing the semantics of get_field

I'm not convinced by the skip_default/preserve_default APIs - I think treating default fields as equivalent to unset matches the proto spec most closely