tafia / quick-protobuf

A rust implementation of protobuf parser
MIT License
452 stars 87 forks source link

Codegen: unknown fields should be retained #93

Open scottlamb opened 6 years ago

scottlamb commented 6 years ago

If I parse a message with unknown fields and then serialize it, I'd like them to be written back out. This is a big safety improvement; it means that if there are still older clients accessing a shared database, they don't break things they don't understand.

This matches the behavior of the official C++ implementation for both proto2 and proto3. (proto3 originally dropped support for this, but it was added back after a huge backlash.)

tafia commented 6 years ago

Yes, like #92 this should be implemented with a dedicated flag on pb-rs.

imoverclocked commented 6 years ago

Maybe we should optionally retain them since proto3 is "in the wild" with this behavior now? It's possible some people are relying on not having extra fields once the local proto3 definition removes the field.

Use-case: someone imports com.googlecode.protobuf.format.*Format and uses it to output json/xml/etc. Adding unknown fields back into proto3 in a new version now pollutes the output and potentially breaks known schemas.

scottlamb commented 6 years ago

Use-case: someone imports com.googlecode.protobuf.format.*Format and uses it to output json/xml/etc. Adding unknown fields back into proto3 in a new version now pollutes the output and potentially breaks known schemas.

Those formatters probably drop unknown fields. I haven't looked at them, but without a descriptor that mentions the field, all they have is a tag number and wire type (e.g. varint); they can't know the correct name, type (e.g. int32/int64), if it's a repeated field, etc., so they have limited options when outputting formats that expect textual field names and such.

In the official C++ client, there's no global/per-file/per-message knob for unknown field behavior AFAIK. You can explicitly drop unknown fields on a particular object when desired via myproto.mutable_unknown_fields().Clear().