Closed timofurrer closed 6 months ago
At least adding a basic protobuf definition with input that repros would be helpful for understanding.
@puellanivis I've updated the description.
Hm, it works if you set protojson.UnmarshalOptions{ DiscardUnknown: true }
I suppose the question is, should reserved field names be discarded as “known”. From the text of https://protobuf.dev/programming-guides/proto3/#fieldreserved:
To make sure JSON and TextFormat instances of your message can still be parsed, also add the deleted field name to a reserved list.
This suggests that the JSON should continue to parse with the reserved field name.
Yes, I'm aware that I can discard unknown fields, but that would discard "all" unknown fields - which is not what I want.
This suggests that the JSON should continue to parse with the reserved field name.
I agree - so that means that protojson doesn't behave according to this, right?
This suggests that the JSON should continue to parse with the reserved field name.
I agree - so that means that protojson doesn't behave according to this, right?
We’ll need more input from contributors, but I think from my reading that this is a misbehavior. I’ve also advised someone based on this guidance that I suspected reserving a field name would be that it would be recognized as known but ignored, not breaking like this.
We’ll need more input from contributors
@puellanivis anything we (or I) can do to get that "input" ?
You could work on a gerrit change request yourself. I think the in-house Google devs are otherwise busy with editions support.
Thanks for reporting this. I think this should be fixed. The behavior is documented in the spec and the go implementation is not following it. I double checked and the C++ implementation for example follows the spec.
I'll try to prepare a change for this and test is in Google first to see if anyone relies on the old behavior (I don't expect anyone to but you never know).
@lfolger that's good news that you'll have a look at this - it's highly appreciated, thank you 🤝
While looking into this I found out that this is already the behavior today. Could you clarify the steps to reproduce?
These are the tests that I added and they succeeded without errors: https://go-review.googlesource.com/c/protobuf/+/581095
I just realized this is about protojson. Let me try to add a test for that as well.
Unfortunately, the C++ implementation has the same issue. Usually the C++ implementation is right and diverging from it means in most cases diverging from the spec. I reached out to the core protobuf team to clarify. I'll report back once I know more.
Sorry for the back and forth.
Short update. There is still an ongoing discussion internally about what the behavior should be. I will come back to this once I know more.
AFter discussion it turned out that the current behavior is a "bug" in the Text proto parser and intended behavior in the JSON format. I call this "bug" because the implementation is more permissive that it has to be which most users don't mind.
A documentation update to the official documentation will follow shortly.
If you like the behavior to be changed, you would need to file a feature request against the protobuf.
Just to manage expexctations, I'm not very hopeful that they will accept this but you should give it a try nonetheless. Maybe there will be enough users and use cases that need this and they will accept it. If that happens, we are happy to add this to the Go implementation.
I'm closing this issue for now but feel free to reopen it if you have more questions.
Hi all! I've been following this issue. I don't think this has been mentioned, I'd like to point out an inconsistency between binary proto unmarshaling and JSON->proto message unmarshling. Reserved fields are ignored in the binary format and you don't get an error. I think this makes sense. I wonder why is it not the same for any other format that can be unmarshaled into a proto message?
If it is how it is and the behavior is not going to change, then:
Yet another way to think about it: currently reserved
is a no-op for JSON unmarshaling. To me it is a signal that something is not right - either the spec defines something useless (if we forget about the binary format, with which JSON is not consistent) or the implementation doesn't follow the spec.
p.s. Would be great to hear some of the counter-arguments from the internal discussion. To be honest, it's a bit annoying that this is being discussed behind closed doors. There are lots and lots of users of the format and libraries and there is an open issue tracker and this project is open source in the first place. Why all that if not for transparency and collaboration? So, I'm sure a bit more transparency would be greatly appreciated by the community. Thank you.
The semantics of it has pretty much always been documentation of fields to not re-use because they’ve been removed. I believe it is an error during the compilation if a field attempts to reuse either a field number or field name, so it is not strictly a fully no-op.
The spec literally mentions that reserved can be used so that JSON can still be parsed (from last sentence in second paragraph here):
If you update a message type by entirely deleting a field, or commenting it out, future developers can reuse the field number when making their own updates to the type. This can cause severe issues, as described in Consequences of Reusing Field Numbers.
To make sure this doesn’t happen, add your deleted field number to the reserved list. To make sure JSON and TextFormat instances of your message can still be parsed, also add the deleted field name to a reserved list.
I agree with @ash2k here that it's kinda frustrating to fix a bug by changing the spec.
Assuming there is NO way that this gets fixed (which would be really sad), would you be open to add an opt-in config to the json marshaller to enable this behavior? .. and make json parsing align to the behavior to binary parsing?
"Text proto parser has a bug" - perhaps this issue can be used to fix it
First of all, when I said the Go implementation has a bug in the text parser, I meant that it accepts messages with unkown but reserved fields. To be clear only C++ and Go do this. All other implementation we know don't do this. This means JSON and text parsers behave the same (except for C++/Go).
Would be great to hear some of the counter-arguments from the internal discussion. To be honest, it's a bit annoying that this is being discussed behind closed doors.
About the discussion not being public, I was not even part of that discussion. I think if you want a public discussion, you need to file a feature request against the protobuf main repository which handles changes to the spec. It was partially me fault in that I reached out to them internally rather than just asking you to file a bug against the proto repo directly. Sorry about that.
I'd like to point out an inconsistency between binary proto unmarshaling and JSON->proto message unmarshling. Reserved fields are ignored in the binary format and you don't get an error
The accepting of unknown field in the binary format works very different. It accepts all unknown fields. It is completely independent of the reserved fields. You can get the same behavior by using the appropriate unmarshaloptions for protojson and prototext.
Again, there is no point in discussion this here. We, the Go protobuf maintainers, are not responsible for the spec. If you would like to see the spec changed, file a feature request against the proto buf repository. All we can do is to follow the spec.
What version of protobuf and what language are you using? Version: Go, v1.5.4
What did you do?
We have a message with a reserved field and we want to unmarshal some JSON that has this field.
Running this results in an unmarsahling error like
unknown field "should_be_ignored"
.What did you expect to see?
The JSON unmarshals correctly, but ignores the reserved fields
What did you see instead?
The decoder fails to unmarshal complaining about not knowing the reserved field.
Anything else we should know about your project / environment?