I'd like to revisit this, because I'm currently dealing with some protobuf translation issues but want to get input before I put in too much work. I think it's nice to use the protobuf classes directly for some use cases (and we could even enable stub generation to get type checking of the messages). Also, there's an existing `json_format` module that provides `MessageToDict` and `ParseDict` functions, so we can easily translate between proto messages and Python dicts (or JSON strings) in the canonical format without writing any additional code. Thoughts?
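For example, the round trip is roughly this (using protobuf's own `FieldDescriptorProto` here just so the snippet is self-contained; any generated message class works the same way):

```python
from google.protobuf import descriptor_pb2, json_format

# Any generated message class works; FieldDescriptorProto ships with
# protobuf itself, so this snippet has no other dependencies.
msg = descriptor_pb2.FieldDescriptorProto(name="row", number=1)

# Proto message -> plain Python dict in the canonical JSON mapping.
as_dict = json_format.MessageToDict(msg)  # {'name': 'row', 'number': 1}

# Plain dict -> proto message; unknown fields and bad types are rejected.
roundtrip = json_format.ParseDict(as_dict, descriptor_pb2.FieldDescriptorProto())
assert roundtrip == msg

# json_format.MessageToJson / json_format.Parse do the same for JSON strings.
```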
What do you see as the advantages of using the protobuf classes, instead of just dictionaries? These are serialized objects; we mostly interact with them in code that converts into or out of the representation. Is that code noticeably different if we use the protobuf classes instead of dicts?
I agree that outside of conversion code, when actually operating on the representation, you want more type safety than a dictionary.
One issue is that right now we are hand-rolling the serialization code to try to be compatible with the canonical JSON representation of protobuf messages, but we differ in some cases. For example, protos represented as JSON messages can omit fields that have default values, but in `GridQubit` we currently require that the `row` and `col` fields be present, which does not agree with the canonical representation. That code is marked as deprecated, but it's symptomatic of trying to roll our own serialization code where we could just use the protobuf libraries.
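To make the mismatch concrete, the canonical behavior looks like this (again using a stock protobuf message as a stand-in for a hypothetical `GridQubit` message with `row`/`col` fields):

```python
from google.protobuf import descriptor_pb2, json_format

# 'number' is left at its default (0), so the canonical JSON form omits it...
msg = descriptor_pb2.FieldDescriptorProto(name="q0")
assert json_format.MessageToDict(msg) == {"name": "q0"}

# ...and the canonical parser accepts input with the field absent, filling
# in the default. Our hand-rolled code would reject input shaped like this.
parsed = json_format.ParseDict({"name": "q0"}, descriptor_pb2.FieldDescriptorProto())
assert parsed.number == 0
```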
I agree that from the perspective of a user there's almost no difference because, as you say, it's just serialized data. But when implementing backend logic that receives values as protobuf messages, it's odd to have to convert to dicts first in order to deserialize into cirq objects. This backend logic does not even live in cirq, which is part of why I seem to be the only one running into issues around this :-)
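Concretely, the extra hop looks like this (with a hypothetical `sweep_from_dict` standing in for the current dict-based deserializer, and a stock protobuf message as the incoming value):

```python
from google.protobuf import descriptor_pb2, json_format

# Hypothetical dict-based deserializer, standing in for cirq's current code.
def sweep_from_dict(d: dict):
    return d  # imagine this builds a cirq.Sweep

# The backend receives a proto message...
msg = descriptor_pb2.FieldDescriptorProto(name="sweep")

# ...but must flatten it to a dict just to reach the dict-based API:
sweep = sweep_from_dict(json_format.MessageToDict(msg))

# A proto-accepting deserializer would take the message directly:
# sweep = sweep_from_proto(msg)
```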
Another thing I'll say in favor of writing protobuf (de)serialization functions is that there are far fewer magic strings in the code, because we don't have to write the field names manually.
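Roughly the difference, sketched with cirq's `GridQubit` (the proto message here is a hypothetical one with `row`/`col` fields):

```python
import cirq

# Dict-based: 'row' and 'col' are magic strings; a typo like 'rwo'
# only fails at runtime, and nothing checks the value types.
def qubit_from_dict(d: dict) -> cirq.GridQubit:
    return cirq.GridQubit(row=d['row'], col=d['col'])

# Proto-based (hypothetical message with row/col fields): the names are
# attributes on a generated class, so stubs and linters can check them.
def qubit_from_proto(msg) -> cirq.GridQubit:
    return cirq.GridQubit(row=msg.row, col=msg.col)
```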
Another wrinkle: the official JSON format specification for protobuf says that field names are mapped to `lowerCamelCase` unless a field is annotated with a `json_name` option. We are definitely not following that in our hand-rolled serialization code.
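For instance, `json_format` applies that mapping automatically (using a stock protobuf message whose `json_name` field happens to demonstrate the conversion):

```python
from google.protobuf import descriptor_pb2, json_format

msg = descriptor_pb2.FieldDescriptorProto(name="col", json_name="c")

# Canonical mapping: snake_case 'json_name' becomes lowerCamelCase 'jsonName'.
assert json_format.MessageToDict(msg) == {"name": "col", "jsonName": "c"}

# There is an escape hatch if we ever need the snake_case names:
assert json_format.MessageToDict(msg, preserving_proto_field_name=True) == {
    "name": "col",
    "json_name": "c",
}
```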
The datapath from cirq on a client's machine to the quantum computers in the cloud should be simple and easy to maintain.
Rolling our own JSON format and shoe-horning it into protobuf-compatible shape somewhere along the cirq-cloud datapath is highly likely to result in bugs, outages and tears.
Therefore, we should avoid hand-rolled serialization code. Since protobufs have a standardized JSON representation that comes with a tested and battle-hardened implementation, using protobufs is the simplest choice with the lowest maintenance cost down the road.
Fixed by #1717
When code generated from protocol buffer defs was removed in #825, a number of serialization/deserialization functions that used them (e.g. `sweep_to_proto()`) were changed to return/accept `dict` instead. We have recently undone the removal (see #1085). Should we also undo the serialization/deserialization change? Protocol buffers offer stronger type safety than `dict`.
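For illustration, the type-safety difference in the Python protobuf runtime (using a stock message class as a stand-in for cirq's message types):

```python
from google.protobuf import descriptor_pb2

# Proto messages check field names and types at assignment time.
msg = descriptor_pb2.FieldDescriptorProto()
msg.number = 3  # OK: int32 field
try:
    msg.number = "three"  # wrong type
except TypeError:
    print("rejected immediately")

# A dict accepts anything; the error surfaces much later, if at all.
d = {}
d["number"] = "three"
```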