FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0
316 stars 136 forks source link

add descriptor loader #87

Closed knoguchi closed 7 years ago

knoguchi commented 7 years ago

This is a preliminary work for the #78 "create ProtoSchema from descriptor file".

A "descriptor file" is a set of descriptors. A descriptor is a binary representation of the .proto file. Typically one (or more) of the descriptors is the main, and the others are dependencies.

This reader creates a NativeProtobufSchema object from the main descriptor. Dependencies are "copied" into the main descriptor before generating the object. I chose copying over implementing the import mechanism per #86 discussion. The import statement is a potential security risk, and it would certainly add the complexity.

This descriptor reader should work for simple message types that do not have references to other nested message types. #73 needs to be fixed.

Note: descriptor.proto was copied from Google Protobuf. The reserved fields were commented out because ProtoParser could not parse. See #84.

cowtowncoder commented 7 years ago

Ok, I think this could be a good starting point. There are some things I'd probably change, such as making ProtobufMapper initiate work instead of reading things in constructor (which is typically not something to avoid), but in general I think it makes sense. This would also remove the dependency from reader to constructing mapper which seems wrong way around. I think it also would make sense to take different kinds of input sources: assuming a String defines a resource path seems limiting; it may be one option but I assume there are other cases such as reading from an external file.

But one question first: what is intended usage pattern for ProtoFiles? I'd prefer not exposing 3rd party types as-is through API, if they can be encapsulated somehow. Can they be converted to protoc definition? Test seems to just verify that something was read, but I assume these are useful for some actual usage.

knoguchi commented 7 years ago

Thanks for the comments. The reason why I exposed ProtoFiles map is to let the driver to pick the main proto file, then call NativeProtobufSchema.construct(ProtoFile) API.
I will modify it to be something like below to hide the 3rd party API.

// in the DescriptorReader class
private Map<String, ProtoFile> protoFileMap;

NativeProtobufSchema forProto(String protoName) {
    return NativeProtobufSchema.construct(protoFileMap.get(protoName));
}

ProtoFile.toSchema() method can generate the .proto string but I did not intend the usage.

The test case barely shows that the two .proto files (main.proto and other.proto) are merged. I will add tests that actually check the correctness of the output.

Other items that you pointed out:

  1. I will move the ProtobufMapper for the descriptor schema out of the reader class.
  2. add input sources: URL InputStream byte[]
cowtowncoder commented 7 years ago

Unfortunately haven't had time to spend on this, but wanted to add a note to say it's still on my radar.

cowtowncoder commented 7 years ago

Actually I think I can make some changes here from API perspective, but retain functionality. Nothing big, just bit of touch up.

cowtowncoder commented 7 years ago

Ok so did refactoring, which is sizable by lines of code. But didn't much change functionality: mostly just changes access point, which is now via ProtobufMapper.