apple / swift-protobuf

Plugin and runtime library for using protobuf with Swift
Apache License 2.0

Support protobuf text format serialization and deserialization #36

Closed by BradLarson 7 years ago

BradLarson commented 8 years ago

I've been able to pull in binary protobuf data using init(protobuf:), and I've seen init(json:) to bring in JSON, but I can't seem to find an input initializer for text format protobuf data.

Specifically, I'm trying to pull in the network definition files from the Caffe framework, which are specified as .prototxt files (an example here). I have all the types from the protocol buffer compiler, run against their caffe.proto definition, and can pull in the .caffemodel binary protobuf from that page. I just can't figure out how to bring in the text format protobuf network data there.

My apologies for asking this as a question, but the documentation and code didn't make it clear if this was present or if I had overlooked something.

jcanizales commented 8 years ago

Yeah, it's kind of hidden, but the text format isn't implemented yet. IIRC, you can convert a file from text to binary and vice versa by calling protoc with the --encode and --decode flags (protoc --help should describe them).

tbkka commented 8 years ago

It is not yet implemented, but the current code was designed to make it straightforward to add this sort of capability.

For serialization, the easiest place to start would be from the ProtobufBinaryEncoding.swift source file. Deserialization is slightly trickier because you have to map the proto text field names to proto numbers before you can invoke the decodeField method to set the value. The JSON decoder might be an easier place to start.
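To illustrate the name-to-number step described above, here is a minimal sketch, not swift-protobuf code. ToyLayer, fieldNumbers, decodeToyText, and the decodeField(number:rawValue:) signature are all hypothetical stand-ins. The sketch assumes one scalar `field_name: value` pair per line and ignores nested messages, repeated fields, escapes, and comments.

```swift
import Foundation

// Hypothetical toy message; none of these names come from swift-protobuf.
struct ToyLayer {
    var name: String = ""
    var numOutput: Int = 0

    // Text-format field names mapped to their proto field numbers (assumed values).
    static let fieldNumbers: [String: Int] = ["name": 1, "num_output": 2]

    // Stand-in for a number-based decodeField method.
    // Returns false if the number is unknown or the value cannot be parsed.
    mutating func decodeField(number: Int, rawValue: String) -> Bool {
        switch number {
        case 1:
            name = rawValue
            return true
        case 2:
            guard let value = Int(rawValue) else { return false }
            numOutput = value
            return true
        default:
            return false
        }
    }
}

// Parses lines of the form `field_name: value`, one scalar field per line.
// Returns nil on a malformed line, an unknown field name, or a bad value.
func decodeToyText(_ text: String) -> ToyLayer? {
    var message = ToyLayer()
    for line in text.split(separator: "\n") {
        let parts = line.split(separator: ":", maxSplits: 1)
        guard parts.count == 2 else { return nil }
        let fieldName = String(parts[0]).trimmingCharacters(in: .whitespaces)
        var value = String(parts[1]).trimmingCharacters(in: .whitespaces)
        // Strip surrounding double quotes from string values.
        if value.count >= 2, value.hasPrefix("\""), value.hasSuffix("\"") {
            value = String(value.dropFirst().dropLast())
        }
        // Map the field name to its number, then dispatch by number.
        guard let number = ToyLayer.fieldNumbers[fieldName],
              message.decodeField(number: number, rawValue: value) else { return nil }
    }
    return message
}
```

Calling decodeToyText("name: \"conv1\"\nnum_output: 96") should give a ToyLayer with both fields set. The lookup table is the only part the binary decoder does not already need, which is why the JSON decoder, which also works from field names, may be the closer model.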

thomasvl commented 8 years ago

Out of curiosity, what are you looking to use it for?

It was never really meant as a format for data exchange, but it can be useful in unit tests, etc. Even there, using protoc to turn text into binary for a unit test has the advantage that the apps don't have to ship the parsing code, which keeps them smaller.

BradLarson commented 8 years ago

@thomasvl - The specific case that I'm looking to support is neural network definition files. The Caffe framework, one of the more common ways to define and train convolutional neural networks, uses text format protobuf files to define neural network architectures. Binary protobuf format files hold the final trained network weights for that architecture, and I can read those without a problem.

The text format is used to provide a human-readable and human-modifiable description of a network architecture. I'd like to be able to pull that in as well as be able to write it out for use in anything that employs the Caffe framework. It could allow for an easy way to instantiate, design, and run neural networks on Mac, iOS, and Linux.

I'll see if I can build on the starting points mentioned above for serialization / deserialization.

BradLarson commented 7 years ago

As an update, I've hacked together text format serialization in my fork over here: https://github.com/BradLarson/swift-protobuf

I copied over the JSON encoding code and modified it to the point where it produces output that seems to match protoc's generated text format when testing against the TestAllTypes message type.
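For readers unfamiliar with the general shape of that output, here is a rough sketch of a text format writer. It is not the code in the fork. ToyValue, ToyField, and encodeToyText are hypothetical names, and the sketch handles only integers, strings, and nested messages, with minimal string escaping. It is not claimed to match protoc's output in every detail.

```swift
import Foundation

// Hypothetical toy value model; not swift-protobuf's encoder types.
enum ToyValue {
    case int(Int)
    case string(String)
    case message([ToyField])
}

struct ToyField {
    let name: String
    let value: ToyValue
}

// Writes scalars as `name: value` and nested messages as `name { ... }`,
// indenting nested fields by two spaces per level.
func encodeToyText(_ fields: [ToyField], indent: String = "") -> String {
    var out = ""
    for field in fields {
        switch field.value {
        case .int(let number):
            out += "\(indent)\(field.name): \(number)\n"
        case .string(let string):
            // Escapes only backslashes and double quotes; a real encoder escapes more.
            let escaped = string
                .replacingOccurrences(of: "\\", with: "\\\\")
                .replacingOccurrences(of: "\"", with: "\\\"")
            out += "\(indent)\(field.name): \"\(escaped)\"\n"
        case .message(let nested):
            out += "\(indent)\(field.name) {\n"
            out += encodeToyText(nested, indent: indent + "  ")
            out += "\(indent)}\n"
        }
    }
    return out
}
```

Passing a `name` string field and a `param` message field containing `num_output` produces three kinds of lines: a quoted string, an opening brace for the nested message, and an indented integer inside it.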

I don't have a full suite of tests for this yet, and I'm not sure how it's handling bytes, but it looks like the text format export is moderately functional.

The decoding will be a challenge, but at least I seem to have half of this working.

BradLarson commented 7 years ago

The above-linked pull request is my finished work on the serialization and deserialization of these files. It's crude, but it does appear to load and save all the text format protobuf files I've tried, and it satisfies my current needs.

tbkka commented 7 years ago

Fixed via #169