EasyProtoBuf is a single-header C++11 ProtoBuf library that is
Sorry, I fooled you... It's even easier!
Codegen translates .proto files into plain C++ structures and generates encode/decode functions (de)serializing these structures into ProtoBuf format. So, if you know how to use C++ structs, you just learned how to use EasyProtoBuf. Scrap the docs, and have a nice beer! The rest is written for water lovers.
Library features:
Codegen features:
Files:
Portability:
<cstdint>
CI: while the final goal is to support any C++11 compiler, so far we tested only:
Implemented so far:
From this ProtoBuf message definition...
message Person
{
    required string name    = 1 [default = "AnnA"];
    optional double weight  = 2;
    repeated int32  numbers = 3;
}
... Codegen generates the following C++ structure...
struct Person
{
    std::string name = "AnnA";
    double weight = 0;
    std::vector<int32_t> numbers;
    ...
};
... that follows the official ProtoBuf guidelines on the ProtoBuf->C++ type mapping, while wrapping repeated fields in std::vector.
And on top of that, Codegen generates two functions that encode/decode Person in the ProtoBuf wire format:
// Encode Person into a string buffer
std::string protobuf_msg = easypb::encode(person);
// Decode Person from a string buffer
Person person2 = easypb::decode<Person>(protobuf_msg);
And that's all you need to know to start using the library. Check technical details in Tutorial.
Even if you are going to implement your own encoder or decoder, we recommend using Codegen to get a blueprint for your code. For Person (see above), the generated code is:
void Person::encode(easypb::Encoder &pb) const
{
    pb.put_string(1, name);
    pb.put_double(2, weight);
    pb.put_repeated_int32(3, numbers);
}

void Person::decode(easypb::Decoder pb)
{
    while(pb.get_next_field())
    {
        switch(pb.field_num)
        {
            case 1: pb.get_string(&name); break;
            case 2: pb.get_double(&weight); break;
            case 3: pb.get_repeated_int32(&numbers); break;
            default: pb.skip_field();
        }
    }
}
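To make the decode loop above less of a black box, here is a minimal sketch of what a get_next_field / skip_field cycle can look like at the wire level. ToyDecoder is a hypothetical stand-in written for this illustration, not the real easypb::Decoder; only the method names are borrowed from the generated code, and the internals are an assumption based on the public ProtoBuf wire format (tag = field number << 3 | wire type).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical cursor over an encoded message; NOT the real easypb::Decoder.
class ToyDecoder
{
    const std::string &buf_;
    std::size_t pos_ = 0;

    // Base-128 varint: 7 payload bits per byte, high bit means "more follows".
    uint64_t read_varint()
    {
        uint64_t value = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            assert(pos_ < buf_.size());
            unsigned char byte = static_cast<unsigned char>(buf_[pos_++]);
            value |= static_cast<uint64_t>(byte & 0x7F) << shift;
            if ((byte & 0x80) == 0) break;
        }
        return value;
    }

    void advance(uint64_t n)
    {
        assert(n <= buf_.size() - pos_);
        pos_ += static_cast<std::size_t>(n);
    }

public:
    uint32_t field_num = 0;
    uint32_t wire_type = 0;

    explicit ToyDecoder(const std::string &buf) : buf_(buf) {}

    // Read the next field tag; returns false once the buffer is exhausted.
    bool get_next_field()
    {
        if (pos_ >= buf_.size()) return false;
        uint64_t tag = read_varint();
        field_num = static_cast<uint32_t>(tag >> 3);
        wire_type = static_cast<uint32_t>(tag & 7);
        return true;
    }

    // Length-delimited payload: varint length followed by that many bytes.
    std::string get_string()
    {
        assert(wire_type == 2);
        uint64_t len = read_varint();
        assert(len <= buf_.size() - pos_);
        std::string s = buf_.substr(pos_, static_cast<std::size_t>(len));
        pos_ += static_cast<std::size_t>(len);
        return s;
    }

    // Step over a field the caller does not handle, based on its wire type.
    void skip_field()
    {
        switch (wire_type) {
        case 0: read_varint(); break;          // varint
        case 1: advance(8); break;             // 64-bit
        case 2: advance(read_varint()); break; // length-delimited
        case 5: advance(4); break;             // 32-bit
        default: assert(false && "unsupported wire type");
        }
    }
};

// Same loop shape as the generated Person::decode, reduced to one field.
inline std::string decode_name(const std::string &msg)
{
    ToyDecoder pb(msg);
    std::string name;
    while (pb.get_next_field()) {
        switch (pb.field_num) {
        case 1:  name = pb.get_string(); break;
        default: pb.skip_field();
        }
    }
    return name;
}
```

Because unknown fields are skipped by wire type alone, the loop keeps working when a newer sender adds fields that this reader has never heard of.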
So, the API consists of the following class methods (where FTYPE is the Protobuf type of the field, e.g. 'fixed32' or 'message'):
The field number is the first parameter in put* calls, and placed in the case label before get* calls.
You can use the returned value of the get_FTYPE method instead of passing the variable address, e.g. weight = pb.get_double().
get_FTYPE(&var) accepts an optional second parameter - a pointer to a bool variable, e.g. pb.get_string(&name, &has_name). This extra variable is set to true after the modification of var, allowing the program to check which fields were actually present in the decoded message. This form of get_FTYPE is employed in the code generated by Codegen, both for required and optional fields.
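The two calling forms can be pictured with a toy getter. This is only a sketch of the calling convention: ToyFieldReader, Weight and decode_weight are hypothetical names invented here, and the real get_FTYPE methods parse the value from the buffer instead of holding it in a member.

```cpp
#include <cassert>

// Hypothetical stand-in for a decoder positioned on a double field.
// It is NOT easypb::Decoder; it only mimics the two calling forms.
struct ToyFieldReader
{
    double current_value;   // value of the field being decoded

    // Form 1: return the decoded value.
    double get_double() const { return current_value; }

    // Form 2: store into *var and, if requested, flag the field as present.
    void get_double(double *var, bool *has_var = nullptr) const
    {
        assert(var != nullptr);
        *var = current_value;
        if (has_var) *has_var = true;   // set only after *var was modified
    }
};

// A field plus its presence flag, as a decoder might fill them in.
struct Weight
{
    double weight = 0;
    bool has_weight = false;
};

inline Weight decode_weight(const ToyFieldReader &pb)
{
    Weight w;
    pb.get_double(&w.weight, &w.has_weight);
    return w;
}
```

If the field never appears in the message, the getter is never called, so has_weight keeps its initial false and weight keeps its default.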
EasyProtoBuf is a single-header library. In order to use it, include easypb.hpp.
All exceptions explicitly thrown by the library are derived from easypb::exception. It may also throw std::bad_alloc due to buffer management.
Start encoding with the creation of the Encoder object:
easypb::Encoder pb;
Then proceed with encoding all present fields of the message:
pb.put_string(1, name);
pb.put_double(2, weight);
pb.put_repeated_int32(3, numbers);
Finally, retrieve the encoded message from the Encoder object:
std::string protobuf_msg = pb.result();
This call clears the contents of the Encoder, so it can be reused to encode more messages.
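For readers curious about what these three steps produce, below is a minimal sketch of an encoder with the same create / put / result life cycle. ToyEncoder is hypothetical and is not the library's Encoder: the byte layout follows the public ProtoBuf wire format, and the length prefix is written in its shortest form here, which may differ from what EasyProtoBuf emits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical miniature encoder (NOT easypb::Encoder): it accumulates
// wire-format bytes and hands them over in result().
class ToyEncoder
{
    std::string buf_;

    void put_varint(uint64_t v)
    {
        while (v >= 0x80) {
            buf_.push_back(static_cast<char>((v & 0x7F) | 0x80));
            v >>= 7;
        }
        buf_.push_back(static_cast<char>(v));
    }

    // A tag combines the field number with the 3-bit wire type.
    void put_tag(uint32_t field_num, uint32_t wire_type)
    {
        assert(field_num > 0);
        put_varint((static_cast<uint64_t>(field_num) << 3) | wire_type);
    }

public:
    // string: wire type 2 = tag, byte length, raw bytes
    void put_string(uint32_t field_num, const std::string &value)
    {
        put_tag(field_num, 2);
        put_varint(value.size());
        buf_ += value;
    }

    // double: wire type 1 = tag, then 8 bytes of IEEE 754 bits, little-endian
    void put_double(uint32_t field_num, double value)
    {
        static_assert(sizeof(double) == sizeof(uint64_t), "64-bit double expected");
        uint64_t bits;
        std::memcpy(&bits, &value, sizeof bits);
        put_tag(field_num, 1);
        for (int i = 0; i < 8; ++i)
            buf_.push_back(static_cast<char>((bits >> (8 * i)) & 0xFF));
    }

    // Return the message and leave the encoder empty for reuse.
    std::string result()
    {
        std::string out = std::move(buf_);
        buf_.clear();
        return out;
    }
};
```

With this sketch, put_string(1, "AnnA") followed by put_double(2, 1.0) yields 15 bytes: 0A 04 'A' 'n' 'n' 'A' 11 00 00 00 00 00 00 F0 3F.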
The first parameter of any put_* call is the field number, and the second parameter is the value to encode. There are several groups of put_* methods:
put_FTYPE, e.g. put_string, encodes a single value.
put_repeated_FTYPE encodes multiple values in one call. The second parameter should be an iterable container.
put_packed_FTYPE is similar to put_repeated_FTYPE, but encodes data in the packed format.
put_map_FTYPE1_FTYPE2, e.g. put_map_string_int32, serializes the map type map<string, int32>. The second parameter should be a compatible C++ map container, e.g. std::map<std::string, int32_t>.
FTYPE here should be replaced by the ProtoBuf field type of the corresponding message field, e.g. int32, bytes and so on, except that for any message type we use the fixed string message.
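The difference between the repeated and packed groups is easiest to see in the bytes. The sketch below is not taken from the library: repeated_uint32 and packed_uint32 are hypothetical free functions that follow the public ProtoBuf wire format, and the packed length is written in its shortest varint form, which may not match the prefix EasyProtoBuf actually emits.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Append v as a base-128 varint (7 bits per byte, high bit = "more follows").
inline void write_varint(std::string &out, uint64_t v)
{
    while (v >= 0x80) {
        out.push_back(static_cast<char>((v & 0x7F) | 0x80));
        v >>= 7;
    }
    out.push_back(static_cast<char>(v));
}

// A field tag is the field number combined with the 3-bit wire type.
inline void write_tag(std::string &out, uint32_t field_num, uint32_t wire_type)
{
    assert(field_num > 0);
    write_varint(out, (static_cast<uint64_t>(field_num) << 3) | wire_type);
}

// Non-packed: every element carries its own tag (wire type 0 = varint).
inline std::string repeated_uint32(uint32_t field_num, const std::vector<uint32_t> &values)
{
    std::string out;
    for (uint32_t v : values) {
        write_tag(out, field_num, 0);
        write_varint(out, v);
    }
    return out;
}

// Packed: one tag (wire type 2 = length-delimited), a byte length, then the values.
inline std::string packed_uint32(uint32_t field_num, const std::vector<uint32_t> &values)
{
    std::string payload;
    for (uint32_t v : values) write_varint(payload, v);

    std::string out;
    write_tag(out, field_num, 2);
    write_varint(out, payload.size());
    out += payload;
    return out;
}
```

For field 3 holding {1, 2, 300}, the non-packed form is 18 01 18 02 18 AC 02 (7 bytes) and the packed form is 1A 04 01 02 AC 02 (6 bytes); the saving grows with the number of elements because the tag is written only once.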
The Decoder keeps only the raw pointer to the buffer passed to the constructor. Thus, the buffer should neither be freed nor moved until decoding is complete.
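The lifetime rule is the usual one for any non-owning view. The sketch below uses a hypothetical ToyReader (not the real Decoder, whose constructor signature is not shown in this document) to illustrate a safe pattern and, in comments, two ways to break it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical non-owning reader: it stores pointers into the caller's
// buffer and never copies the bytes (NOT the real easypb::Decoder).
class ToyReader
{
    const char *ptr_;
    const char *end_;

public:
    explicit ToyReader(const std::string &buf)
        : ptr_(buf.data()), end_(buf.data() + buf.size()) {}

    bool at_end() const { return ptr_ == end_; }

    unsigned char next_byte()
    {
        assert(!at_end());
        return static_cast<unsigned char>(*ptr_++);
    }
};

// Safe: msg is owned by the caller and outlives the reader.
inline std::size_t count_bytes(const std::string &msg)
{
    ToyReader reader(msg);
    std::size_t n = 0;
    while (!reader.at_end()) { reader.next_byte(); ++n; }
    return n;
}

// Unsafe patterns, shown as comments only:
//   ToyReader r(std::string("abc"));  // the temporary dies at the end of the statement
//   ToyReader r(buf); buf += "more";  // growing buf may reallocate and move the bytes
```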
The code generator is described in the separate documentation.
Despite its simplicity, the library is quite fast, thanks to the use of std::string_view (which avoids copying large buffers) and an efficient read_varint/write_varint implementation.
On pre-C++17 compilers, the library uses its own implementation of string_view to ensure good performance; alternatively, users can supply their own type via the EASYPB_STRING_VIEW preprocessor macro, e.g. by defining it as std::string.
Sub-messages and packed repeated fields always use a 5-byte length prefix, which can make encoded messages slightly longer than those produced by other ProtoBuf libraries.
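A fixed 5-byte prefix is still a valid varint: the continuation bit is simply set on the first four bytes even when the upper bits are zero, so any conforming decoder reads it back normally. The sketch below shows one way such a prefix can be written and read; write_varint5 and this read_varint are hypothetical helpers, not the library's functions. A fixed width is commonly chosen so that space for the length can be reserved before the payload size is known; whether EasyProtoBuf does it for that reason is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Write a 32-bit length as exactly 5 varint bytes: four bytes with the
// continuation bit set, then a final byte holding the remaining top bits.
inline void write_varint5(std::string &out, uint32_t len)
{
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<char>((len & 0x7F) | 0x80));
        len >>= 7;
    }
    out.push_back(static_cast<char>(len));   // at most 4 bits remain
}

// Standard varint reader; it accepts padded and shortest-form encodings alike.
// Advances pos past the bytes it consumed.
inline uint64_t read_varint(const std::string &in, std::size_t &pos)
{
    uint64_t value = 0;
    for (int shift = 0; shift < 64; shift += 7) {
        assert(pos < in.size());
        unsigned char byte = static_cast<unsigned char>(in[pos++]);
        value |= static_cast<uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) break;
    }
    return value;
}
```

A length of 3 becomes 83 80 80 80 00 instead of the single byte 03, which is where the few extra bytes per sub-message come from.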
Compared to the official ProtoBuf library, EasyProtoBuf allows more flexibility in modifying the field type without losing the decoding compatibility. You can make any changes to the field type as long as it stays inside the same "type domain":
int*, uint*)
It starts with the story of my FreeArc archiver:
And the best way to pass a lot of parameters to a C++ function is a plain C struct. Using a serialization library to pass such a struct between languages greatly simplifies adding bindings to the core API for new languages, such as Python, JavaScript, and so on. So I decided to provide the backend API as a few functions accepting serialized data structures for all their parameters.
At this moment, I started to research various popular serialization libraries and finally chose the ProtoBuf format:
So, I started to look around, but the tiniest C++ ProtoBuf library I found was still a whopping 4 KLOC (and it neither supports maps nor provides a bindings generator). This drove me crazy - the entire ProtoBuf format is just 5 field types, so what do you do in those kilolines of code?
You guessed it right - I decided to write my own ProtoBuf library (with maps and codegen, you know). The first Decoder version was about 100 LOC and today the entire library is still only 666 LOC, encoding and decoding all ProtoBuf types including maps. Nevertheless, the library I rejected eventually provided many insights, from API to internal organization, so it may be called the father of EasyProtoBuf.