As issue #73 describes, the C++, Java, and Scala libraries currently implement the format independently, so the implementations may not match each other. It would be better to standardize the format with protobuf and have the libraries rely on that definition.
A standardized format definition would bring several benefits:
It gives the C++/Java code a clearer definition to follow.
It makes the format compatible across languages.
It would carry not only the graph schema but also optional metadata such as the vertex count, the edge count, and chunk statistics (which we serialize directly to files).
It can evolve without requiring immediate changes to the library code.
Arrow and Parquet also define their formats with an IDL.
However, adapting the implementations to the format protocol may introduce breaking changes to the current public APIs.
What changes are included in this PR?
Add format protocol definition files written with Google Protocol Buffers.
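To give a rough idea of what such a definition could look like, here is a minimal proto3 sketch. It is not the content of the files added in this PR: the package name, the message names, and every field (including chunk_size and row_count) are hypothetical and only illustrate a schema that also carries optional metadata such as vertex/edge counts and chunk statistics.

```protobuf
// Hypothetical sketch only. All names below are assumptions made for
// illustration, not the definitions added in this PR.
syntax = "proto3";

package example.graph_format;

// Schema of a single vertex type (fields are illustrative).
message VertexSchema {
  string label = 1;
  int64 chunk_size = 2;
}

// Statistics recorded for one chunk (fields are illustrative).
message ChunkStatistics {
  int64 row_count = 1;
}

// Top-level description: the graph schema plus optional metadata.
message GraphInfo {
  string name = 1;
  repeated VertexSchema vertices = 2;

  // Optional metadata. With proto3 `optional`, a reader can tell
  // "not recorded" apart from an explicit zero.
  optional int64 vertex_num = 3;
  optional int64 edge_num = 4;
  repeated ChunkStatistics chunk_statistics = 5;
}
```

With a layout like this, each language library would generate its own classes from the same .proto file, and new optional fields could be added later under fresh field numbers without breaking existing readers.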
Are these changes tested?
yes
Are there any user-facing changes?
not yet