tradewelltech / protarrow

Convert from protobuf to arrow and back
https://protarrow.readthedocs.io/
Apache License 2.0

Protarrow

Protarrow is a Python library for converting from Protocol Buffers to Apache Arrow and back.

It is used at Tradewell Technologies to share data between transactional and analytical applications with little boilerplate code and zero data loss.

Installation

pip install protarrow

Usage

Take a simple protobuf message:

message MyProto {
  string name = 1;
  int32 id = 2;
  repeated int32 values = 3;
}

It can be converted to a pyarrow.Table:

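import protarrow

# MyProto is the Python class generated by protoc from the message above.
# The module name my_proto_pb2 is an assumption; use your own generated module.
from my_proto_pb2 import MyProto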

my_protos = [
    MyProto(name="foo", id=1, values=[1, 2, 4]),
    MyProto(name="bar", id=2, values=[3, 4, 5]),
]

table = protarrow.messages_to_table(my_protos, MyProto)

The resulting table:

name  id  values
foo   1   [1 2 4]
bar   2   [3 4 5]

And the table can be converted back to proto:

protos_from_table = protarrow.table_to_messages(table, MyProto)
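
To make the round trip above more concrete, here is a minimal sketch of the underlying idea: pivoting row-oriented messages into columns and back. It uses only the Python standard library and is not how protarrow is implemented. `MyRecord`, `records_to_columns`, and `columns_to_records` are hypothetical names introduced for illustration, with `MyRecord` standing in for the generated `MyProto` class and a plain dict of lists standing in for a pyarrow.Table.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List


@dataclass
class MyRecord:
    """Hypothetical stand-in for the generated MyProto class."""

    name: str = ""
    id: int = 0
    values: List[int] = field(default_factory=list)


def records_to_columns(records: List[MyRecord]) -> Dict[str, List[Any]]:
    """Pivot row-oriented records into one list per field (a column)."""
    return {
        f.name: [getattr(record, f.name) for record in records]
        for f in fields(MyRecord)
    }


def columns_to_records(columns: Dict[str, List[Any]]) -> List[MyRecord]:
    """Rebuild row-oriented records by zipping the columns back together."""
    names = list(columns)
    return [
        MyRecord(**dict(zip(names, row)))
        for row in zip(*(columns[name] for name in names))
    ]


records = [
    MyRecord(name="foo", id=1, values=[1, 2, 4]),
    MyRecord(name="bar", id=2, values=[3, 4, 5]),
]

# Rows to columns: the analogue of messages_to_table.
columns = records_to_columns(records)

# Columns to rows: the analogue of table_to_messages.
round_tripped = columns_to_records(columns)
```

In this sketch `round_tripped == records` holds, which is the same lossless round-trip property the README describes for protarrow. A real Arrow table additionally carries a typed schema, which this plain-dict stand-in does not model.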

See the documentation at https://protarrow.readthedocs.io/ for more details.