apache / horaedb

Apache HoraeDB (incubating) is a high-performance, distributed, cloud native time-series database.
https://horaedb.apache.org
Apache License 2.0

Support Arrow format in gRPC service #88

Closed (waynexia closed this 1 year ago)

waynexia commented 2 years ago

Description

The query interface currently supports the Avro and JSON formats (https://github.com/CeresDB/ceresdb/blob/f6c9a5b03e23b04e6a06c4f73756f1be9551c0f8/server/src/grpc/query.rs#L42-L45), and Avro is the only one actually used. Setting JSON aside, it is natural to also support Arrow (in Arrow's IPC format), which is widely used in our server.

Proposal

Add support for the Arrow format. The execution result is already a RecordBatch, so serializing it on the server side should not take much effort. For backward compatibility, we can keep Avro as the default format and let the client ask the server for the specific format it needs.
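The backward-compatible default could be sketched roughly as follows. This is only an illustration, not the server's actual code: `ResponseFormat`, the `Arrow = 2` value, and `resolve_format` are hypothetical names, and the assumption is that a client that sends no format (or an unknown one) should keep getting Avro.

```rust
/// Formats a query response could be encoded in.
/// `Avro = 0` and `Json = 1` mirror the existing `SchemaType` enum;
/// `Arrow = 2` is an assumed new value, not taken from the real proto.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResponseFormat {
    Avro = 0,
    Json = 1,
    Arrow = 2,
}

impl Default for ResponseFormat {
    /// Avro stays the default so existing clients see no change.
    fn default() -> Self {
        ResponseFormat::Avro
    }
}

/// Hypothetical helper: map the format a client asked for to the one
/// the server will use. `None` models an old client that never sets
/// the field; unknown values also fall back to the default here.
pub fn resolve_format(requested: Option<i32>) -> ResponseFormat {
    match requested {
        Some(1) => ResponseFormat::Json,
        Some(2) => ResponseFormat::Arrow,
        _ => ResponseFormat::default(),
    }
}
```

Falling back to Avro on an unknown value is one possible choice; returning an error instead would make client mistakes more visible.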

The protobuf message is defined at https://github.com/CeresDB/ceresdbproto/blob/eba30f7dff736d00be711c40c3f01964655eb10a/protos/storage.proto#L101-L110:

message QueryResponse {
  common.ResponseHeader header = 1;
  enum SchemaType {
    AVRO = 0;
    JSON = 1;
  }
  SchemaType schema_type = 2;
  string schema_content = 3;
  repeated bytes rows = 4;
}

Some renaming might also be necessary, such as SchemaType -> ResponseType and rows -> chunks.
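A note on compatibility of such renames: protobuf's binary encoding identifies a field by its number and wire type, not by its name, so renaming alone should not change the bytes on the wire (generated code and JSON mappings would still change). The small sketch below computes the tag value for two fields of `QueryResponse`; `WireType` and `field_tag` are illustrative names, not part of any library.

```rust
/// The two protobuf wire types used by the fields discussed here.
#[derive(Debug, Clone, Copy)]
pub enum WireType {
    /// Used for enums and integers, e.g. `schema_type = 2`.
    Varint = 0,
    /// Used for strings and bytes, e.g. `repeated bytes rows = 4`.
    LengthDelimited = 2,
}

/// Tag value that prefixes a field on the wire:
/// (field_number << 3) | wire_type. The field name is not an input.
pub fn field_tag(field_number: u32, wire_type: WireType) -> u32 {
    (field_number << 3) | wire_type as u32
}
```

For example, `field_tag(4, WireType::LengthDelimited)` gives the same tag whether field 4 is called `rows` or `chunks`.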


Rachelint commented 2 years ago

Maybe I can try it and make the corresponding changes to the Rust client.

ShiKaiWi commented 1 year ago

This has been supported.