Actually, I propose we just do what ONNX is doing https://github.com/onnx/onnx/blob/main/onnx/onnx.proto#L479
Protobuf serialization has a size limit, and I would rather avoid that kind of restriction (we can however test it; also, gRPC does include a CBOR-like serialization). However, your second solution to handle that part does sound very interesting. I would be interested in being able to serialize a tensor directly, without asking the user to turn the tensor into a list first.
Yes, we should continue using a streaming request and split the array into different RunModelRequests. ONNX is kind of doing the second option I was proposing:
message Tensor {
  // The data type of the tensor.
  DataType data_type = 2;

  // Depending on the data_type field, exactly one of the fields below with
  // name ending in _data is used to store the elements of the tensor.

  // For float and complex64 values
  // Complex64 tensors are encoded as a single array of floats,
  // with the real components appearing in odd numbered positions,
  // and the corresponding imaginary component appearing in the
  // subsequent even numbered position. (e.g., [1.0 + 2.0i, 3.0 + 4.0i]
  // is encoded as [1.0, 2.0, 3.0, 4.0])
  // When this field is present, the data_type field MUST be FLOAT or COMPLEX64.
  repeated float float_data = 4 [packed = true];

  // For int32, uint8, int8, uint16, int16, bool, and float16 values
  // float16 values must be bit-wise converted to an uint16_t prior
  // to writing to the buffer.
  // When this field is present, the data_type field MUST be
  // INT32, INT16, INT8, UINT16, UINT8, BOOL, or FLOAT16
  repeated int32 int32_data = 5 [packed = true];

  // For strings.
  // Each element of string_data is a UTF-8 encoded Unicode
  // string. No trailing null, no leading BOM. The protobuf "string"
  // scalar type is not used to match ML community conventions.
  // When this field is present, the data_type field MUST be STRING
  repeated bytes string_data = 6;

  // For int64.
  // When this field is present, the data_type field MUST be INT64
  repeated int64 int64_data = 7 [packed = true];

  // Serializations can either use one of the fields above, or use this
  // raw bytes field. The only exception is the string case, where one is
  // required to store the content in the repeated bytes string_data field.
  //
  // When this raw_data field is used to store tensor value, elements MUST
  // be stored in as fixed-width, little-endian order.
  // Floating-point data types MUST be stored in IEEE 754 format.
  // Complex64 elements must be written as two consecutive FLOAT values, real component first.
  // Complex128 elements must be written as two consecutive DOUBLE values, real component first.
  // Boolean type MUST be written one byte per tensor element (00000001 for true, 00000000 for false).
  //
  // Note: the advantage of specific field rather than the raw_data field is
  // that in some cases (e.g. int data), protobuf does a better packing via
  // variable length storage, and may lead to smaller binary footprint.
  // When this field is present, the data_type field MUST NOT be STRING or UNDEFINED
  bytes raw_data = 9;
}
They pack the values into different protobuf array types depending on the type of the input. This approach is compatible with splitting the tensor into multiple RunModelRequests.
Their reason for also supporting the typed fields, rather than packing everything into a single bytes field, is interesting (protobuf can store some types, like ints, more compactly through variable-length encoding), but I suggest we focus on supporting only the raw_data representation for now.
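To make the raw_data route concrete, here is a minimal sketch (not the actual client code) of how a numpy tensor could be flattened to little-endian bytes and split into fixed-size chunks for a streaming request. The chunk size, the field names, and the use of plain dicts standing in for the generated RunModelRequest messages are all assumptions for illustration.

import numpy as np

# Hypothetical chunk size, kept well under gRPC's default 4 MB message limit.
CHUNK_SIZE = 32 * 1024


def tensor_to_requests(tensor: np.ndarray):
    """Yield RunModelRequest-like dicts carrying the tensor as raw little-endian bytes.

    The first chunk also carries the metadata (shape and dtype) needed to
    rebuild the tensor on the other side; every chunk carries a slice of the
    raw bytes, mirroring the ONNX raw_data convention.
    """
    # Contiguous, little-endian buffer, as the raw_data convention requires.
    data = np.ascontiguousarray(tensor).astype(tensor.dtype.newbyteorder("<"), copy=False)
    raw = data.tobytes()

    if not raw:
        # Degenerate case: an empty tensor still needs one chunk for the metadata.
        yield {"shape": list(tensor.shape), "dtype": str(tensor.dtype), "raw_data": b""}
        return

    for offset in range(0, len(raw), CHUNK_SIZE):
        first = offset == 0
        yield {
            "shape": list(tensor.shape) if first else None,
            "dtype": str(tensor.dtype) if first else None,
            "raw_data": raw[offset : offset + CHUNK_SIZE],
        }


def requests_to_tensor(requests) -> np.ndarray:
    """Reassemble the tensor from the streamed chunks (the receiving side)."""
    chunks, shape, dtype = [], None, None
    for req in requests:
        if req["shape"] is not None:
            shape, dtype = req["shape"], req["dtype"]
        chunks.append(req["raw_data"])
    return np.frombuffer(b"".join(chunks), dtype=np.dtype(dtype).newbyteorder("<")).reshape(shape)


if __name__ == "__main__":
    x = np.arange(12, dtype=np.float32).reshape(3, 4)
    assert np.array_equal(requests_to_tensor(tensor_to_requests(x)), x)

No cbor is involved here: the byte layout is fully described by the dtype, the shape, and the little-endian convention, which is what makes it easy to reimplement in other client languages.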
As for:
"without asking the user to turn the tensor into a list first"
I think we should do that, yes! The Python client should accept torch tensors and numpy tensors. However, I think this is a separate issue that only affects the Python client and not how we serialize to the wire. I will open a separate issue for that.
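For that separate Python-client point, one possible shape for the conversion, purely as a sketch (the as_numpy name and the duck-typed torch detection are assumptions, not existing client code):

import numpy as np


def as_numpy(tensor) -> np.ndarray:
    """Hypothetical helper: turn user input into a numpy array without building a list.

    Accepts numpy arrays, torch tensors (detected by duck typing so torch does
    not become a hard dependency), and anything numpy can convert directly.
    """
    if type(tensor).__module__.startswith("torch"):
        # .detach().cpu() makes this work for GPU tensors and tensors that require grad.
        return tensor.detach().cpu().numpy()
    return np.asarray(tensor)

From there, the wire serialization would be identical regardless of which framework the tensor came from.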
Description
We are currently using cbor for some of the serialization: transforming the flattened input tensors into a byte array. This is probably overkill, and having a dependency on cbor is troublesome for porting the client library to other languages, like JavaScript, since the cbor packages on npmjs are either old or Node.js-only.
I see two ways of doing this:
Motivation and Context
Dependency on cbor2 on both the client and the server side.
Test plans
Let's not think about backward compat :)
Additional Information
None
Checklist