pytorch / executorch

On-device AI across mobile, embedded and edge for PyTorch
https://pytorch.org/executorch/

RFC: Java layer EValue serialization #6569

Open kirklandsign opened 1 week ago

kirklandsign commented 1 week ago

🚀 The feature, motivation and pitch

Context: https://github.com/pytorch/executorch/issues/6470#issuecomment-2436060471

We need to design a way to serialize an EValue (and its embedded tensor) for an IPC use case in AOSP.

This won’t be the official serialization format across ET; ET uses fbs for that. This is only for the Java frameworks layer for AOSP.

Basic layout

| Tag (1 byte) | Bytes_of_payload (8 bytes?) | Payload (var) |
| --- | --- | --- |

where the payload is one of the following (the leading number is the tag value):

0. None (0 bytes): No value, or the absence of a value.
1. Tensor (var): A multi-dimensional array used for numerical computations.
2. String/uint8_array (var): A sequence of bytes, often used to represent text.
3. Double (8 bytes): A 64-bit floating-point number, used for decimal arithmetic.
4. Int (4 or 8 bytes?): An integer, used for whole numbers (32- vs. 64-bit width is an open question).
5. Bool (1 byte): A boolean value, either true or false.
6. ListBool (var): A list of boolean values.
7. ListDouble (var): A list of double-precision floating-point numbers.
8. ListInt (var): A list of integers.
9. ListTensor (var): A list of tensors.
10. ListScalar (var): A list of scalar values (e.g., numbers).
11. ListOptionalTensor (var): A list of optional tensors, where each element may be present or absent.

Per Jacob, we don’t care about the List types (6-11); they are internal to the ET runtime.

For the types without variable length (0, 3, 4, 5), we just serialize the value directly.

For type 2 (String), let’s assume it’s a uint8_t[] and serialize the array directly. A sketch of these fixed-width cases follows.
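To make the framing concrete, here is a minimal sketch of the fixed-width cases. It assumes the layout above plus several things still open for discussion: little-endian byte order, an 8-byte Bytes_of_payload, and an 8-byte Int. The class and constant names (EValueWriter, TAG_*) are hypothetical, not a committed API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative sketch only; tag values follow the 0-11 ordering above.
public final class EValueWriter {
  static final byte TAG_NONE = 0, TAG_TENSOR = 1, TAG_STRING = 2,
                    TAG_DOUBLE = 3, TAG_INT = 4, TAG_BOOL = 5;

  // Tag (1 byte) | Bytes_of_payload (8 bytes) | Payload (var)
  static byte[] frame(byte tag, byte[] payload) {
    ByteBuffer buf = ByteBuffer.allocate(1 + 8 + payload.length)
        .order(ByteOrder.LITTLE_ENDIAN);            // endianness is an open choice
    buf.put(tag);
    buf.putLong(payload.length);                    // Bytes_of_payload
    buf.put(payload);
    return buf.array();
  }

  static byte[] serializeDouble(double v) {
    return frame(TAG_DOUBLE,
        ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putDouble(v).array());
  }

  static byte[] serializeInt(long v) {              // assuming an 8-byte Int payload
    return frame(TAG_INT,
        ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putLong(v).array());
  }

  static byte[] serializeBool(boolean v) {
    return frame(TAG_BOOL, new byte[] {(byte) (v ? 1 : 0)});
  }

  static byte[] serializeString(byte[] utf8Bytes) { // String as raw uint8_t[]
    return frame(TAG_STRING, utf8Bytes);
  }

  static byte[] serializeNone() {
    return frame(TAG_NONE, new byte[0]);            // 0-byte payload
  }
}
```

One upside of keeping Bytes_of_payload in the frame is that a reader can skip values whose tags it doesn’t understand, which is relevant to the first question below.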

So we will focus on the tensor type.

Tensor type

| Scalar_type (1 byte) | Num_dim (1 byte) | Sizes (var) | Dim_order (var) | Data (var) |
| --- | --- | --- | --- | --- |
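A corresponding sketch for the tensor payload, under assumed field widths (an int32 per entry in Sizes, one byte per entry in Dim_order, little-endian). TensorWriter and its signature are illustrative, not the proposed API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public final class TensorWriter {
  // Scalar_type (1) | Num_dim (1) | Sizes (4 * ndim) | Dim_order (1 * ndim) | Data (var)
  static byte[] serializeTensor(byte scalarType, int[] sizes, byte[] dimOrder, byte[] data) {
    ByteBuffer buf = ByteBuffer
        .allocate(1 + 1 + 4 * sizes.length + dimOrder.length + data.length)
        .order(ByteOrder.LITTLE_ENDIAN);  // endianness is an open choice
    buf.put(scalarType);                  // e.g. the ET ScalarType code
    buf.put((byte) sizes.length);         // Num_dim: 1 byte caps rank at 255
    for (int s : sizes) buf.putInt(s);    // Sizes: assuming int32 per dimension
    buf.put(dimOrder);                    // Dim_order: assuming 1 byte per dimension
    buf.put(data);                        // raw tensor bytes
    return buf.array();
  }
}
```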

Questions

- Do we really need a field for Bytes_of_payload?
- Do we need Dim_order?
- Do we need TensorShapeDynamism?


qiaoli31 commented 1 week ago

The proposal looks good to me. I saw you mentioned that ET uses fbs for serialization; we could consider it as an option as well. You could provide a Java EValue <-> fbs conversion based on the fbs definition in ET.

kirklandsign commented 1 week ago

> The proposal looks good to me. I saw you mentioned that ET uses fbs for serialization; we could consider it as an option as well. You could provide a Java EValue <-> fbs conversion based on the fbs definition in ET.

fbs is only for the runtime. It's overcomplicated for the Java serialization use case and contains other runtime details. You can see the schema here: https://github.com/pytorch/executorch/blob/b07be360ae8b52cad60cd5d52c2e71f9c59be81c/schema/program.fbs#L71-L144

qiaoli31 commented 1 week ago

I see, that makes sense.

For training, I propose we reuse this serialized EValue as well. The workflow is as follows.

  1. The partner implements onTrainingExample to generate a list of TrainingExampleRecord, which holds training data in a byte[]. In TFLite, this byte[] is a serialized tf.example proto; for ExecuTorch, it would be a serialized EValue[] (see the packing sketch at the end of this comment).
  2. We need a custom dataset/dataloader that calls onTrainingExample to read training data one example at a time (IPC has a size limit). In TFLite, the model graph has an external_dataset custom op that does this; the byte[] -> tf.example proto -> tf.tensor conversion is written into the model graph as well.

I found that PyTorch can define a custom dataset/dataloader, and IterableDataset + DataLoader look similar to what external_dataset does.

  1. Is a dataset/dataloader available in ExecuTorch? Does it have a C++ API?
  2. Can the dataset be written into the ExecuTorch model graph, or do we need to write C++ code in ODP to connect it?
  3. If step (1) above looks good, we will keep the existing byte[] for the training API.

I'm quite new to PyTorch; feel free to suggest other options.
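To illustrate step 1: since each EValue framed as proposed above is self-delimiting (tag + length + payload), one way to pack a serialized EValue[] into the record's byte[] is plain concatenation, with the reader consuming frames until the buffer is exhausted. TrainingExamplePacker is a hypothetical helper, not an existing API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical helper: concatenate already-serialized EValues into one byte[].
// No outer element count is strictly needed because each frame carries its
// own tag and payload length.
final class TrainingExamplePacker {
  static byte[] pack(byte[][] serializedEValues) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (byte[] ev : serializedEValues) {
      out.write(ev);
    }
    return out.toByteArray();
  }
}
```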

qiaoli31 commented 6 days ago

The comment above is only to start a discussion and is low priority. Do we have an ETA for this serialization format?

kirklandsign commented 6 days ago

Hi @qiaoli31, we are working on the serialization. My target is before mid-November.

kirklandsign commented 6 days ago

> I see, that makes sense.
>
> For training, I propose we reuse this serialized EValue as well. […]

cc @JacobSzwejbka on this; I'm not familiar with the training part.