Samsung / ONE

On-device Neural Engine

[onert] Support Backend-aware Quantization #10002

Open ragmani opened 1 year ago

ragmani commented 1 year ago

What

Let's support quantization/dequantization between heterogeneous backends in onert

Why

  1. Support CPU/TRIX heterogeneous computing for circle models that cannot be compiled to run entirely on the TRIX backend.
  2. Support heterogeneous computing of models that require higher accuracy.
  3. Any other cases?
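For reference, a minimal sketch of the float <-> QASYMM8 conversion that would have to happen at a CPU/TRIX boundary (illustrative only; the function names and the list-based representation are not onert's actual API):

```python
# Asymmetric affine quantization as used by the QASYMM8 type:
#   q = clamp(round(x / scale) + zero_point, 0, 255)
#   x = (q - zero_point) * scale
# Names here are illustrative, not taken from onert.

def quantize_qasymm8(values, scale, zero_point):
    """float32 -> uint8: round to the nearest step, shift by zero_point, clamp."""
    return [max(0, min(255, round(x / scale) + zero_point)) for x in values]

def dequantize_qasymm8(q_values, scale, zero_point):
    """uint8 -> float32: remove the zero_point offset and rescale."""
    return [(q - zero_point) * scale for q in q_values]
```

Note that the round trip is lossy: values are snapped to the nearest multiple of `scale` and clamped to the representable range, which is why accuracy-sensitive models (case 2 above) may need to stay in float on some backends.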

How far

How to

Discussing at #10003

TODO

chunseoklee commented 1 year ago

Support between multiple models : Unconditional support

"Unconditional" means model without control flow ? or else ?

FLOAT <-> QASYMM8 : Unconditional support

What does "Unconditional" mean here ?

ragmani commented 1 year ago

@chunseoklee

"Unconditional" means model without control flow ? or else ?

My intention was to support multiple models regardless of control flow (i.e., multiple models on either side, with or without control flow).

What does "Unconditonal" mean here ?

My intention was that it is essential support. I changed the wording in https://github.com/Samsung/ONE/issues/10002#issue-1441527173.

ragmani commented 1 year ago

I tried adding the new built-in operation to the subgraph's output operand corresponding to the `from` side of an edge. However, I found that this approach makes it difficult to support dynamic tensors in the current onert. So I'm changing the approach to add the new built-in operation to the subgraph's input operand corresponding to the `to` side of an edge.
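A minimal sketch of that `to`-side rewrite, under assumed illustrative structures (the `Subgraph` class, tuple-based op list, and `insert_convert_on_to_side` helper are hypothetical, not onert's IR):

```python
# Hypothetical model of the approach: instead of appending a conversion op to
# the producing subgraph's output ("from" side of an edge), insert it in front
# of the consuming subgraph's input operand ("to" side), then reroute every
# existing consumer of that input through the converted operand.

class Subgraph:
    def __init__(self, name, inputs, ops=None):
        self.name = name
        self.inputs = list(inputs)   # input operand names
        self.ops = list(ops or [])   # (op_type, input_name, output_name) tuples

def insert_convert_on_to_side(to_subgraph, edge_input, op_type="Quantize"):
    """Insert a conversion op at the consumer's input operand.

    The original input operand keeps feeding the new conversion op, and every
    op that previously read `edge_input` now reads the converted operand.
    Returns the name of the new converted operand.
    """
    converted = edge_input + "_converted"
    to_subgraph.ops = [
        (t, converted if i == edge_input else i, o)
        for (t, i, o) in to_subgraph.ops
    ]
    to_subgraph.ops.insert(0, (op_type, edge_input, converted))
    return converted
```

One plausible reason this side is easier for dynamic tensors: the consumer sees the incoming tensor's actual shape at its own input, so the conversion op can be shaped when the consuming subgraph runs rather than when the producer's output is finalized.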