Samsung / ONE

On-device Neural Engine

[onert] Support Backend-aware Quantization #10002

Open ragmani opened 1 year ago

ragmani commented 1 year ago

What

Let's support quantization/dequantization between heterogeneous backends in onert

Why

  1. Support CPU/TRIX heterogeneous computing for circle models that cannot be compiled to run entirely on the TRIX backend.
  2. Support heterogeneous computing of models that require higher accuracy.
  3. Any other cases?
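For reference, a minimal sketch of the float <-> QASYMM8 conversion that would have to happen at a CPU/TRIX boundary (illustrative only; the function names and the list-based representation are not onert's actual API):

```python
# Asymmetric affine quantization as used by the QASYMM8 type:
#   q = clamp(round(x / scale) + zero_point, 0, 255)
#   x = (q - zero_point) * scale
# Names here are illustrative, not taken from onert.

def quantize_qasymm8(values, scale, zero_point):
    """float32 -> uint8: round to the nearest step, shift by zero_point, clamp."""
    return [max(0, min(255, round(x / scale) + zero_point)) for x in values]

def dequantize_qasymm8(q_values, scale, zero_point):
    """uint8 -> float32: remove the zero_point offset and rescale."""
    return [(q - zero_point) * scale for q in q_values]
```

Note that the round trip is lossy: values are snapped to the nearest multiple of `scale` and clamped to the representable range, which is why accuracy-sensitive models (case 2 above) may need to stay in float on some backends.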

How far

How to

Discussing at #10003

TODO

chunseoklee commented 1 year ago

Support between multiple models : Unconditional support

"Unconditional" means model without control flow ? or else ?

FLOAT <-> QASYMM8 : Unconditional support

What does "Unconditional" mean here ?

ragmani commented 1 year ago

@chunseoklee

"Unconditional" means model without control flow ? or else ?

My intention was to support multiple models regardless of control flow (i.e., multiple models on either side, with or without control flow).

What does "Unconditonal" mean here ?

My intention was that it is essential support. I changed the wording in https://github.com/Samsung/ONE/issues/10002#issue-1441527173.

ragmani commented 1 year ago

I tried adding the new built-in operation to the subgraph's output operand corresponding to the `from` side of an edge. However, I found that this approach makes it difficult to support dynamic tensors in the current onert. So I'm changing the approach to add the new built-in operation to the subgraph's input operand corresponding to the `to` side of an edge.
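A minimal sketch of that `to`-side rewrite, under assumed illustrative structures (the `Subgraph` class, tuple-based op list, and `insert_convert_on_to_side` helper are hypothetical, not onert's IR):

```python
# Hypothetical model of the approach: instead of appending a conversion op to
# the producing subgraph's output ("from" side of an edge), insert it in front
# of the consuming subgraph's input operand ("to" side), then reroute every
# existing consumer of that input through the converted operand.

class Subgraph:
    def __init__(self, name, inputs, ops=None):
        self.name = name
        self.inputs = list(inputs)   # input operand names
        self.ops = list(ops or [])   # (op_type, input_name, output_name) tuples

def insert_convert_on_to_side(to_subgraph, edge_input, op_type="Quantize"):
    """Insert a conversion op at the consumer's input operand.

    The original input operand keeps feeding the new conversion op, and every
    op that previously read `edge_input` now reads the converted operand.
    Returns the name of the new converted operand.
    """
    converted = edge_input + "_converted"
    to_subgraph.ops = [
        (t, converted if i == edge_input else i, o)
        for (t, i, o) in to_subgraph.ops
    ]
    to_subgraph.ops.insert(0, (op_type, edge_input, converted))
    return converted
```

One plausible reason this side is easier for dynamic tensors: the consumer sees the incoming tensor's actual shape at its own input, so the conversion op can be shaped when the consuming subgraph runs rather than when the producer's output is finalized.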