What?
When a user loads a quantized model, the user can set float-type input and output buffers and data. The runtime then quantizes the float data when reading from the input buffer, and dequantizes the model's output data when writing to the user's output buffer.
Why?
When a user loads an on-device quantized model, the user cannot know the exact quantization parameters (such as scale and zero point) of the quantized input and output tensors, so the user cannot prepare correctly quantized buffers. Therefore we need to support type-aware input/output buffer setting.