Samsung / ONE

On-device Neural Engine

[tools] Implement temporary block quantization tool #13830

Open hseok-oh opened 3 weeks ago

hseok-oh commented 3 weeks ago

What

Let's implement and maintain a weight quantization tool for the Q4_0 and Q8_0 data types, to be used for making test and example models. It is a temporary tool and is not intended for the compiler module implementation.
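
For readers unfamiliar with these types, here is a minimal sketch of what Q8_0 and Q4_0 block quantization could look like. It is not the tool from this issue. It assumes the ggml-style conventions (blocks of 32 weights, one fp16 scale per block, Q4_0 packing two 4-bit values per byte), and the function names and the `BLOCK_SIZE` constant are hypothetical, chosen only for illustration.

```python
import struct

BLOCK_SIZE = 32  # assumption: ggml-style block length, not taken from this issue


def _to_fp16(x):
    """Round a float to the nearest fp16 value (the assumed scale storage type)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def _blocks(weights):
    """Yield consecutive BLOCK_SIZE-long chunks of a flat weight list."""
    if len(weights) % BLOCK_SIZE != 0:
        raise ValueError("weight count must be a multiple of BLOCK_SIZE")
    for i in range(0, len(weights), BLOCK_SIZE):
        yield weights[i:i + BLOCK_SIZE]


def quantize_q8_0(weights):
    """Return one (scale, list of int8 values) pair per block."""
    out = []
    for chunk in _blocks(weights):
        amax = max(abs(v) for v in chunk)
        scale = _to_fp16(amax / 127.0)
        inv = 1.0 / scale if scale else 0.0
        # Symmetric rounding, clamped to the int8 range used here.
        q = [max(-127, min(127, round(v * inv))) for v in chunk]
        out.append((scale, q))
    return out


def dequantize_q8_0(blocks):
    return [scale * q for scale, qs in blocks for q in qs]


def quantize_q4_0(weights):
    """Return one (scale, 16 packed bytes) pair per block."""
    half = BLOCK_SIZE // 2
    out = []
    for chunk in _blocks(weights):
        # The signed value with the largest magnitude maps to -8.
        vmax = max(chunk, key=abs)
        scale = _to_fp16(vmax / -8.0)
        inv = 1.0 / scale if scale else 0.0
        # Shift into [0, 15] so each value fits in an unsigned nibble.
        q = [max(0, min(15, int(v * inv + 8.5))) for v in chunk]
        # Low nibble: first half of the block; high nibble: second half.
        packed = bytes(q[j] | (q[j + half] << 4) for j in range(half))
        out.append((scale, packed))
    return out


def dequantize_q4_0(blocks):
    result = []
    for scale, packed in blocks:
        low = [(b & 0x0F) - 8 for b in packed]
        high = [(b >> 4) - 8 for b in packed]
        result.extend(scale * q for q in low + high)
    return result
```

Under these assumptions a Q8_0 block takes 34 bytes (2 for the scale, 32 for the values) and a Q4_0 block takes 18 bytes (2 for the scale, 16 for the packed nibbles). How such blocks would be stored in a circle model is not covered by this sketch.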

Why

To help onert's LLM support feature development, we need a tool that generates a model with block-quantized weights from an fp32 circle test model. It will also help with the PoC for the circle schema update to support LLM models.

hseok-oh commented 3 weeks ago

Tool: #13758. It will not be merged.