Samsung / ONE

On-device Neural Engine

Draft: type-aware input & output buffer setting for quantized model #13284

Closed hseok-oh closed 2 months ago

hseok-oh commented 3 months ago

Test (current status)

0) Prepare

1) Generate random golden input/output data

$ ./Product/out/bin/onert_run --dump out.h5 --dump_input:raw in mobilenet_v1_1.0_224.circle
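For reference, the raw file consumed by `--load:raw` in the next step is just the flat tensor bytes with no header. A minimal sketch of producing such a file by hand (the `in.0` prefix-plus-index naming and the [1, 224, 224, 3] float32 input shape are assumptions for MobileNet V1, not taken from onert_run's documentation):

```python
import random
import struct

def write_random_input(path, shape, lo=0.0, hi=1.0):
    """Write random float32 values as flat little-endian binary (no header)."""
    n = 1
    for d in shape:
        n *= d
    values = [random.uniform(lo, hi) for _ in range(n)]
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{n}f", *values))
    return n

# MobileNet V1 1.0 224 takes a [1, 224, 224, 3] float32 input.
# "in.0" (prefix + tensor index) is an assumed file-naming convention.
count = write_random_input("in.0", (1, 224, 224, 3))
print(count)  # 150528 floats -> 602112 bytes on disk
```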

2) Run full quantization and inference with golden input

$ ./Product/out/bin/onert_run --dump out.tvn.h5 --load:raw in --minmax_runs 100 -q uint8 -c tvn-gen \
--force_float --output_shape [0,[1,1001]] mobilenet_v1_1.0_224.circle
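Step 2 quantizes the model to uint8 and, with `--force_float`, dequantizes the output back to float so it can be compared against the float golden data. The affine uint8 round trip behind that comparison can be sketched generically (an illustration of the scheme, not ONE's internal code; `scale` and `zero_point` here are made-up example values):

```python
def quantize(x, scale, zero_point):
    # Affine quantization: map a float onto the uint8 range [0, 255].
    q = round(x / scale) + zero_point
    return max(0, min(255, q))

def dequantize(q, scale, zero_point):
    # Inverse map; the round trip loses at most ~scale/2 per value.
    return (q - zero_point) * scale

# Example: scale chosen so [0, 1] maps onto the full uint8 range.
scale, zp = 1.0 / 255.0, 0
x = 0.980392  # the golden output value seen in the h5diff report below
q = quantize(x, scale, zp)
x2 = dequantize(q, scale, zp)
print(q, round(x2, 6))
```

The per-value rounding error is bounded by `scale / 2`; larger end-to-end differences (like the 0.0625 seen in the comparison below) accumulate through the quantized graph, not from this final dequantization step.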

3) Compare with golden

$ h5diff -d 0.01 -v out.h5 out.tvn.h5

file1     file2
---------------------------------------
    x      x    /              
    x      x    /value         
    x      x    /value/0       

group  : </> and </>
0 differences found
group  : </value> and </value>
0 differences found
dataset: </value/0> and </value/0>
size:           [1x1001]           [1x1001]
position        0               0               difference          
------------------------------------------------------------
[ 0 972 ]          0.980392        0.917886        0.0625058      
1 differences found
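`h5diff -d 0.01` reports element pairs whose absolute difference exceeds the delta. The check it applies to the `/value/0` dataset can be re-implemented in a few lines (pure Python, without h5py; the two 1001-element vectors below are stand-ins that reproduce only the single mismatch reported above):

```python
def find_diffs(a, b, delta):
    # Mirror `h5diff -d delta`: collect positions where |a[i] - b[i]| > delta.
    diffs = []
    for i, (x, y) in enumerate(zip(a, b)):
        if abs(x - y) > delta:
            diffs.append((i, x, y, abs(x - y)))
    return diffs

# Stand-in vectors: identical except at index 972, matching the report above.
golden = [0.0] * 1001
quantized = [0.0] * 1001
golden[972], quantized[972] = 0.980392, 0.917886

diffs = find_diffs(golden, quantized, 0.01)
print(len(diffs), diffs[0][0])  # 1 difference, at position 972
```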