Obtain statistics from multiple executions for quantisation

Antonio95 commented 10 months ago

I have a simple TF Lite model which I successfully converted into an NNEF model using the conversion script from this repository. Now I would like to quantise it (into a quantised NNEF model) using the nnef_tools.quantize script. As far as I understand, I first need to run the nnef_tools.execute script with the --statistics flag to generate the scale/zero-point statistics that the quantisation script will use. However, I am unsure how to obtain a statistics file that aggregates information from, say, 100 inferences. More specifically, the execution script takes an optional --input_path argument that specifies a folder containing the inputs, but it seems to expect only the .dat files for a single execution (one per input node, since a model graph can have more than one). It also seems to expect each .dat file to have the same name as the corresponding node in the NNEF graph (for instance, external1), which in principle rules out the possibility of having 100 different values for that input to run inference on.

Is there a way to achieve what I'm attempting?

Thank you for this fantastic repository!

gyenesvi commented 10 months ago

A single input node and a single .dat file can contain a batch of inputs. If your batch size is 100, the statistics run will cover 100 images in one execution. You can use the image_tensor.py script to generate a .dat file from images.
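To illustrate the batching idea for inputs that are not images, here is a minimal NumPy sketch. It only shows how 100 separate samples become one array with a leading batch dimension. The sample shape (3, 32, 32), the random placeholder data, and the helper name stack_into_batch are assumptions made for illustration and are not part of NNEF-Tools. Writing the resulting array out as a .dat file named after the input node (for instance, external1) is not shown here, and the sketch assumes the graph's input accepts a leading batch dimension.

```python
import numpy as np


def stack_into_batch(samples):
    """Stack equally shaped per-sample arrays along a new leading batch axis.

    Hypothetical helper for illustration; not part of NNEF-Tools.
    """
    shapes = {s.shape for s in samples}
    if len(shapes) != 1:
        raise ValueError(f"all samples must share one shape, got {sorted(shapes)}")
    return np.stack(samples, axis=0)


# Placeholder data: 100 random samples of an assumed shape (3, 32, 32).
# In practice these would be the real calibration inputs.
rng = np.random.default_rng(0)
samples = [rng.random((3, 32, 32), dtype=np.float32) for _ in range(100)]

batch = stack_into_batch(samples)
print(batch.shape)  # (100, 3, 32, 32)
```

The batched array is what a single input file would hold, so one execution with --statistics sees all 100 samples.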

Antonio95 commented 10 months ago

I was not aware of that, but it works exactly as you said. It ran successfully, thank you!

gyenesvi commented 10 months ago

Great, no problem!

KhronosGroup / NNEF-Tools

Obtain statistics from multiple executions for quantisation #165