bommineniravali opened 1 year ago
Hi @bommineniravali
I have tried all of the methods above, but the DLC validation results still match the FP32 results rather than the AIMET INT8 results. Why am I getting higher accuracy with INT8 on the target CPU than the AIMET INT8 results?
From your description it appears that you are using the SNPE or QNN SDK. Which backend did you choose for the run? Note that the ARM CPU backend will not run the model quantized, so you should see accuracy close to FP32.
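For reference, one typical way to exercise the actual quantized path with the SNPE tools is to quantize the DLC offline and then request the DSP runtime explicitly. This is a sketch of the usual invocation; the file names here are placeholders, and exact flags may differ between SDK versions:

```shell
# Quantize the DLC container offline (weights/activations become INT8).
snpe-dlc-quantize --input_dlc resnet50.dlc \
                  --input_list input_list.txt \
                  --output_dlc resnet50_quantized.dlc

# Run on the DSP runtime so the model actually executes quantized;
# on the default ARM CPU runtime the same model runs in FP32.
snpe-net-run --container resnet50_quantized.dlc \
             --input_list input_list.txt \
             --use_dsp
```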
After quantization the model size is not reduced; it remains the same as the FP32 model. Why is the model size not decreasing after quantization?
AIMET does not itself quantize the model. AIMET optimizes the model for quantization so that accuracy improves when the model is subsequently run on a quantized target. The model that comes out of AIMET is therefore still FP32, so you will not see a reduction in model size. But when you take the model to a target device, say using the Qualcomm Neural Processing SDK, you should see a smaller model and much faster inference on target.
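To make the point concrete, quantization simulation applies a quantize-dequantize ("fake quantization") step: each float is rounded to the nearest point on an integer grid and then mapped back to float. The values pick up quantization noise but remain 32-bit floats, which is why the exported model is the same size. The symmetric 8-bit scheme and function names below are illustrative assumptions, not AIMET's exact implementation:

```python
def fake_quantize(values, bitwidth=8):
    """Simulate INT8 quantization on a list of floats.

    Each value is snapped to the nearest representable grid point and
    mapped back to float. The output is still floating-point data, so
    the stored model does not shrink -- only the values change.
    """
    qmax = 2 ** (bitwidth - 1) - 1          # 127 for symmetric 8-bit
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / qmax                  # float step per integer level
    out = []
    for v in values:
        q = round(v / scale)                # integer grid index
        q = max(-qmax - 1, min(qmax, q))    # clamp to the int8 range
        out.append(q * scale)               # dequantize back to float
    return out

weights = [0.51, -1.27, 0.004, 0.99]
simulated = fake_quantize(weights)
# simulated values differ slightly from the originals (quantization
# noise), but each one is still a Python float, i.e. FP32-sized data.
```

This is exactly why on-target accuracy only reflects quantization when the runtime actually executes the model in INT8; the simulated model remains FP32 throughout.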
Hi, it is the predefined PyTorch ResNet-50 FP32 model. I exported the aimet_resnet50_int8 ONNX model (via the sim.export API), converted it into a DLC, and validated the DLC model with the ImageNet dataset on target, but I am getting higher accuracy than the AIMET INT8 results. I have used the techniques below on the AIMET side:
-> quantization simulation (tf_enhanced)
-> CLE
-> CLE + AdaRound + QuantSim
I have tried all of the methods above, but the DLC validation results still match the FP32 results rather than the AIMET INT8 results. Why am I getting higher accuracy with INT8 on the target CPU than the AIMET INT8 results? Also, after quantization the model is not reduced and remains the same size as the FP32 model; why is the model size not decreasing after quantization?
Could you please advise on these issues?