quic / aimet

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
https://quic.github.io/aimet-pages/index.html

Quantization aware training for DSP using 16 bit activations #1883

Open AlmogDavid opened 1 year ago

AlmogDavid commented 1 year ago

Hi, I'm trying to train a QAT model with 16-bit activations and 8-bit weights in order to run it on the DSP of a Snapdragon 888. Training works as expected and the model converges. When I run the model on the Snapdragon CPU (after conversion to DLC and quantization with the SNPE SDK) I get valid results, but when I move to the DSP I see garbage results. Can you point me to a full tutorial on how to perform this flow? I was not able to find any guide on running a 16-bit activation model on the DSP.

This is the quantsim configuration I'm using:

```python
quantsim = QuantizationSimModel(
    model=fld_model.to(self.cfg.device),
    quant_scheme='tf_enhanced',
    dummy_input=next(iter(image_dl))[data_features.FACE_IMAGE].to(self.cfg.device),
    rounding_mode='nearest',
    default_output_bw=8,
    default_param_bw=16,
    in_place=False,
)
```

quic-mangal commented 1 year ago

@AlmogDavid Thanks for the query. @quic-akinlawo can you help respond to this? Thanks

quic-akinlawo commented 1 year ago

Hello @AlmogDavid, your quantsim configuration seems to indicate the opposite: it looks like you are using 16-bit weights, since you have `default_param_bw=16`.

There is no special 16-bit specific flow in either case. The following DSP tutorial should contain all the info you need: https://developer.qualcomm.com/sites/default/files/docs/snpe/tutorial_inceptionv3.html.
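For readers following along: the mix-up above comes from the two bitwidth arguments. In AIMET's `QuantizationSimModel`, `default_output_bw` sets the activation (layer output) bitwidth and `default_param_bw` sets the parameter (weight) bitwidth. A sketch of the 16-bit-activation / 8-bit-weight configuration the poster intends, assuming the AIMET 1.x PyTorch API (`model` and `dummy_input` are placeholders, not names from this thread):

```python
from aimet_torch.quantsim import QuantizationSimModel

# 16-bit activations, 8-bit weights:
quantsim = QuantizationSimModel(
    model=model,              # placeholder: your trained torch.nn.Module
    quant_scheme='tf_enhanced',
    dummy_input=dummy_input,  # placeholder: a representative input tensor
    rounding_mode='nearest',
    default_output_bw=16,     # activation (output) bitwidth
    default_param_bw=8,       # parameter (weight) bitwidth
    in_place=False,
)
```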

AlmogDavid commented 1 year ago

Thanks, I had a typo in my post here. In my code I'm actually using 16-bit activations (outputs) and 8-bit weights. I also opened an issue with SNPE; it looks like the DSP does not work in this configuration.

AlmogDavid commented 1 year ago

@quic-akinlawo, any update?

quic-akinlawo commented 1 year ago

@AlmogDavid From an AIMET point of view, the 16-bit activation / 8-bit weight configuration is supported. If you are seeing accuracy degradation, it could be an issue specific to your model running on the SNPE DSP. Since you have already raised an issue with the SNPE DSP team, please follow up with them; they are best placed to provide assistance and debugging tips.
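To make the bitwidth discussion concrete, here is a minimal fake-quantization sketch in plain Python (my illustration, not AIMET or SNPE internals) showing how much finer a 16-bit activation grid is than an 8-bit one, and why moving activations to 16 bits usually shrinks quantization error:

```python
def quantize_dequantize(values, bitwidth):
    """Fake-quantize a list of floats on a per-tensor asymmetric integer grid."""
    lo, hi = min(values), max(values)
    levels = (1 << bitwidth) - 1       # 255 for 8-bit, 65535 for 16-bit
    scale = (hi - lo) / levels or 1.0  # guard against a constant tensor
    # round-to-nearest onto the integer grid, then map back to float
    return [lo + round((v - lo) / scale) * scale for v in values]

acts = [0.0, 0.1234, 0.5678, 0.9, 1.0]
err8 = max(abs(a - q) for a, q in zip(acts, quantize_dequantize(acts, 8)))
err16 = max(abs(a - q) for a, q in zip(acts, quantize_dequantize(acts, 16)))
assert err16 < err8  # the 16-bit grid is ~256x finer, so its error is far smaller
```

Note this only models the numerics; a DSP producing garbage (rather than slightly degraded) output usually points at a runtime or conversion problem, not grid resolution.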