Support QAT in QCOM qnn backend

pytorch / executorch

On-device AI across mobile, embedded and edge for PyTorch

https://pytorch.org/executorch/

Support QAT in QCOM qnn backend #6212

Open cccclai opened 1 month ago

cccclai commented 1 month ago

🚀 The feature, motivation and pitch

Currently the QNN quantizer only supports PTQ (post-training quantization). We'd like to enable QAT (quantization-aware training) for better quantization support.
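For readers less familiar with the distinction, here is a minimal, framework-free sketch of the idea. It is an illustration only: the function names (`fake_quantize`, `ptq_scale`, `qat_step`) are hypothetical and are not part of the QNN quantizer or any PyTorch API. It assumes symmetric int8 quantization and a single scalar weight. PTQ picks the scale from values observed after training, while QAT runs the quantize-dequantize step inside the training loop and passes gradients through the rounding unchanged (the straight-through estimator).

```python
# Illustrative sketch only; not the QNN quantizer API.

def fake_quantize(x, scale, zero_point=0, qmin=-128, qmax=127):
    """Snap x to the integer grid, clamp to int8, then map back to float."""
    q = round(x / scale) + zero_point
    q = max(qmin, min(qmax, q))
    return (q - zero_point) * scale


def ptq_scale(values, qmax=127):
    """PTQ-style calibration: symmetric scale from values seen after training."""
    return max(abs(v) for v in values) / qmax


def qat_step(w, x, y, scale, lr=0.1):
    """One SGD step on (fake_quantize(w) * x - y)**2.

    The forward pass sees the quantized weight; the backward pass treats
    d(fake_quantize(w))/dw as 1 (straight-through estimator). For brevity
    this sketch does not zero the gradient outside the clamp range.
    """
    w_q = fake_quantize(w, scale)
    grad = 2 * (w_q * x - y) * x
    return w - lr * grad
```

In this sketch the float weight `w` keeps being updated while the loss is always computed on its quantized value, which is what lets training adapt to quantization error.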

Alternatives

Use PTQ

Additional context

No response

RFC (Optional)

No response

cccclai commented 1 month ago

Hi @chiwwang, @navsud is our quantization expert and is also looking into QAT for QNN. Maybe we can coordinate and enable QAT together.

chiwwang commented 1 month ago

Nice! ++ @chunit-quic, who is prototyping QAT in the QNN quantizer.

chiwwang commented 1 month ago

We have a prototype, https://github.com/pytorch/executorch/pull/6222, which is more of a kickoff for our discussions. It might be incorrect... QAT is really new to us, so please feel free to advise and give directions! (Also, which model would be suitable as the first QAT target? We need some E2E verification... 🤔)