per channel QAT is really slow

quic / aimet

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.

https://quic.github.io/aimet-pages/index.html


per channel QAT is really slow #3508

Open shuyuan-wang opened 3 days ago

shuyuan-wang commented 3 days ago

I'm currently using AIMET 1.29. When I run QAT with the per_channel config, it takes almost 50x longer than with the default config. Is there a solution?

quic-mtuttle commented 1 day ago

Hi @shuyuan-wang, could you provide a bit more information on how you are instantiating your QuantizationSimModel and running QAT (e.g., which QuantScheme you are using)?
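For readers unfamiliar with the distinction being discussed, here is a minimal pure-Python sketch of what per-tensor versus per-channel weight quantization means. It is not AIMET code and makes no claim about the cause of the slowdown reported above; the function names (`symmetric_scale`, `fake_quantize`, `quantize_per_tensor`, `quantize_per_channel`) and the toy weight matrix are hypothetical, and a symmetric 8-bit scheme is assumed purely for illustration.

```python
def symmetric_scale(values, num_bits=8):
    """Scale that maps the largest magnitude in `values` to the edge of the integer grid."""
    q_max = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    return max_abs / q_max if max_abs > 0 else 1.0


def fake_quantize(values, scale, num_bits=8):
    """Quantize to the integer grid, clamp, then dequantize back to floats."""
    q_max = 2 ** (num_bits - 1) - 1
    out = []
    for v in values:
        q = round(v / scale)
        q = max(-q_max, min(q_max, q))
        out.append(q * scale)
    return out


def quantize_per_tensor(weight, num_bits=8):
    """One scale shared by every output channel (row) of the weight matrix."""
    flat = [v for row in weight for v in row]
    scale = symmetric_scale(flat, num_bits)
    return [fake_quantize(row, scale, num_bits) for row in weight], [scale]


def quantize_per_channel(weight, num_bits=8):
    """One scale per output channel (row), so there are as many scales as rows."""
    scales = [symmetric_scale(row, num_bits) for row in weight]
    quantized = [fake_quantize(row, s, num_bits) for row, s in zip(weight, scales)]
    return quantized, scales


def max_error(original, quantized):
    """Largest absolute difference between two equally shaped nested lists."""
    return max(abs(x - y) for ro, rq in zip(original, quantized) for x, y in zip(ro, rq))


# Hypothetical weights: two output channels with very different magnitudes.
weight = [[0.01, -0.02, 0.015], [1.0, -2.0, 1.5]]
pt_w, pt_scales = quantize_per_tensor(weight)
pc_w, pc_scales = quantize_per_channel(weight)

print("per-tensor scales: ", pt_scales)
print("per-channel scales:", pc_scales)
print("small-channel error, per-tensor: ", max_error([weight[0]], [pt_w[0]]))
print("small-channel error, per-channel:", max_error([weight[0]], [pc_w[0]]))
```

Per-channel quantization keeps one scale per output channel, so a simulator has more quantization parameters to track (and, with range-learning schemes, to train) than in the per-tensor case. Whether that accounts for the 50x difference depends on the setup details requested above.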