microsoft / nni

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
https://nni.readthedocs.io
MIT License
14.06k stars 1.82k forks

What is the correct order to use DistributedDataParallel and QAT Quantizer? #5698

Open neesetifa opened 1 year ago

neesetifa commented 1 year ago

Describe the issue:

Environment:

Configuration:

Log message:

How to reproduce it?: I'm trying to do QAT with DDP, but I'm confused about the order in which the optimizer should be initialized. According to the official PyTorch code, the optimizer should be defined after the model is wrapped in DDP. But in NNI, the example in https://github.com/microsoft/nni/blob/master/nni/compression/quantization/qat_quantizer.py shows that we should create the optimizer first, pass it into the evaluator, and then let the QAT Quantizer wrap the model. I can't find any example code for DDP + QAT. Could anyone help?
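
For reference, the PyTorch-side ordering I mean looks roughly like the sketch below. It is a minimal illustration only, with several assumptions: a single CPU process with the `gloo` backend and a file-based rendezvous so it runs without a launcher, a toy `nn.Linear` standing in for the real model, and hypothetical helper names (`build_ddp_model_and_optimizer`, `run_single_process_demo`). It does not include any NNI calls, so it does not show where the QAT Quantizer should go.

```python
import os
import tempfile

import torch
import torch.distributed as dist
from torch import nn
from torch.nn.parallel import DistributedDataParallel as DDP


def build_ddp_model_and_optimizer():
    """Wrap the model in DDP first, then build the optimizer from the wrapped model."""
    model = nn.Linear(4, 2)      # hypothetical stand-in for the real network
    ddp_model = DDP(model)       # CPU model, so no device_ids are passed
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
    return model, ddp_model, optimizer


def run_single_process_demo():
    """Run one training step in a one-process group (assumption: CPU + gloo)."""
    # File-based rendezvous avoids depending on a free TCP port.
    init_file = os.path.join(tempfile.mkdtemp(), "ddp_init")
    dist.init_process_group(
        backend="gloo",
        init_method=f"file://{init_file}",
        rank=0,
        world_size=1,
    )
    try:
        model, ddp_model, optimizer = build_ddp_model_and_optimizer()

        # One forward/backward/step through the DDP wrapper.
        loss = ddp_model(torch.randn(8, 4)).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Check whether the wrapper exposes the same Parameter objects
        # as the original, unwrapped module.
        same_params = all(
            a is b for a, b in zip(model.parameters(), ddp_model.parameters())
        )
        return loss.item(), same_params
    finally:
        dist.destroy_process_group()


if __name__ == "__main__":
    print(run_single_process_demo())
```

In this sketch the optimizer ends up holding the same `Parameter` objects as the unwrapped model, because DDP keeps the original module as a submodule. Whether that makes the two orders interchangeable once the QAT Quantizer wraps the model is the part I am unsure about.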