facebookresearch / diffq

DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight or group of weights, in order to achieve a given trade-off between model size and accuracy.

Why doesn't checkpoint.pth in the output folder match the true model size? #11

Closed Eurus-Holmes closed 2 years ago

Eurus-Holmes commented 2 years ago

❓ Questions

For example, when I fine-tune a pretrained ViT model with LSQ on the CIFAR-10 dataset, the reported true model size is 41.20 MB, but in the ./outputs folder, checkpoint.th is 686 MB. Why doesn't it match the true model size?

[Screenshot: Screen Shot 2022-03-02 at 20 40 18]
adefossez commented 2 years ago

The checkpoint file contains everything required to resume training. In particular, during training the weights are kept in float32 (this is required: otherwise the small gradient updates would never change a quantized value), along with the full optimizer state (momentum, squared gradients, etc.). To get a small, shippable model, call solver.quantizer.get_quantized_state() at the end of training. If you torch.save what it returns, the file should have the expected model size.
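A back-of-the-envelope calculation shows why the training checkpoint dwarfs the quantized export. The parameter count below is made up for illustration (not taken from the question): float32 weights plus Adam's two float32 moment buffers already triple the storage, while a quantized export at, say, 4 bits per weight is an eighth of the float32 weights alone.

```python
def checkpoint_size_bytes(num_params: int) -> int:
    # Training checkpoint: float32 weights (4 bytes each) plus Adam's
    # exp_avg and exp_avg_sq buffers, also float32 -> 3x the weight storage.
    return num_params * 4 * 3

def quantized_size_bytes(num_params: int, bits_per_weight: float) -> int:
    # Shippable export: only the quantized weights, at the chosen bit width.
    # (Ignores small per-group metadata that a real quantizer would add.)
    return int(num_params * bits_per_weight / 8)

# Hypothetical parameter count, roughly ViT-Base scale.
n = 86_000_000
print(checkpoint_size_bytes(n) / 1e6, "MB")        # training checkpoint
print(quantized_size_bytes(n, 4) / 1e6, "MB")      # 4-bit export
```

This is only a sketch of the bookkeeping; the actual numbers depend on the model, the optimizer, and the bit widths DiffQ settles on per weight group.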

Eurus-Holmes commented 2 years ago

Got it, thanks!