neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

Question on quantization size #1429

Closed rajuptvs closed 9 months ago

rajuptvs commented 9 months ago

Hey Team, First of all thank you for all your wonderful work on quantizing models for the community.

I have some questions on quantization using sparseml and sparsezoo. I have been trying to perform Sparse Transfer Learning With a Custom Dataset mainly using yolov8s model as below

!sparseml.ultralytics.train \
  --model "zoo:cv/detection/yolov8-s/pytorch/ultralytics/coco/pruned65-none" \
  --recipe "zoo:cv/detection/yolov8-s/pytorch/ultralytics/voc/pruned65_quant-none" \
  --data /content/datasets/Sphero-Robot-detection-8/data.yaml \
  --recipe_args '{"num_epochs":15, "qat_start_epoch": 10, "observer_freeze_epoch": 12, "bn_freeze_epoch": 12}' \
  --batch 8

My main question: the usual .pt files for yolov8s models before and after training are in the range of 20-30 MB, but with the recipe above, the model that gets saved is in the range of 120-130 MB. I was under the impression that a pruned and quantized model should usually be smaller, in the 6-8 MB range shown in the SparseZoo.
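(For rough context on the numbers involved: a quantization-aware-training checkpoint typically still stores full fp32 weights plus observer/optimizer state, so the .pt file can be larger than the fp32 baseline; the int8 size reduction usually only appears after export, e.g. to ONNX. The sketch below is back-of-the-envelope arithmetic only, assuming an approximate ~11.2M parameter count for YOLOv8s, ignoring file-format overhead.)

```python
# Back-of-the-envelope storage arithmetic for a YOLOv8s-sized model.
# NUM_PARAMS is an approximation (~11.2M parameters), not an exact figure.

def model_bytes(num_params: int, bytes_per_weight: int) -> int:
    """Raw weight storage only; ignores file-format overhead and metadata."""
    return num_params * bytes_per_weight

NUM_PARAMS = 11_200_000

fp32_mb = model_bytes(NUM_PARAMS, 4) / 1e6  # fp32: 4 bytes per weight
int8_mb = model_bytes(NUM_PARAMS, 1) / 1e6  # int8: 1 byte per weight

print(f"fp32 weights: ~{fp32_mb:.0f} MB")  # ~45 MB
print(f"int8 weights: ~{int8_mb:.0f} MB")  # ~11 MB

# A QAT training checkpoint that also carries optimizer moments
# (e.g. two Adam moment tensors per weight) can approach 3x the
# fp32 weight size, which is in the 120-130 MB ballpark observed.
checkpoint_mb = fp32_mb * 3
print(f"QAT checkpoint (weights + 2 optimizer moments): ~{checkpoint_mb:.0f} MB")
```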

Am I doing something wrong or is this usual?

Thank you in advance, Raju

jeanniefinks commented 9 months ago

Greetings @rajuptvs Because this is a duplicate issue of another one you posted in our sparseml repo, I am going to go ahead and close this one out. We have provided a response to that issue here: https://github.com/neuralmagic/sparseml/issues/1854#issuecomment-1832682153

Thank you for your inquiry! Jeannie / Neural Magic