-
Had this idea and discussed briefly with @andrewor14.
Conceptually, the current QAT + FSDP flow looks like this:
- sharded FP32 weight -> all-gather in BF16 -> fake quantize
However, we can do low-…
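The fake-quantize step in that flow can be sketched minimally as follows (a generic symmetric int8, per-tensor scheme in plain PyTorch; this is an illustration, not torchao's actual implementation, and the bit width is an assumption):

```python
import torch

def fake_quantize(w: torch.Tensor, bits: int = 8) -> torch.Tensor:
    """Quantize to a symmetric integer grid, then dequantize immediately,
    so the forward pass sees quantization error while staying in float."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for int8
    scale = w.abs().max().clamp(min=1e-8) / qmax    # one per-tensor scale
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return q * scale

# Mimicking the flow above: the all-gathered BF16 weight is
# fake-quantized in FP32, then cast back.
w_bf16 = torch.randn(16, 32, dtype=torch.bfloat16)
w_fq = fake_quantize(w_bf16.float()).to(torch.bfloat16)
```

In actual QAT the `round` would be wrapped in a straight-through estimator so gradients pass through unchanged.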
-
I'm currently using AIMET 1.29. When I do QAT with the per-channel config, training takes almost 50x longer than with the default config. Is there a solution?
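For context on where the extra work comes from: per-channel quantization keeps one scale per output channel instead of a single per-tensor scale, so range tracking and scale updates multiply with the channel count. A generic PyTorch sketch of the two scale computations (hypothetical helper names, not AIMET code):

```python
import torch

def per_tensor_scale(w: torch.Tensor, bits: int = 8) -> torch.Tensor:
    # A single scalar scale for the whole tensor.
    qmax = 2 ** (bits - 1) - 1
    return w.abs().max().clamp(min=1e-8) / qmax

def per_channel_scale(w: torch.Tensor, bits: int = 8) -> torch.Tensor:
    # One scale per output channel (dim 0): many more statistics
    # to track and update on every training step.
    qmax = 2 ** (bits - 1) - 1
    amax = w.abs().amax(dim=tuple(range(1, w.dim())), keepdim=True)
    return amax.clamp(min=1e-8) / qmax

w = torch.randn(64, 128)
s = per_channel_scale(w)                       # shape (64, 1)
w_fq = torch.clamp(torch.round(w / s), -128, 127) * s
```

Whether that fully explains a 50x slowdown depends on the framework's implementation, but it shows why per-channel is inherently more expensive.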
-
**Problem description**
Following PaddleDetection's [model compression docs](https://github.com/PaddlePaddle/PaddleDetection/blob/8377e846439a709f5ab3ac6948d768221b5cf1e6/configs/slim/README.md), I ran quantization-aware training on my trained PP-YOLOE+ s model. The quantized model, and the one exported as…
-
### 💡 Your Question
I have followed exactly the same steps for model training followed by PTQ and QAT as described in the official SuperGradients notebook:
https://github.com/Deci-AI/super-gradients/blob…
-
I am trying to use QAT to quantize the Qwen2 1.5B model.
The error is raised from the function `training.load_from_full_model_state_dict(
model, model_state_dict, self._device, self._is_rank_zero, strict=T…
-
### 🚀 The feature, motivation and pitch
Currently the qnn quantizer only supports PTQ (post-training quantization), and we'd like to enable QAT (quantization-aware training) for better quantization supp…
-
Thanks for your contributions. I see you implemented and evaluated QAT for YOLOv9 sizes c and e. Can I run QAT on YOLOv9 sizes m, s, and n, and what performance would they achieve?
-
Currently torchao QAT has two APIs, [tensor subclasses](https://github.com/pytorch/ao/blob/a4221df5e10ff8c33854f964fe6b4e00abfbe542/torchao/quantization/prototype/qat/api.py#L41) and [module swap](htt…
-
In our CI setup for HW testing, we want QAT Engine to return an error code if the HW fails. Currently, we build a custom engine with:
```
--disable-qat_sw \
--with-qat_engine_id=qathwtest && \…
-
I tried the original QAT code.
```
model = llama3(
vocab_size=4096,
num_layers=16,
num_heads=16,
num_kv_heads=4,
embed_dim=2048,
max_seq_len=2048,…