-
-
FP8 or AWQ quant
-
I'd like to raise a concern about how quantization is currently handled in SpeechBrain. While training my own k-means quantizer on the last layer of an ASR model, I noticed that the interface was not …
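For context, here is a minimal NumPy sketch of what such a k-means quantizer does: fit a codebook on frame-level hidden states and map each frame to its nearest centroid. This is not the SpeechBrain interface; all names, shapes, and the toy features are illustrative.

```python
import numpy as np

def train_kmeans_codebook(feats, k=8, iters=20, seed=0):
    """Lloyd's k-means over frame-level features; returns a (k, dim) codebook."""
    rng = np.random.default_rng(seed)
    codebook = feats[rng.choice(len(feats), size=k, replace=False)]
    for _ in range(iters):
        # assign each frame to its nearest centroid
        d = np.linalg.norm(feats[:, None, :] - codebook[None, :, :], axis=-1)
        assign = d.argmin(axis=1)
        # move each centroid to the mean of its assigned frames
        for j in range(k):
            if np.any(assign == j):
                codebook[j] = feats[assign == j].mean(axis=0)
    return codebook

def quantize(feats, codebook):
    """Map each frame to the index of its nearest centroid (discrete units)."""
    d = np.linalg.norm(feats[:, None, :] - codebook[None, :, :], axis=-1)
    return d.argmin(axis=1)

# toy stand-in for last-layer hidden states of an ASR encoder
feats = np.random.default_rng(1).standard_normal((500, 32)).astype(np.float32)
cb = train_kmeans_codebook(feats, k=8)
units = quantize(feats, cb)
```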
-
Provide an approach that allows fine-tuning LLMs with LoRA more efficiently.
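For reference, a minimal NumPy sketch of the LoRA idea itself: the base weight stays frozen and only a low-rank update `B @ A` (scaled by `alpha / rank`) is trained. Names and dimensions are illustrative, not any particular library's API.

```python
import numpy as np

class LoRALinear:
    """Minimal LoRA sketch: frozen weight W plus a trainable low-rank update."""
    def __init__(self, in_dim, out_dim, rank=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((out_dim, in_dim)) * 0.02  # frozen base weight
        self.A = rng.standard_normal((rank, in_dim)) * 0.01     # trainable down-projection
        self.B = np.zeros((out_dim, rank))                      # trainable up-projection, zero init
        self.scale = alpha / rank

    def forward(self, x):
        # x: (batch, in_dim) -> (batch, out_dim); LoRA path is W + scale * B @ A
        return x @ self.W.T + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(16, 8)
x = np.ones((2, 16))
out = layer.forward(x)
```

Because `B` starts at zero, the adapted layer initially matches the frozen base layer exactly, which is the standard LoRA initialization.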
-
Could you please provide the code for training the quantization-aware accuracy predictor, or for creating the dataset used to train it?
-
I have migrated the method to the Qwen-VL model and evaluated it with VLMEvalKit on several visual tasks under int2 quantization. The specific link is as follows: [quip-sharp-qwenvl](https://github.c…
-
https://developer.nvidia.com/zh-cn/blog/nvidia-tensorrt-llm-revs-up-inference-for-google-gemma/
This post says Gemma supports quantization; does RecurrentGemma support quantization as well?
-
Implement this paper as a new quantization type: https://arxiv.org/abs/2405.12497
-
Hello, thank you for sharing the source code of your work. I would like to understand this error related to annotation, which occurred across all the different models:
------------------------------…
-
### 🚀 The feature, motivation and pitch
With a single command, quantize the same model across every available quant scheme and configuration and output a table that compares the results. This will …
byjlw updated 3 weeks ago
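A minimal sketch of what such a sweep could look like, using uniform symmetric fake-quantization at a few bit widths as a stand-in for real quant schemes and reconstruction MSE as a stand-in for real eval metrics. The scheme list, metric, and table layout are all illustrative.

```python
import numpy as np

def fake_quantize(w, bits):
    """Uniform symmetric fake-quantization: snap weights to a bits-wide grid."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(4096)  # toy stand-in for a model's weights

# run every "scheme" and collect one comparison row per config
rows = {}
print(f"{'scheme':<8}{'mse':>12}")
for bits in (8, 4, 2):
    mse = float(np.mean((w - fake_quantize(w, bits)) ** 2))
    rows[bits] = mse
    print(f"int{bits:<5}{mse:>12.3e}")
```

The real version would substitute actual quantization backends and task metrics for `fake_quantize` and MSE, but the shape of the command, loop over configs then emit one table, stays the same.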