-
Hi,
Thanks very much for your work and for publishing your code. I am currently working on integrating SpinQuant into [torch/ao](https://github.com/pytorch/ao/pull/983/), and I would like to cla…
-
I recently tried to save the model weights after applying R @ Weight as produced by SpinQuant, and found that quantization accuracy improved very little when hadamard_utils.get_hadK could not find a Hadamard matrix of the appropriate size…
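As context for the discussion above, the identity that makes folding R @ Weight into the saved checkpoint safe is that an orthogonal rotation applied to the weight is cancelled by the inverse rotation applied to the input. A minimal sketch, using a random orthogonal matrix as a stand-in for SpinQuant's learned rotation (the `random_orthogonal` and `merge_rotation` helpers are hypothetical names, not part of the SpinQuant code):

```python
import numpy as np

def random_orthogonal(n, seed=0):
    # Stand-in for SpinQuant's learned rotation R: the QR decomposition
    # of a Gaussian matrix yields an orthogonal Q (Q @ Q.T == I).
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def merge_rotation(weight, R):
    # Fold the rotation into the weight so only W @ R needs to be saved:
    # (W @ R) @ (R.T @ x) == W @ x for orthogonal R.
    return weight @ R

d_in, d_out = 64, 32
W = np.random.default_rng(1).standard_normal((d_out, d_in))
R = random_orthogonal(d_in)
x = np.random.default_rng(2).standard_normal(d_in)

y_ref = W @ x                                  # original layer output
y_rot = merge_rotation(W, R) @ (R.T @ x)       # rotated-then-unrotated path
assert np.allclose(y_ref, y_rot)
```

The accuracy benefit, however, comes only from quantizing the rotated weight; if `get_hadK` cannot supply a Hadamard of the right size, the rotation degenerates and the outlier suppression is lost, which matches the symptom described above.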
-
### 🐛 Describe the bug
Currently I'm trying to test the LLaMA 3.2 3B Instruct model as you guided.
However, I faced some issues during .pte generation for the LLaMA 3.2 3B Instruct model with QNN @ On Device sid…
-
### 🐛 Describe the bug
After the patch in https://github.com/pytorch/executorch/issues/6284#issuecomment-2423431020 fixed the original invalid UTF-8 character issue, there is a new issue with the tensor type …
-
### Right Case
When I follow the doc https://github.com/pytorch/executorch/blob/main/examples/models/llama/README.md#enablement,
I export the Llama3.2-1B-Instruct:int4-spinquant-eo8 model to xnnpa…
-
A new, interesting quantization scheme was published, which not only reduces memory consumption (like current quantization schemes) but also reduces computation.
> **[QuaRot: Outlier-Free 4-Bit In…
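The core idea behind QuaRot is that multiplying activations by an orthogonal Hadamard matrix spreads the energy of outlier channels across all coordinates, which shrinks the peak-to-RMS ratio that makes 4-bit quantization lossy. A minimal sketch of that effect (the Sylvester construction and injected outlier are illustrative assumptions, not QuaRot's actual kernels):

```python
import numpy as np

def hadamard(n):
    # Sylvester construction; n must be a power of two.
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)  # scale so H is orthonormal (H @ H.T == I)

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
x[7] = 40.0  # inject an activation outlier in one channel

H = hadamard(256)
x_rot = H @ x

# Peak-to-RMS ratio: high values force a wide quantization range
# that wastes the few bits available at 4-bit precision.
ratio = lambda v: np.max(np.abs(v)) / np.sqrt(np.mean(v**2))
print(ratio(x), ratio(x_rot))  # the rotated vector has a much smaller ratio
```

Because the rotation is orthogonal it can be folded into adjacent weight matrices, so the outlier suppression comes at essentially no inference cost.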
-
### 📚 The doc issue
I use this command to transform the model (Llama-3.2-1B):
```
python -m examples.models.llama.export_llama --checkpoint "${MODEL_DIR}/consolidated.00.pth" -p "${MODEL_DIR}/params.json" -…
```
-
I am following the [instructions in the Llama2 README](https://github.com/pytorch/executorch/blob/d9aeca556566104c2594ec482a673b9ec5b11390/examples/models/llama2/README.md#instructions) to test llama m…
-
Hello, everyone!
Thank you for your contributions to quantization work!
I would like to discuss some issues I encountered while reproducing the results of the SpinQuant paper.
I use the code fro…