-
Do quantized models here remain quantized in ONNX after conversion? Can you even convert/export them to ONNX? What about the other way around? Can you export a sparse model to ONNX and quantize in…
-
SUMMARY:
- [x] Avoid full pass through the model for quantization modifier
- [x] Data free `oneshot`
- [x] Runtime of GPTQ with large models – how to do a 70B model?
- [x] Runtime of GPTQ with act…
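The "data free `oneshot`" item above boils down to computing quantization parameters from the weights alone, with no calibration data. As an illustrative sketch in plain Python (not the API of any particular library), per-tensor asymmetric int8 quantization looks like:

```python
# Illustrative per-tensor asymmetric int8 quantization, derived from the
# weight range alone (data-free). Names and structure are hypothetical,
# not any library's actual API.

def quantize_int8(weights, qmin=-128, qmax=127):
    lo, hi = min(weights), max(weights)
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # range must include zero
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale for all-zero weights
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    return [(v - zero_point) * scale for v in q]

if __name__ == "__main__":
    w = [-1.5, -0.2, 0.0, 0.7, 2.3]
    q, s, z = quantize_int8(w)
    w_hat = dequantize_int8(q, s, z)
    # round-to-nearest bounds the per-element reconstruction error by scale/2
    print(max(abs(a - b) for a, b in zip(w, w_hat)) <= s / 2 + 1e-9)
```

Methods like GPTQ refine the rounding decisions beyond this naive round-to-nearest, which is where the per-layer runtime cost for large (e.g. 70B) models comes from.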
-
**Describe the bug**
Even on AVX512-VNNI CPUs, sparse int8-quantized models are slow
**To Reproduce**
Steps to reproduce the behavior:
1. Use bagging_sq or single_sq segmentations for inferenc…
-
Hi, I am an engineer from Intel, and I work mostly on performance optimization of PyTorch on Intel Xeon CPUs (I am also the PyTorch module maintainer for CPU performance). I just came across this…
-
Would it be possible to reformat the quantized models of Mixtral 8x7B to run in sparse/base mode in LLaMA2-Accessory?
-
### 🐛 Describe the bug
I'm trying to build PyTorch on an Orange Pi PC (H3 Quad-core Cortex-A7), but for some reason I get
```
Error: unknown architecture `armv7-a;'
```
Is that semicolon in the wrong …
-
transformers 4.41.2
optimum-quanto 0.2.1
torch 2.3.1
Python 3.10.14
I performed this on a recent Google Cloud (GCP) VM with the Nvidia driver set up and a basic torch sanity test passing.
I tried to quant…
-
### Add Link
https://pytorch.org/tutorials/intermediate/realtime_rpi.html
### Describe the bug
I am getting 25-30 fps on my Raspberry Pi 4 with the provided snippet.
However, after finetuning mobilenet_v2 …
-
If I use torch==0.2.0, I get this error:
```
Traceback (most recent call last):
  File "example/mpii.py", line 352, in <module>
    main(parser.parse_args())
  File "example/mpii.py", line 107, in main
    tra…
```
-
Status: Draft
Updated: 09/18/2024
# Objective
In this doc we’ll cover how the different optimization techniques in torchao are structured and how to contribute to torchao.
# torchao Stack Ove…