-
Do quantized models here remain quantized in ONNX after conversion? Can you even convert/export them to ONNX? What about the other way around? Can you export a sparse model to ONNX and quantize in…
-
SUMMARY:
- [x] Avoid full pass through the model for quantization modifier
- [x] Data free `oneshot`
- [x] Runtime of GPTQ with large models – how to do a 70B model?
- [x] Runtime of GPTQ with act…
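The "data free `oneshot`" item above boils down to computing quantization parameters from the weights alone, with no calibration data. As an illustrative sketch in plain Python (not the API of any particular library), per-tensor asymmetric int8 quantization looks like:

```python
# Illustrative per-tensor asymmetric int8 quantization, derived from the
# weight range alone (data-free). Names and structure are hypothetical,
# not any library's actual API.

def quantize_int8(weights, qmin=-128, qmax=127):
    lo, hi = min(weights), max(weights)
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # range must include zero
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale for all-zero weights
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    return [(v - zero_point) * scale for v in q]

if __name__ == "__main__":
    w = [-1.5, -0.2, 0.0, 0.7, 2.3]
    q, s, z = quantize_int8(w)
    w_hat = dequantize_int8(q, s, z)
    # round-to-nearest bounds the per-element reconstruction error by scale/2
    print(max(abs(a - b) for a, b in zip(w, w_hat)) <= s / 2 + 1e-9)
```

Methods like GPTQ refine the rounding decisions beyond this naive round-to-nearest, which is where the per-layer runtime cost for large (e.g. 70B) models comes from.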
-
**Describe the bug**
Even on AVX512-VNNI CPUs, sparse int8-quantized models are slow
**To Reproduce**
Steps to reproduce the behavior:
1. Use bagging_sq or single_sq segmentations for inferenc…
-
Hi, I am an engineer from Intel, and I work mostly on performance optimization of PyTorch on Intel Xeon CPUs (I am also the PyTorch module maintainer for CPU performance). I just came across this…
-
Would it be possible to reformat the quantized models of Mixtral 8x7B to run in sparse/base mode in LLaMA2-Accessory?
-
### 🐛 Describe the bug
I'm trying to build PyTorch on an Orange Pi PC (H3 Quad-core Cortex-A7), but for some reason I get
```
Error: unknown architecture `armv7-a;'
```
Is that semicolon in the wrong …
-
transformers 4.41.2
optimum-quanto 0.2.1
torch 2.3.1
Python 3.10.14
I performed this on a recent Google Cloud (GCP) VM with the Nvidia driver set up and a basic torch sanity test passing.
I tried to quant…
-
### Add Link
https://pytorch.org/tutorials/intermediate/realtime_rpi.html
### Describe the bug
I am getting 25-30 fps on my Raspberry Pi 4 with the provided snippet.
However, after finetuning mobilenet_v2 …
-
If I use torch==0.2.0, I get this error:
```
Traceback (most recent call last):
  File "example/mpii.py", line 352, in <module>
    main(parser.parse_args())
  File "example/mpii.py", line 107, in main
    tra…
```
-
Status: Draft
Updated: 09/18/2024
# Objective
In this doc we’ll cover how the different optimization techniques in torchao are structured and how to contribute to torchao.
# torchao Stack Ove…