-
Hi, I am trying to use IPEX to quantize a UNet model following https://github.com/intel/intel-extension-for-pytorch/blob/v1.12.0/docs/tutorials/features/int8.md.
Now the model can be quantized, but the…
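For reference, this is roughly the static int8 flow that tutorial describes (IPEX v1.12 API); the tiny conv net and random calibration batches below are placeholders for the real UNet and data:
```python
import torch
import torch.nn as nn
import intel_extension_for_pytorch as ipex
from intel_extension_for_pytorch.quantization import prepare, convert

# tiny conv net standing in for the UNet
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 3, 3, padding=1),
).eval()
example_input = torch.randn(1, 3, 64, 64)

# static int8 qconfig from the tutorial
qconfig = ipex.quantization.default_static_qconfig
prepared = prepare(model, qconfig, example_inputs=example_input, inplace=False)

# calibration: run a few representative batches through the prepared model
with torch.no_grad():
    for _ in range(4):
        prepared(torch.randn(1, 3, 64, 64))

quantized = convert(prepared)

# trace and freeze to materialize the optimized int8 graph
with torch.no_grad():
    traced = torch.jit.trace(quantized, example_input)
    traced = torch.jit.freeze(traced)
    out = traced(example_input)
```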
-
To add quantization support for the KV cache, stored in the state dict.
Static quantization (as for activations) is needed for performance.
Dynamic quantization can be added for completeness.
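To make the static-vs-dynamic distinction concrete, here is a minimal sketch on a stand-in KV tensor; the shapes and the calibrated scale are purely illustrative, not the actual cache layout:
```python
import torch

def quantize_int8(x: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # symmetric per-tensor int8 quantization
    return torch.clamp(torch.round(x / scale), -128, 127).to(torch.int8)

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

kv = torch.randn(2, 16, 128, 64)     # stand-in K or V: [batch, heads, seq, head_dim]

# static: a scale calibrated offline and stored in the state dict, so the
# runtime path is a single multiply-round-clamp (cheap, good for performance)
static_scale = torch.tensor(0.05)    # assumed calibrated value
kv_static = quantize_int8(kv, static_scale)

# dynamic: the scale is derived from the live tensor, costing an extra
# reduction per step but adapting to the actual value range (completeness)
dynamic_scale = kv.abs().max() / 127.0
kv_dynamic = quantize_int8(kv, dynamic_scale)

print((dequantize(kv_static, static_scale) - kv).abs().max())
print((dequantize(kv_dynamic, dynamic_scale) - kv).abs().max())
```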
-
I see that there is full int8 support (both weights and activations) for BERT; it's not clear to me what is supported for GPT models ([here](https://github.com/NVIDIA/FasterTransformer/blob/main/exampl…
-
While testing OPT with `quant_lm_head=True`, here are the resulting weight keys after quantization:
`weight keys: ['lm_head.g_idx', 'lm_head.qweight', 'lm_head.qzeros', 'lm_head.scales', 'model.decoder.em…
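A quick way to confirm which modules were actually packed is to scan the state dict for the GPTQ-style tensors; a hypothetical helper, with `model` standing for the quantized OPT model above:
```python
def quantized_module_names(model):
    # modules that ended up with GPTQ-style packed weights
    sd = model.state_dict()
    return sorted({k.rsplit(".", 1)[0] for k in sd if k.endswith(".qweight")})

# with quant_lm_head=True, 'lm_head' should appear alongside the decoder layers
print(quantized_module_names(model))   # `model` is the quantized OPT model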
-
We encourage you to join the [MLX Community](https://huggingface.co/mlx-community) on Hugging Face 🤗 and upload new MLX converted models and versions of existing models.
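For context, conversion and upload typically goes through `mlx_lm`'s convert utility; the sketch below assumes its Python API (parameter names may differ between versions), and both repo names are placeholders:
```python
# sketch assuming mlx_lm's convert utility; argument names may differ by version
from mlx_lm import convert

convert(
    hf_path="mistralai/Mistral-7B-v0.1",                # hypothetical source model
    mlx_path="mlx_model",                               # local output directory
    quantize=True,                                      # quantize during conversion
    upload_repo="mlx-community/Mistral-7B-v0.1-4bit",   # hypothetical target repo
)
```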
-
### 🐛 Describe the bug
## Description
The outputs of the fully quantized and fake quantized models do not match, with the fully quantized model not matching the expected analytical results for a minima…
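For reference, here is what agreement between the two paths looks like in a minimal analytic case (per-tensor symmetric scales, values chosen to sit exactly on the int8 grid; the integer matmul is simulated exactly in fp32):
```python
import torch

torch.manual_seed(0)
s_w, s_x = 0.5, 0.25                             # per-tensor scales
w = torch.randint(-8, 8, (4, 4)).float() * s_w   # weights exactly on the int8 grid
x = torch.randint(-8, 8, (1, 4)).float() * s_x   # activations exactly on the grid

# fake quantized: quantize-dequantize in float, then a float matmul
w_fq = torch.round(w / s_w).clamp(-128, 127) * s_w
x_fq = torch.round(x / s_x).clamp(-128, 127) * s_x
y_fake = x_fq @ w_fq.t()

# fully quantized: integer matmul (simulated exactly in fp32), rescaled once at the end
w_q = torch.round(w / s_w).clamp(-128, 127)
x_q = torch.round(x / s_x).clamp(-128, 127)
y_full = (x_q @ w_q.t()) * (s_w * s_x)

print(torch.allclose(y_fake, y_full))   # True: the two paths agree analytically here
```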
-
### Describe the bug
Using quantization yields only minimal speedups on an A100.
### Your environment
#### OS
```
$ uname -a
Linux jean-zay4 4.18.…
```
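When triaging a report like this, it helps to time the kernels in isolation; a minimal sketch of a CUDA-event harness (an fp16 matmul is shown as the baseline; substitute the quantized op under test):
```python
import torch

def cuda_time_ms(fn, iters=100, warmup=10):
    # average wall time of a CUDA callable, measured with events after a warmup
    for _ in range(warmup):
        fn()
    torch.cuda.synchronize()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        fn()
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters

x = torch.randn(16, 4096, device="cuda", dtype=torch.float16)
w = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
print("fp16 matmul:", cuda_time_ms(lambda: x @ w.t()), "ms")
```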
-
Hi,
Without using transformers / accelerate and so on, what are the constraints on a model for it to be tensor-parallelizable?
Does it need to be an nn.Sequential? Do input dimensions need to be alwa…
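For intuition, the core requirement is just that each weight has a dimension you can shard and a cheap way to recombine the partial results; no nn.Sequential is needed. A single-process sketch of a column-parallel split:
```python
import torch
import torch.nn as nn

# column-parallel split of a single Linear: each shard owns a slice of the
# output features, and a concat recombines the partial results
full = nn.Linear(8, 6, bias=False)
w0, w1 = full.weight.chunk(2, dim=0)   # shard the output dim across two "devices"

x = torch.randn(3, 8)
y_full = full(x)
y_sharded = torch.cat([x @ w0.t(), x @ w1.t()], dim=-1)
print(torch.allclose(y_full, y_sharded, atol=1e-6))   # True
```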
-
## 🐛 Bug
Seeing errors when trying to trace simple models based on `nn.Sequential`:
```
Traceback (most recent call last):
  File "/home/vasiliy/nfs/pytorch_scripts/gm_sequential_bug.py", line…
```
-
When I quantize a model, the average loss is lower in earlier layers (0.02) than in later layers (2.0). I'm curious whether the quantization has failed because of such a large average loss?
And from experience, …
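One way to judge whether a layerwise loss of around 2.0 is actually harmful is to measure the relative output error per layer rather than the raw loss; a minimal sketch with a placeholder layer and input (symmetric per-tensor weight quantization):
```python
import torch
import torch.nn as nn

def layer_quant_error(layer: nn.Linear, x: torch.Tensor, bits: int = 8) -> float:
    # relative output error after symmetric per-tensor weight quantization
    qmax = 2 ** (bits - 1) - 1
    scale = layer.weight.abs().max() / qmax
    w_q = torch.round(layer.weight / scale).clamp(-qmax, qmax) * scale
    y = x @ layer.weight.t()
    y_q = x @ w_q.t()
    return ((y - y_q).norm() / y.norm()).item()

layer = nn.Linear(64, 64, bias=False)   # placeholder layer
x = torch.randn(8, 64)                  # placeholder activations
print(layer_quant_error(layer, x))
```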