-
Even with my model, which is less than 250 KB in size, I still get a separate onnx_data file after quantization.
https://github.com/onnx/neural-compressor/blob/aabbf967cf7ea91c078c28c7b4dab043add5257b/onnx_neural…
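For reference, a minimal sketch (not the neural-compressor API itself) of how the external data could be folded back into a single file with the `onnx` Python package, assuming the quantized model and its onnx_data file sit in the same directory; the file names are illustrative.
```python
# Hypothetical workaround sketch: merge the external onnx_data back into one .onnx file.
# File names below are illustrative.
import onnx

# onnx.load pulls in the external onnx_data found next to the model file.
model = onnx.load("model_quantized.onnx")

# Re-save with all initializers stored inline instead of as external data.
onnx.save_model(model, "model_quantized_inline.onnx", save_as_external_data=False)
```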
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…
-
### Description
To optimize disk usage for KNN vector searches in Lucene, I propose adding a new KnnVectorsFormat class to Lucene that handles only quantized vectors, eliminating the …
-
### Model Series
Qwen2.5
### What are the models used?
Qwen/Qwen2.5-1.5B-Instruct
### What is the scenario where the problem happened?
inference with [vllm]
### Is this a known issue?
- …
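For context, a minimal sketch of the reported scenario (offline inference with vLLM on Qwen/Qwen2.5-1.5B-Instruct); the prompt and sampling settings are illustrative.
```python
# Minimal sketch of the reported scenario: vLLM offline inference with Qwen2.5-1.5B-Instruct.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-1.5B-Instruct")
params = SamplingParams(temperature=0.7, max_tokens=128)  # illustrative settings
outputs = llm.generate(["Give me a short introduction to large language models."], params)
print(outputs[0].outputs[0].text)
```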
-
### This is my env version:
```
torch:2.2.1
transformers: 4.39.0.dev0
vllm: custom compile at master@24aecf421a4ad5989697010963074904fead9a1b
```
### I use SqueezeLLM to quantize my llama-7B tr…
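For context, a minimal sketch of loading a SqueezeLLM-quantized Llama-7B in vLLM, assuming the checkpoint directory already contains SqueezeLLM-packed weights; the path is illustrative.
```python
# Minimal sketch: running a SqueezeLLM-quantized Llama-7B with vLLM.
# "./llama-7b-squeezellm" is an illustrative path to a SqueezeLLM-packed checkpoint.
from vllm import LLM, SamplingParams

llm = LLM(model="./llama-7b-squeezellm", quantization="squeezellm")
params = SamplingParams(temperature=0.0, max_tokens=64)
print(llm.generate(["Hello, my name is"], params)[0].outputs[0].text)
```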
-
Similar to affine quantization, we can implement codebook or lookup-table-based quantization, which is another popular type of quantization, especially for lower bit widths like 4 bits or below (used in ht…
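As a rough illustration of the idea (not a torchao implementation), here is a minimal sketch that fits a 16-entry (4-bit) codebook to a weight tensor with plain k-means and stores per-weight indices into it; all function names are illustrative.
```python
# Minimal codebook (lookup-table) quantization sketch for a single weight tensor.
# A 16-entry codebook corresponds to 4-bit indices. Function names are illustrative.
import torch

def fit_codebook(w: torch.Tensor, num_codes: int = 16, iters: int = 10) -> torch.Tensor:
    """Fit a 1-D codebook to the weight values with plain k-means."""
    flat = w.flatten().float()
    # Initialize codes at evenly spaced quantiles of the weight distribution.
    codebook = torch.quantile(flat, torch.linspace(0, 1, num_codes))
    for _ in range(iters):
        # Assign every weight to its nearest code.
        idx = torch.argmin((flat[:, None] - codebook[None, :]).abs(), dim=1)
        # Move each code to the mean of the weights assigned to it.
        for c in range(num_codes):
            members = flat[idx == c]
            if members.numel() > 0:
                codebook[c] = members.mean()
    return codebook

def codebook_quantize(w: torch.Tensor, codebook: torch.Tensor):
    """Return per-weight indices into the codebook, shaped like the weight tensor."""
    idx = torch.argmin((w.flatten().float()[:, None] - codebook[None, :]).abs(), dim=1)
    return idx.to(torch.uint8).reshape(w.shape)

def codebook_dequantize(idx: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """Look dequantized values up in the codebook."""
    return codebook[idx.long()]

w = torch.randn(256, 256)
cb = fit_codebook(w)
idx = codebook_quantize(w, cb)
w_hat = codebook_dequantize(idx, cb)
print((w - w_hat).abs().mean())  # mean reconstruction error
```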
-
**Describe the bug**
When I use llm-compressor to quantize a LLaVA model, it fails right at the beginning. (Unrecognized configuration class: 'transformers.models.llava.configuration_llava.LlavaConfig'…
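A hedged reproduction sketch of that error (not llm-compressor's own code): LlavaConfig is not registered with the causal-LM auto class, so loading it through AutoModelForCausalLM raises the "Unrecognized configuration class" error; the model id is illustrative.
```python
# Reproduction sketch of the "Unrecognized configuration class ... LlavaConfig" error.
# "llava-hf/llava-1.5-7b-hf" is an illustrative model id.
from transformers import AutoModelForCausalLM

# LlavaConfig is not in the AutoModelForCausalLM mapping, so this raises ValueError.
model = AutoModelForCausalLM.from_pretrained("llava-hf/llava-1.5-7b-hf")

# Loading through the multimodal class works instead:
# from transformers import LlavaForConditionalGeneration
# model = LlavaForConditionalGeneration.from_pretrained("llava-hf/llava-1.5-7b-hf")
```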
-
# Quantization Impact on Model Accuracy | Slightwind
Mistral-7B’s performance on 5-shot MMLU. If you are not interested in the testing details, just look at the summary table given below.
Overview: the performance of the quantized and non-quantized versions of the Mistral-7B-v0.1 model on 5-shot MMLU:
Quant Type Compute D…
-
### Search before asking
- [X] I have searched the YOLOv5 [issues](https://github.com/ultralytics/yolov5/issues) and found no similar bug report.
### YOLOv5 Component
Export
### Bug
Hello
When …
-
### 🚀 The feature, motivation and pitch
I am working with the BitsAndBytes quantization scheme for large models. Quantization is very smooth when using transformers, but the inference speed is sti…
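For reference, a minimal sketch of the "smooth with transformers" path mentioned above: 4-bit BitsAndBytes loading through transformers. The model id and generation settings are illustrative.
```python
# Minimal sketch: 4-bit BitsAndBytes quantization via transformers.
# Model id and settings below are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",   # illustrative model id
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```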