-
### Describe the issue
The preprocess step for quantization does not work with the latest onnxruntime version:
```shell
python -m onnxruntime.quantization.preprocess --input image_resize.onnx --outp…
```
maaft updated 3 weeks ago
-
### Your current environment
vllm==0.6.3.post1
### Model Input Dumps
```bash
ValueError: Weight input_size_per_partition = 10944 is not divisible by min_thread_k = 128. Consider reducing tensor_pa…
```
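For context, a minimal sketch of the constraint behind this error, assuming the quantized kernel requires each tensor-parallel shard of the weight's input dimension to be a multiple of `min_thread_k` (the function and variable names below are illustrative, not vLLM's internals):

```python
# Illustrative check (not vLLM's actual code): the quantized kernel requires
# the per-GPU shard of the weight's input dimension to be a multiple of
# min_thread_k, reported as 128 in the traceback above.

MIN_THREAD_K = 128

def partition_is_valid(input_size: int, tensor_parallel_size: int) -> bool:
    """True if each shard's input size is a multiple of MIN_THREAD_K."""
    if input_size % tensor_parallel_size != 0:
        return False
    return (input_size // tensor_parallel_size) % MIN_THREAD_K == 0

# The failing per-partition size from the error message:
print(10944 % MIN_THREAD_K)            # → 64, i.e. not divisible
print(partition_is_valid(10944, 1))    # → False
```

Changing `tensor_parallel_size` changes the per-partition size, which is why the error suggests reducing it; whether any particular setting works depends on the model's actual weight shapes.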
-
**Describe the bug**
When I run the example from examples/python/awq-quantized-model.md, but swap out phi-3 for llama-3.2-3b, I get an error message stating that `AttributeError: 'NoneType' objec…
-
Even with a model that is less than 250 KB in size, I still get the onnx_data file after quantization.
https://github.com/onnx/neural-compressor/blob/aabbf967cf7ea91c078c28c7b4dab043add5257b/onnx_neural…
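For reference, ONNX normally only needs a separate external-data file once the serialized protobuf would approach its 2 GB message limit, which is why a tiny model producing one is surprising. A sketch of that size gate (the helper name and exact threshold handling are assumptions, not neural-compressor's code):

```python
# Hypothetical helper (not neural-compressor's API): external tensor data is
# only required once the serialized model approaches protobuf's 2 GB limit.

PROTOBUF_LIMIT_BYTES = 2 * 1024**3  # hard cap on a single protobuf message

def needs_external_data(model_size_bytes: int) -> bool:
    """A 250 KB model is far below the limit and can stay in one .onnx file."""
    return model_size_bytes > PROTOBUF_LIMIT_BYTES

print(needs_external_data(250 * 1024))   # → False
```

If the linked code writes the onnx_data file unconditionally, it is skipping a check like this rather than exceeding the limit.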
-
### Your current environment
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Debia…
-
### Model Series
Qwen2.5
### What are the models used?
Qwen/Qwen2.5-1.5B-Instruct
### What is the scenario where the problem happened?
[inference] with [vllm]
### Is this a known issue?
- …
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussion…
-
### Description
In light of optimizing disk usage for KNN vector searches in Lucene, I propose considering a new KnnVectorsFormat class in Lucene that handles only quantized vectors, eliminating the …
-
Similar to affine quantization, we can implement codebook (lookup-table) based quantization, another popular type of quantization, especially at lower bit widths like 4 bits or below (used in ht…
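A minimal sketch of the idea, assuming a uniform 16-entry codebook (all names here are illustrative, not any library's API): each weight stores only a 4-bit index into a lookup table, and dequantization is a table lookup rather than an affine scale/zero-point computation.

```python
# Illustrative 4-bit codebook / lookup-table quantization.

def build_uniform_codebook(lo: float, hi: float, bits: int = 4) -> list[float]:
    """Evenly spaced codebook; 4 bits gives 16 representable values."""
    n = 1 << bits
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def quantize(weights, codebook):
    """Map each weight to the index of its nearest codebook entry."""
    return [min(range(len(codebook)), key=lambda i: abs(w - codebook[i]))
            for w in weights]

def dequantize(indices, codebook):
    """Dequantization is just a table lookup."""
    return [codebook[i] for i in indices]

codebook = build_uniform_codebook(-1.0, 1.0, bits=4)
idx = quantize([-0.97, 0.02, 0.51], codebook)   # 4-bit indices in [0, 15]
restored = dequantize(idx, codebook)
print(idx, restored)
```

Real schemes typically learn the codebook entries (e.g. by clustering the weights) instead of spacing them uniformly, but the storage and lookup mechanics are the same.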
-
What quantization algorithm was used in the unsloth/Llama-3.2-1B-bnb-4bit model (https://huggingface.co/docs/transformers/main/en/quantization/overview)? Is it int4_awq or int4_weightonly?