-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar feature requests.
### Description
1.58-bit quantization i…
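(For context, 1.58 bits per weight corresponds to ternary values {-1, 0, +1}, since log2 3 ≈ 1.58. Below is a minimal sketch of the absmean ternarization described in the BitNet b1.58 paper; the function name is illustrative and not part of YOLOv8.)

```python
import torch

def ternarize(w: torch.Tensor, eps: float = 1e-5):
    """Map a weight tensor to {-1, 0, +1} plus a single absmean
    scale (the BitNet b1.58 recipe); dequantize as w_q * gamma."""
    gamma = w.abs().mean()
    w_q = torch.clamp((w / (gamma + eps)).round(), -1, 1)
    return w_q, gamma
```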
-
### 🚀 The feature, motivation and pitch
# Summary
We would like to support a 4-bit KV cache for the decoding phase. The purpose of this feature is to reduce the GPU memory usage of the KV cache wh…
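(To make the idea concrete, here is a minimal sketch of per-group asymmetric int4 quantization of a cached K/V tensor. The group size, grouping axis, and uint8 storage are illustrative assumptions, not the proposed design.)

```python
import torch

def quantize_kv_int4(x: torch.Tensor, group_size: int = 64):
    """Asymmetric 4-bit quantization along the last dim, one
    (scale, zero-point) pair per group of `group_size` values.
    Assumes the last dim is divisible by `group_size`."""
    g = x.reshape(-1, group_size)
    xmin = g.min(dim=-1, keepdim=True).values
    xmax = g.max(dim=-1, keepdim=True).values
    scale = (xmax - xmin).clamp(min=1e-8) / 15.0  # int4 codes span 0..15
    zero = (-xmin / scale).round()
    q = ((g / scale) + zero).round().clamp(0, 15).to(torch.uint8)
    # q holds one nibble per byte here; packing two values per byte
    # (for the actual 4x memory saving) would be a further step.
    return q, scale, zero

def dequantize_kv_int4(q, scale, zero, shape):
    return ((q.float() - zero) * scale).reshape(shape)
```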
-
### Motivation
In code-llama's deployment tutorial, the quantization chapter remains to be done. When will this feature be finished?
### Related resources
_No response_
### Additional context
_No respon…
-
Please suggest some sources. I have tried several, but nothing works for me.
-
Hi,
I understand that you currently quantize the model weights in a **per-row** fashion. Can you extend QuIP# to **per-group** granularity? Can you elaborate on why or why not?
Thanks
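(To make the granularity distinction concrete, here is a sketch contrasting per-row and per-group absmax scales for a weight matrix; this is illustrative and not QuIP#'s actual code.)

```python
import torch

def scales_per_row(w: torch.Tensor) -> torch.Tensor:
    """One absmax scale per output row: minimal metadata, but a
    single outlier stretches the range of its entire row."""
    return w.abs().amax(dim=1, keepdim=True)

def scales_per_group(w: torch.Tensor, group_size: int = 128) -> torch.Tensor:
    """One absmax scale per `group_size` contiguous weights within a
    row: tighter ranges at the cost of more scale metadata.
    Assumes the column count divides evenly by `group_size`."""
    rows, cols = w.shape
    g = w.reshape(rows, cols // group_size, group_size)
    return g.abs().amax(dim=-1, keepdim=True)
```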
-
TL;DR: can we get a way to bypass calibration/measurement and save a 'calibration.json'? Not so much to produce better models as to patch/hack them.
Does this belong in *issues*? I think at least a…
-
# Feature Description
Please provide a detailed written description of what you were trying to do, and what you expected `llama.cpp` to do as an enhancement.
# Motivation
It sounds like it's …
-
Hi, I get an error when I run `vectree.py`.
> ================== Print Info ==================
> Input_feats_shape: torch.Size([1554770, 62])
> VQ_feats_shape: torch.Size([1554770, 27])
> SH_degree: …
-
Is there any method to convert Griffin models to GGUF?
I want to quantize this model to the q4_K type.
Any help is appreciated.
Thanks
-
The Percentile Optimizer is a commonly used calibration method aimed at activations.
We first need to make QModuleMixin support this optimizer for activations.
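For reference, here is a minimal sketch of what a percentile observer typically does during calibration; the class and method names are illustrative, not the QModuleMixin API.

```python
import torch

class PercentileObserver:
    """Accumulates activation magnitudes over calibration batches and
    derives the clipping range from a high percentile rather than the
    absolute max, so rare outliers do not inflate the scale."""

    def __init__(self, percentile: float = 99.99):
        self.percentile = percentile
        self.samples = []

    def observe(self, x: torch.Tensor) -> None:
        self.samples.append(x.detach().abs().flatten())

    def clip_value(self) -> float:
        # For long calibration runs, subsampling here keeps memory bounded.
        values = torch.cat(self.samples).float()
        return torch.quantile(values, self.percentile / 100.0).item()
```

Clipping at, say, the 99.99th percentile trades a small clipping error on rare outliers for a much finer quantization step over the bulk of the distribution.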