-
Similar to #1252, do we have any plans for supporting V100? For now I can see that the places that need to be modified are the ldmatrix instruction and m16n8k16; as an example, we may need to load the matrix man…
-
### Feature request
There is too much boilerplate; a template that resolves loading, quantization, and device would help.
E.g.:
device: auto -> torch.cuda.is_available() -> cuda or mps
dtype: float32 -> float32, no q…
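A minimal sketch of what such a resolver could look like, assuming `torch`; the helper names and the exact fallback order (cuda, then mps, then cpu) are my own, not an existing API:

```python
import torch


def resolve_device(device: str = "auto") -> torch.device:
    """Hypothetical helper: map an 'auto' setting to a concrete device."""
    if device != "auto":
        return torch.device(device)
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


def resolve_dtype(dtype: str = "float32") -> torch.dtype:
    """Hypothetical helper: 'float32' means full precision, i.e. no quantization."""
    return {"float32": torch.float32,
            "float16": torch.float16,
            "bfloat16": torch.bfloat16}[dtype]


print(resolve_device("auto"), resolve_dtype("float32"))
```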
-
### Is there an existing issue for this problem?
- [X] I have searched the existing issues
### Operating system
Windows
### GPU vendor
Nvidia (CUDA)
### GPU model
RTX 4090
### GPU VRAM
24 GB
#…
-
There are several experiments being done with this repo to understand and evaluate the effects of quantization on the `llama2.c` models.
It is a great test-bed to analyze the effects of varying app…
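As a flavor of the kind of measurement such experiments involve, here is a minimal, generic sketch of symmetric per-tensor int8 weight quantization and its reconstruction error; it is not the repo's actual quantization scheme, just an illustration of the idea:

```python
import numpy as np


def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale


# Toy experiment: how much error does int8 introduce on a random weight matrix?
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = q.astype(np.float32) * scale
print(f"max abs error: {np.abs(w - w_hat).max():.5f} (scale={scale:.5f})")
```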
-
### 💡 Your Question
I have followed exactly the same steps for model training followed by PTQ and QAT as described in the official super-gradients notebook:
https://github.com/Deci-AI/super-gradients/blob…
-
### The quantization format
Hi all,
We have recently designed and open-sourced a new method for Vector Quantization called Vector Post-Training Quantization (VPTQ). Our work is available at [VPTQ…
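For readers new to the idea, here is a minimal, generic sketch of vector quantization of a weight matrix (a small k-means codebook plus per-vector indices); it illustrates only the basic concept, not the VPTQ algorithm itself:

```python
import numpy as np


def vector_quantize(w: np.ndarray, vec_len: int = 4, k: int = 256, iters: int = 20):
    """Group weight entries into length-`vec_len` vectors and replace each with
    the nearest centroid from a `k`-entry codebook (plain k-means)."""
    vecs = w.reshape(-1, vec_len)
    rng = np.random.default_rng(0)
    codebook = vecs[rng.choice(len(vecs), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distances of every vector to every centroid.
        dists = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for c in range(k):
            members = vecs[assign == c]
            if len(members):
                codebook[c] = members.mean(0)
    return codebook, assign  # storage: k * vec_len floats + one index per vector


w = np.random.randn(64, 64).astype(np.float32)
codebook, idx = vector_quantize(w)
w_hat = codebook[idx].reshape(w.shape)
print("reconstruction MSE:", float(((w - w_hat) ** 2).mean()))
```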
-
### SDK
Python
### Description
- From https://huggingface.co/blog/embedding-quantization: _Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval_
- Also from https…
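A minimal NumPy sketch of the two schemes the blog post describes, binary and int8 (scalar) quantization of embeddings; the calibration-range choice below is an assumption made purely for illustration:

```python
import numpy as np


def binary_quantize(emb: np.ndarray) -> np.ndarray:
    """Binary quantization: keep only the sign of each dimension, bit-packed
    (32x smaller than float32); retrieval then uses Hamming distance."""
    return np.packbits(emb > 0, axis=-1)


def int8_quantize(emb: np.ndarray, calib: np.ndarray):
    """Scalar (int8) quantization: map each dimension's calibration range onto [-128, 127]."""
    lo = calib.min(axis=0)
    scale = (calib.max(axis=0) - lo) / 255.0
    q = np.clip(np.round((emb - lo) / scale) - 128, -128, 127).astype(np.int8)
    return q, lo, scale


emb = np.random.randn(1000, 384).astype(np.float32)   # stand-in for real embeddings
packed = binary_quantize(emb)                          # shape (1000, 48), dtype uint8
q, lo, scale = int8_quantize(emb, calib=emb)           # corpus doubles as calibration set here
```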
-
### 🚀 The feature, motivation and pitch
I am trying to implement an eager mode for PT2E quantization on CPU. Currently, PT2E quantization on CPU is lowered to Inductor via `torch.compile`. The current…
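For context, here is a sketch of the current non-eager PT2E flow on CPU as I understand it; the capture and quantizer entry points have moved between PyTorch releases, so treat the exact names below as approximate:

```python
import torch
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.x86_inductor_quantizer import (
    X86InductorQuantizer,
    get_default_x86_inductor_quantization_config,
)

model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU()).eval()
example_inputs = (torch.randn(1, 16),)

# Capture the model into a graph (this API has moved between releases).
exported = torch.export.export_for_training(model, example_inputs).module()

quantizer = X86InductorQuantizer()
quantizer.set_global(get_default_x86_inductor_quantization_config())

prepared = prepare_pt2e(exported, quantizer)
prepared(*example_inputs)            # calibration pass
quantized = convert_pt2e(prepared)

# Today the quantized graph is lowered to Inductor via torch.compile;
# the request here is for a way to run it eagerly instead.
compiled = torch.compile(quantized)
print(compiled(*example_inputs).shape)
```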
-
We currently only support 4-bit quantization via BitsAndBytes. We should support other options such as 8-bit and (potentially) 6-bit.
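For reference, a sketch of what 4-bit vs. 8-bit BitsAndBytes loading looks like on the Hugging Face Transformers side; how this would map onto this project's loader is exactly the open question, and the model name is purely illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit (supported today) vs. 8-bit (requested) BitsAndBytes configurations.
bnb_4bit = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
bnb_8bit = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",              # small model, purely illustrative
    quantization_config=bnb_8bit,
    device_map="auto",
)
```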
-
**Describe the bug**
When using the preset W8A8 recipe from llm-compressor, the resulting model's config.json fails validation when loaded by HF Transformers. This is a dev version of Tr…