-
Dear BigCode team, what a wonderful project!
I am writing this feature request for an official implementation of GGUF quantization for Starcoder2, to enhance its adoption by coding platforms and APIs…
-
### Description
Having copy-on-write segments lends itself nicely to quantization. I propose we add a new "scalar" or "linear" quantization codec. This will be a simple quantization codec provided …
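As a rough illustration of what such a linear (affine) codec could look like, here is a minimal NumPy sketch with a per-array scale and zero point; the `quantize`/`dequantize` names are illustrative, not an existing API:

```python
import numpy as np

def quantize(arr, dtype=np.int8):
    """Affine-quantize a float array to an integer dtype.

    Returns the quantized array plus the (scale, zero_point) needed to decode.
    """
    info = np.iinfo(dtype)
    lo, hi = float(arr.min()), float(arr.max())
    # Guard against a constant array, where hi == lo would give scale 0.
    scale = (hi - lo) / (info.max - info.min) or 1.0
    zero_point = info.min - round(lo / scale)
    q = np.clip(np.round(arr / scale) + zero_point, info.min, info.max)
    return q.astype(dtype), scale, zero_point

def dequantize(q, scale, zero_point):
    """Decode back to float32; lossy, error is on the order of scale / 2."""
    return (q.astype(np.float32) - zero_point) * scale

x = np.linspace(-1.0, 1.0, 16, dtype=np.float32)
q, s, zp = quantize(x)
x_hat = dequantize(q, s, zp)
assert np.max(np.abs(x - x_hat)) <= s  # round-trip error within one step
```

Since the codec only needs to store `(q, scale, zero_point)`, the copy-on-write segment can hold the integer buffer while the two decode parameters live in the codec metadata.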
-
## 🌱 Describe your Feature Request
I am requesting the incorporation of a BitNet layer in CoreML, similar to the PyTorch implementation by Kyegomez (https://github.com/kyegomez/BitNet). A BitNet l…
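For context, the core of such a layer can be sketched in a few lines of NumPy; the `bitlinear` name and per-tensor `beta` scaling below are illustrative only (the full BitNet layer also quantizes activations and applies normalization, which this sketch omits):

```python
import numpy as np

def bitlinear(x, w):
    """Minimal BitLinear-style forward pass (a sketch, not the CoreML API).

    Weights are binarized to {-1, +1} via sign, with a per-tensor scaling
    factor beta = mean(|w|) applied after the matmul.
    """
    beta = np.abs(w).mean()              # per-tensor scale
    w_bin = np.where(w >= 0, 1.0, -1.0)  # 1-bit weights
    return (x @ w_bin.T) * beta          # rescale the integer-like matmul

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8)).astype(np.float32)
w = rng.standard_normal((4, 8)).astype(np.float32)
y = bitlinear(x, w)
assert y.shape == (2, 4)
```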
-
Currently, Paddle ERNIE fp32 inference performance on CPU is as below:
single thread: 251.464 ms
20 threads: 29.8818 ms
Our goal is to prove that with a real INT8 kernel, ERNIE can get a performance gain.…
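The arithmetic behind such an INT8 kernel can be sketched as follows (a NumPy emulation under assumed symmetric per-tensor scales; the real speedup comes from vectorized int8 instructions, which NumPy does not expose):

```python
import numpy as np

def int8_matmul(x, w):
    """Emulate an INT8 kernel: quantize both operands to int8,
    accumulate in int32, then rescale back to float32."""
    sx = np.abs(x).max() / 127.0
    sw = np.abs(w).max() / 127.0
    qx = np.clip(np.round(x / sx), -127, 127).astype(np.int8)
    qw = np.clip(np.round(w / sw), -127, 127).astype(np.int8)
    acc = qx.astype(np.int32) @ qw.astype(np.int32).T  # int32 accumulation
    return acc.astype(np.float32) * (sx * sw)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 64)).astype(np.float32)
w = rng.standard_normal((16, 64)).astype(np.float32)
ref = x @ w.T
out = int8_matmul(x, w)
# Quantization error should stay small relative to the fp32 result.
assert np.abs(out - ref).max() / np.abs(ref).max() < 0.05
```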
-
### What is the issue?
I carefully read the README documentation and found that something went wrong:
time=2024-05-20T10:06:02.688+08:00 level=INFO source=server.go:320 msg…
-
Run
```
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -U mlx-lm
python3 -c "import mlx_lm; print(mlx_lm.__version__)"
MODEL=mlx-community/gemma-2-27b…
-
![image](https://github.com/unslothai/unsloth/assets/1203957/969e6356-6e32-494e-9dc5-7cef6b261a6d)
/usr/local/lib/python3.10/dist-packages/unsloth/save.py in save_to_ggu…
-
At @onefact we have been using WASM, but this won't work for the encoder-only or encoder-decoder models I've built (e.g. http://arxiv.org/abs/1904.05342). That's because the WASM VM is for the CPU (ha…
-
Dear all,
I have noticed that the quantised weights of the QLinear module are QTensors with a scale parameter of dimension out_features. Should it not be a scalar value in the case of linear modules (p…
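For what it's worth, a scale of dimension out_features usually means per-output-channel quantization, which gives a lower round-trip error than a single scalar scale when weight rows differ in magnitude. A quick NumPy comparison (names illustrative, not the QLinear internals):

```python
import numpy as np

rng = np.random.default_rng(0)
# Weight rows with very different magnitudes, as often happens in practice.
w = rng.standard_normal((8, 32)).astype(np.float32)
w *= np.logspace(-2, 0, 8, dtype=np.float32)[:, None]

def quant_error(w, scale):
    """Mean symmetric-int8 round-trip error for a scalar or per-row scale."""
    q = np.clip(np.round(w / scale), -127, 127)
    return np.abs(q * scale - w).mean()

per_tensor = np.abs(w).max() / 127.0                        # one scalar scale
per_channel = np.abs(w).max(axis=1, keepdims=True) / 127.0  # (out_features, 1)
assert quant_error(w, per_channel) < quant_error(w, per_tensor)
```

So the out_features-sized scale is likely intentional rather than a bug, though the docs could state this explicitly.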
-
### System Info
- `transformers` version: 4.36.0
- Platform: Linux-5.15.0-1041-aws-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.20.3
- Safetensors version: 0.4.1…