quantizing Search Results

1000+ results
for quantizing

Best match


apache/lucene #13350

Significant drop in recall for int7 scalar quantization usin…

### Description While running some benchmarking tests using [opensearch-benchmark](https://github.com/opensearch-project/opensearch-benchmark) on int8 scalar quantization using some of the standard…

naveentatikonda updated 2 months ago
18
huggingface/optimum-quanto #131

Introduce optimizers

By default quanto implements a simple absmax algorithm to evaluate the scale and zero-point to be used when quantizing QTensor and QBitsTensor. A refactoring is required in order to allow different al…

dacorvo updated 5 months ago
4
ollama/ollama #3213

open /home/house365ai/xxm/model/Qwen1.5-14B-Chat/tokenizer.m…

### What model would you like? my Modelfile FROM /home/house365ai/xxm/model/Qwen1.5-14B-Chat ollama create Qwen1.5-14B-Chat -f Modelfile how solve it?

njhouse365 updated 6 months ago
2
nbasyl/OFQ #2

Is there any analysis on the oscillation problem on the acti…

Nice work in the paper. Besides: 1) Is there any analysis on the oscillation problem on the activation quantization? Since activation 2bit quantization is harder than weight quantization a lot, it i…

brisker updated 5 months ago
1
opengeospatial/ideas #59

Global tiling grid approximating equal-area while maintainin…

Based on recommendations from Testbed 13 Vector Tiles ER ( http://docs.opengeospatial.org/per/17-041.pdf ): A global tiling grid combining the advantages of approximating equal-area while maintaini…

jerstlouis updated 3 years ago
15
vllm-project/vllm #744

Add Support for Quantized Model in VLLM - $500 Reward

We need to add support for the quantized model in the VLLM project. We need this to run a llama quantized model via vllm. This involves implementing quantization techniques to optimize memory usage a…

petrasS3 updated 1 month ago
35
EricLBuehler/mistral.rs #352

dolphin-2.9-mixtral-8x22b.Q8_0.gguf "Error: cannot find tens…

I attempted to run `mistralrs-server` to serve my local copy of `dolphin-2.9-mixtral-8x22b.Q8_0.gguf`. This file isn't available on huggingface because it's broken into four parts [here](https://hugg…

psyv282j9d updated 2 months ago
11
ggerganov/llama.cpp #2047

Question: How to access feature vector of the intermediate l…

# Prerequisites - [Yes] I am running the latest code. Development is very rapid so there are no tagged versions as of now. - [Yes] I carefully followed the [README.md](https://github.com/ggerganov…

sohta94 updated 5 months ago
6
modelscope/ms-swift #907

qwen-32B: error during int4 quantization after self-cognition training: assert model_name is not None and …

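**Problem description:** The model exports and runs inference normally, but quantization fails with an error, probably because of the dataset. **Command:** CUDA_VISIBLE_DEVICES=0,1 swift export \ --ckpt_dir "/home/user/sdb1/sft-output/qwen1half-32b-chat/v4-20240510-064821/checkpoint-50/" \…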

xudongLi-Alex updated 4 months ago
2
apache/lucene #12497

Add Scalar Quantization codec for Vectors

### Description Having copy-on-write segments lends itself nicely with quantization. I propose we add a new "scalar" or "linear" quantization codec. This will be a simple quantization codec provided …

benwtrent updated 4 months ago
5
