-
This quantisation is missing right now...
-
In the Func menu, Chord section.
Let's add a new knob for "delay", i.e. a delay between the notes of the chord.
When I try to play some octaves or fifths manually and I like them, I often activate the Chord function, …
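A minimal sketch of the requested behaviour (the knob name and millisecond unit are assumptions, not the device's actual parameters): each note of the chord is offset by a multiple of the delay value, like a strum.

```python
# Hypothetical sketch of a "delay" knob for the Chord function:
# note i of the chord starts i * delay_ms after the trigger,
# so a chord with delay > 0 plays as a strum instead of a block chord.

def strummed_onsets(note_count, delay_ms):
    """Return per-note onset times (in ms) for a chord with a delay knob."""
    return [i * delay_ms for i in range(note_count)]

# A triad with the knob at 30 ms: onsets at 0, 30 and 60 ms.
print(strummed_onsets(3, 30))  # → [0, 30, 60]
```

With the knob at 0 the current behaviour (all notes simultaneous) is preserved.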
-
Hello, I am very impressed with your great work. I am not quite familiar with CUDA programming. Could you please give me instructions on how to call the pack_2bit_u8 of your optimized CUDA…
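For reference, here is what a pack_2bit_u8-style routine does conceptually, sketched in plain Python (this is an illustration of the packing scheme, not the library's actual API or signature, and the bit order is an assumption): four 2-bit values are packed into each uint8.

```python
# Conceptual sketch of 2-bit packing (not the optimized CUDA kernel):
# every group of four values in 0..3 becomes one byte.

def pack_2bit_u8(vals):
    """Pack groups of four 2-bit integers (0..3) into single bytes."""
    assert len(vals) % 4 == 0, "length must be a multiple of 4"
    out = []
    for i in range(0, len(vals), 4):
        b = 0
        for j in range(4):
            # value j occupies bits 2j..2j+1 (little-end-first; bit order assumed)
            b |= (vals[i + j] & 0b11) << (j * 2)
        out.append(b)
    return out

# [1, 2, 3, 0] → 0b00_11_10_01 = 57
print(pack_2bit_u8([1, 2, 3, 0]))  # → [57]
```

The CUDA version performs the same transformation, just with one thread handling a group of elements in parallel.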
-
I'm looking to quantize the llava model from fp16.gguf. When I try to quantize llava after compiling llamafile,
`app/bin/llava-quantize llava-v1.5-7B-GGUF/llava-v1.5-7b-mmproj-f16.gguf llava-v1.5-7B-…
-
When I try to fine-tune phi-3 (Phi-3-mini-128k-instruct-8bit) I get the same issue I previously had for Mixtral: a NaN loss.
```
Trainable parameters: 0.042% (1.573M/3750.282M)
Loading datasets
Tr…
```
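A NaN loss this early usually points to numeric overflow (e.g. in fp16) or a too-high learning rate rather than the model itself. A generic guard (my own sketch, not part of the training script) makes the failure surface at the first bad step:

```python
import math

# Generic NaN guard for a training loop (illustrative, not the
# fine-tuning framework's code): fail fast at the first non-finite
# loss so the offending step/batch can be inspected.

def check_loss(loss, step):
    """Raise if the loss is NaN or infinite; otherwise pass it through."""
    if math.isnan(loss) or math.isinf(loss):
        raise ValueError(f"non-finite loss {loss} at step {step}")
    return loss

print(check_loss(2.0, 1))  # → 2.0
```

Calling this on each step's loss narrows down whether the NaN appears immediately (bad init/quantized weights) or after a few steps (learning-rate/overflow issue).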
-
Any invocation of `python -m sillm.chat model` seems much slower on my machine than in the reference video: more than a minute to get to the prompt, and maybe 1-2 TPM in the response.
I have tried si…
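Assuming TPM means tokens per minute here, a trivial helper (my own, not part of SiLLM) makes the reported rate concrete and easy to re-measure:

```python
# Hypothetical helper (not part of SiLLM): generation rate in tokens
# per minute, from a token count and the elapsed wall-clock time.

def tokens_per_minute(token_count, elapsed_seconds):
    """Tokens generated per minute of wall-clock time."""
    return token_count * 60.0 / elapsed_seconds

# e.g. 10 tokens in 5 minutes of generation:
print(tokens_per_minute(10, 300.0))  # → 2.0
```

Timing the response this way gives a number that can be compared directly against the reference video.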
-
Hey HQQ team! Happy New Year!
I actually found out about HQQ from some Reddit posts about Mixtral - and had a look at https://github.com/mobiusml/hqq/issues/2 which was super insightful! Quantizing…
-
I am not able to quantize these new Llama-3 models:
```
AWQ: 3%|███████▊ …
-
Hello! I'm wondering if it's possible to load a `model` and a `tokenizer`, and then pass the two of them to `vllm.LLM()` to create an object. The reason I am trying to create the object this way (inst…
th789 updated 5 months ago
-
When passing mp.ing to `ActivationPOTInferableQuantizer`, the result sometimes goes to the max ml and sometimes to the min ml.
This is not consistent: I get different results in a Linux Docker container and on a Mac.
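One possible mechanism (an illustrative guess, not the actual `ActivationPOTInferableQuantizer` code): a power-of-two (POT) quantizer rounds its threshold to the nearest 2^k, and inputs near the midpoint between two powers of two (2^1.5 ≈ 2.83 between 2 and 4) can round either way when the floating-point pipeline differs between platforms.

```python
import math

# Illustrative POT rounding (not the library's implementation):
# round a positive threshold to the nearest power of two in log space.

def nearest_pot(x):
    """Round a positive threshold to the nearest 2^k."""
    k = round(math.log2(x))
    return 2.0 ** k

print(nearest_pot(3.9))  # → 4.0 (clearly closer to 4)
print(nearest_pot(2.2))  # → 2.0 (clearly closer to 2)
# Values near the 2↔4 midpoint (≈ 2.83) sit on a rounding boundary,
# where tiny float differences across platforms can flip the result.
```

If the observed min/max values straddle such a boundary, that would explain the Linux-Docker-vs-Mac divergence.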
### test cod…