-
Is it possible to perform `Quantization Aware Training` on Sentence Transformers, beyond [fp16 and bf16](https://github.com/huggingface/transformers/blob/main/src/transformers/training_args.py#L404-L4…
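For reference, a minimal sketch of the fp16/bf16 mixed-precision path that the linked flags control (assuming a recent `transformers`; the output directory is a placeholder, and this enables mixed precision rather than true QAT):
```
# Minimal sketch: the bf16/fp16 flags from transformers TrainingArguments.
# output_dir is a placeholder; this is mixed precision, not QAT.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",   # placeholder
    bf16=True,          # bfloat16 mixed precision (Ampere+ GPUs)
    # fp16=True,        # alternative: float16 mixed precision
)
```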
-
I find that the quantization losses are higher for GPT-J than for Llama, which seems to stay pretty low.
```
2023-06-20 19:05:19 INFO [auto_gptq.modeling._base] Quantizing attn.q_proj in layer 2/28...
…
```
-
Hi,
Have you tried quantizing Mamba? Do you plan on releasing quantized versions?
Can you share your thoughts on quantizing Mamba, given the sensitivity of the model's recurrent dynamics?
Thanks
-
Hello,
I have used your QAT model to quantize to different bitwidths, but I saw that the quantized weights were always FP values, even though they had been quantized (e.g., if I quantized to 4bit, then all my…
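For context, a minimal sketch (pure illustration, not tied to any particular repo) of why fake-quantized / QAT weights still show up as FP values: the quantize-dequantize round trip restricts the values to a low-bit grid but keeps the storage dtype floating point:
```
import torch

def fake_quantize(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Quantize-dequantize to a symmetric integer grid; output stays float."""
    qmax = 2 ** (bits - 1) - 1                                # 7 for 4-bit
    scale = w.abs().max() / qmax
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)  # integer grid
    return q * scale                                          # float storage

w = torch.randn(4, 4)
w_q = fake_quantize(w, bits=4)
print(w_q.dtype)  # torch.float32: values lie on a 4-bit grid, dtype is still FP
```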
-
Dear Authors,
Thanks for the great work.
After installing "ryzen-ai-1.2.0-20240726.msi", I can run on the NPU on the target platform.
However, there are some questions I would like to verify.
…
-
Package Version:
AutoAWQ: 0.2.5+cu118
torch: 2.3.1+cu118
transformers: 4.43.3
I was trying to quantize my finetuned llama3.1 405b (bf16) model to 4 bit using autoawq, following the instruction in t…
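For reference, the standard AutoAWQ 4-bit flow looks roughly like the sketch below (the model path, output path, and quant_config values are placeholders rather than anything taken from this issue):
```
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "path/to/finetuned-llama-3.1-405b"   # placeholder
quant_path = "llama-3.1-405b-awq"                 # placeholder output dir
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the bf16 checkpoint and its tokenizer
model = AutoAWQForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Run AWQ calibration and quantize the weights to 4 bit
model.quantize(tokenizer, quant_config=quant_config)

# Save the quantized model and tokenizer
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```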
-
I found that device memory usage keeps increasing when executing basic_quant_mix.py, and it raises OOM when the model has a large number of parameters. How can this be optimized? Thank you~
@Qcompiler
-
Hi,
First of all, many, many thanks for this device, it is amazing; thank you for this brick you have added to this massive construction set called Ableton.
Just one issue: is there a way …
-
I am getting a "float division by zero" error whenever I try to quantize Mixtral-related models with AutoGPTQ; here is my code:
```
from transformers import AutoTokenizer, TextGenerationPipeli…
```
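For comparison, a typical AutoGPTQ quantization flow (mirroring the library's README) would look roughly like this; the Mixtral model id, calibration sentence, and config values below are illustrative, not taken from the issue:
```
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "mistralai/Mixtral-8x7B-v0.1"   # illustrative model id
quantized_dir = "mixtral-gptq-4bit"        # illustrative output dir

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)

# A tiny calibration set (real runs should use more, representative text)
examples = [tokenizer("auto-gptq is an easy-to-use model quantization library.")]

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)
model.quantize(examples)
model.save_quantized(quantized_dir)
```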
-
**Describe the bug**
Quantization scales are defined to _always_ be positive in the [onnx documentation](https://iot-robotics.github.io/ONNXRuntime/docs/performance/quantization.html).
Creating a qd…
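For context, a minimal numeric sketch (pure illustration) of the affine quantization scheme those docs describe, where the scale comes out strictly positive:
```
import numpy as np

# Affine uint8 quantization: q = clip(round(x / scale) + zero_point, 0, 255).
# The scale derived below is positive whenever the input range is non-degenerate.
x = np.array([-1.0, 0.0, 0.5, 1.0], dtype=np.float32)
scale = (x.max() - x.min()) / 255.0                 # > 0
zero_point = int(round(-x.min() / scale))

q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
x_dq = (q.astype(np.float32) - zero_point) * scale  # dequantize back to float
print(q, x_dq)
```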