-
Where in the codebase might I find the basic arithmetic / steps for quantizing with NF4?
I’ve had trouble finding a clear definition of the math in existing tutorials, but based on what I see in th…
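The basic NF4 arithmetic is fairly compact: weights are split into blocks, each block is scaled by its absolute maximum into [-1, 1], and each value is snapped to the nearest of 16 fixed levels (the quantiles of a standard normal described in the QLoRA paper). A minimal sketch, assuming the level table below (copied from the bitsandbytes codebook; check the library source for the authoritative values):

```python
import numpy as np

# The 16 NF4 levels (normal-distribution quantiles normalized to [-1, 1]).
# Reproduced here for illustration; the bitsandbytes source is authoritative.
NF4_LEVELS = np.array([
    -1.0, -0.6961928, -0.52507305, -0.39491749,
    -0.28444138, -0.18477343, -0.09105004, 0.0,
    0.0795803, 0.1609302, 0.2461123, 0.33791524,
    0.44070983, 0.562617, 0.72295684, 1.0,
])

def nf4_quantize(block: np.ndarray):
    """Quantize one 1-D block of weights to NF4 indices plus a scale."""
    absmax = np.abs(block).max()           # per-block scale factor
    normalized = block / absmax            # map the block into [-1, 1]
    # nearest-neighbor lookup into the codebook -> one 4-bit index per weight
    idx = np.abs(normalized[:, None] - NF4_LEVELS[None, :]).argmin(axis=1)
    return idx.astype(np.uint8), absmax

def nf4_dequantize(idx: np.ndarray, absmax: float) -> np.ndarray:
    """Dequantize: codebook lookup, rescaled by the block's absmax."""
    return NF4_LEVELS[idx] * absmax

# usage: round-trip a small block of weights
w = np.random.randn(64).astype(np.float32) * 0.02
idx, absmax = nf4_quantize(w)
w_hat = nf4_dequantize(idx, absmax)
```

Real implementations also pack two 4-bit indices per byte and optionally quantize the `absmax` values themselves ("double quantization"), but the scale/lookup arithmetic above is the core of the scheme.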
-
**Describe the bug**
I'm doing transfer learning and would like to (at the end) quantize my model. The problem is that when I try to use the `quantize_model()` function (which is used successfully in…
-
I'm using the quantization script in `examples/quantization` and running into an issue quantizing Mistral 7B to int4_awq. Since Mistral 7B is bfloat16, I need to use the bfloat16 dtype in t…
-
## Description
When performing ResNet18 PTQ using TRT-modelopt, I encountered the following issue when compiling the model with TRT.
First off, I started with a pretrained resnet18 from torchvi…
-
I followed your example `auto_test` with my own depthwise separable CNN. After a few epochs of training, my Keras model has an accuracy of 98.12% on the MNIST test set. After quantization, the NNoM mode…
-
### Your current environment
pip3 install vllm==0.4.2 nvidia-ammo==0.7.1
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: …
-
Problem:
When running the get_feature_importance, it fails with the following error.
```
CatBoostError: /src/catboost/catboost/private/libs/algo/features_data_helpers.h:118: Internal CatBoost E…
-
Hey, I want to quantize my Qwen2 model, but it seems the files are not found even though it clones and installs llama.cpp correctly. When quantizing the model I get this:
```txt
python3: can't …
-
Currently, we don't apply QLoRA to either the output projection or the token embeddings. There's no great reason not to apply quantization to output projections; we simply don't do this due to limitations…
-
I tried to quantize a Llama model (Llama 13B) with SmoothQuant, and found that if I only quantize `LlamaDecoderLayer`, the accuracy does not drop even when directly quantizing weights and activations, bu…