-
### Before Asking
- [X] I have read the [README](https://github.com/meituan/YOLOv6/blob/main/README.md) carefully.
- [X] I want to train my custom dataset, and I have read the …
-
We plan to add QAT for LLMs to torchao (as mentioned in the original RFC here: https://github.com/pytorch-labs/ao/issues/47).
For this to run efficiently on the GPU, we'd need kernel support for W4A8…
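
For context, W4A8 means 4-bit weights with 8-bit activations. A minimal sketch of the numerics a QAT flow would simulate during training; this is a generic illustration, not torchao's actual API, and `fake_quantize`/`w4a8_linear` are made-up names:

```python
import torch

def fake_quantize(x: torch.Tensor, n_bits: int) -> torch.Tensor:
    """Symmetric per-tensor fake quantization with a straight-through estimator."""
    qmax = 2 ** (n_bits - 1) - 1                      # 127 for 8-bit, 7 for 4-bit
    scale = x.abs().max().clamp(min=1e-8) / qmax
    x_q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
    # Forward pass sees the quantized values; backward passes gradients through.
    return x + (x_q - x).detach()

def w4a8_linear(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    """Simulate a W4A8 matmul: 8-bit activations, 4-bit weights."""
    return fake_quantize(x, 8) @ fake_quantize(weight, 4).t()
```

The efficiency point above is that at inference time the weights stay int4 and the activations int8, so the matmul needs a dedicated mixed-precision kernel rather than the float simulation in this sketch.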
-
Hi,
I have tried running the **CohereForAI/aya-expanse-8b** model. I added the following code to your script
---------------------------------CODE CHANGE 1--------------------------------------…
-
May I ask whether the current project supports INT8 quantization? If so, how? Currently only FP16 and FP32 are supported, right?
-
As titled. cc @tridao @jayhshah
-
Currently, some quantized Hugging Face models save zero-points directly in an int4 datatype, like [Qwen/Qwen2-7B-Instruct-GPTQ-Int4](https://huggingface.co/Qwen/Qwen2-7B-Instruct-GPTQ-Int4) and [Qwen/Qwen2…
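
For reference, GPTQ-style checkpoints usually pack eight 4-bit values into each int32 word. A minimal sketch of unpacking such zero-points; the function name is illustrative and exact layouts vary between exporters:

```python
import torch

def unpack_int4_zeros(qzeros: torch.Tensor) -> torch.Tensor:
    """Unpack int32 words that each hold eight 4-bit zero-points into int8."""
    shifts = torch.arange(0, 32, 4, device=qzeros.device)   # [0, 4, ..., 28]
    nibbles = (qzeros.unsqueeze(-1) >> shifts) & 0xF        # one 4-bit value per slot
    return nibbles.flatten(-2).to(torch.int8)
```

Whether the unpacked values then need an additional offset depends on the packing convention of the tool that produced the checkpoint.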
-
### Question
I did some testing with 4-bit and 8-bit quantization and it doesn't seem to improve inference time at all; in fact, it seems to make it worse. All I did was simply set `load_in_8bit` or…
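
For reference, this presumably refers to the transformers + bitsandbytes loading path. A minimal sketch of that setup, with the model id as a placeholder:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_id)
# 8-bit loading via bitsandbytes is primarily a memory optimization;
# it does not necessarily reduce latency.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    load_in_8bit=True,
)
```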
-
Hi all, thanks a lot for the nice work introducing Vicuna and FastChat.
I am a beginner in NLP (so correct me if I am wrong) and use GPUs with limited memory, so I would like to train/infer with …
-
### 1. System information
- OS Platform and Distribution: Ubuntu 22.04.3 LTS
- TensorFlow installation: pip install tensorflow (virtual env: venv)
- TensorFlow library: pip package -> tensorflow…
-
When compressing a glTF model with [gltfpack](https://meshoptimizer.org/gltf/), it appears with incorrect scales in-game.
Model without quantization:
![image](https://user-images.githubusercontent.…