-
The file dbscan_kmeans.py does not contain code for quantizing speech units; in fact, it is empty.
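Since the file is empty, here is a minimal sketch of what a k-means speech-unit quantizer might look like, using scikit-learn. The feature dimension, unit count, and function names are placeholder assumptions, not the repository's actual design.
```
# Hypothetical sketch: quantize speech frame features (e.g. encoder
# embeddings of shape (n_frames, dim)) into discrete units with k-means.
import numpy as np
from sklearn.cluster import KMeans

def train_quantizer(features: np.ndarray, n_units: int = 100) -> KMeans:
    """Fit a k-means codebook on (n_frames, dim) speech features."""
    km = KMeans(n_clusters=n_units, n_init=10, random_state=0)
    km.fit(features)
    return km

def quantize(km: KMeans, features: np.ndarray) -> np.ndarray:
    """Map each frame to the index of its nearest centroid (a unit id)."""
    return km.predict(features)

# Example with random stand-in features: 10k frames of dim 768.
feats = np.random.randn(10_000, 768).astype(np.float32)
km = train_quantizer(feats, n_units=100)
units = quantize(km, feats)  # shape (10000,), integer unit ids
```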
-
Looking for a way to quantize YOLO weights (to 8 or 16 bits). My idea is to speed up calculations as much as possible without hurting accuracy too much, so I would like to experiment with that to…
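As a starting point, a hedged sketch of two PyTorch options; the `torch.hub` entry point is an assumption about which YOLO variant is used. Note that dynamic quantization only covers `nn.Linear` layers, so a conv-heavy YOLO backbone would need static post-training quantization for real INT8 speedups.
```
import copy
import torch

# Assumed model source for illustration; any loaded nn.Module works.
model = torch.hub.load("ultralytics/yolov5", "yolov5s")
model.eval()

# 8-bit: dynamic quantization of Linear layers only (CPU inference).
model_int8 = torch.quantization.quantize_dynamic(
    copy.deepcopy(model), {torch.nn.Linear}, dtype=torch.qint8
)

# 16-bit: cast weights to FP16 (needs a GPU for meaningful speedups).
model_fp16 = model.half().cuda()
```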
-
Currently, TensorRT-LLM requires that LoRA weights dtype match the base model dtype. The check is here:
https://github.com/NVIDIA/TensorRT-LLM/blob/9dbc5b38baba399c5517685ecc5b66f57a177a4c/cpp/tensor…
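A hedged workaround sketch (not an official TensorRT-LLM API): cast the LoRA adapter tensors to the base model's dtype before handing them to TensorRT-LLM, so the dtype check linked above passes. The file names and target dtype are assumptions for illustration.
```
import torch
from safetensors.torch import load_file, save_file

base_dtype = torch.float16  # assumed: dtype the base engine was built with
lora = load_file("adapter_model.safetensors")
# Cast only floating-point tensors; leave any integer metadata untouched.
lora_cast = {
    name: t.to(base_dtype) if t.is_floating_point() else t
    for name, t in lora.items()
}
save_file(lora_cast, "adapter_model_fp16.safetensors")
```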
-
### System Info
- GPU name: L40s
- CUDA: 12.1
```
Wed Jun 5 16:27:21 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.14 …
```
-
We currently only support continuous-value embeddings (a one-to-many FFN). We should try other approaches, such as quantizing the values.
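A minimal sketch of the quantizing alternative mentioned above: instead of projecting each continuous value through a FFN, bucket it into one of K bins and look the bin up in a learned embedding table. The bin count and value range are illustrative assumptions.
```
import torch
import torch.nn as nn

class QuantizedValueEmbedding(nn.Module):
    def __init__(self, n_bins: int = 256, dim: int = 64,
                 lo: float = -1.0, hi: float = 1.0):
        super().__init__()
        # n_bins - 1 boundaries partition [lo, hi] into n_bins buckets.
        self.register_buffer("boundaries", torch.linspace(lo, hi, n_bins - 1))
        self.table = nn.Embedding(n_bins, dim)

    def forward(self, values: torch.Tensor) -> torch.Tensor:
        # torch.bucketize maps each continuous value to its bin index.
        idx = torch.bucketize(values, self.boundaries)
        return self.table(idx)

emb = QuantizedValueEmbedding()
out = emb(torch.randn(8, 16))  # -> shape (8, 16, 64)
```
Out-of-range values simply clamp to the first or last bin, which is one design choice among several (another is learning the bin boundaries).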
-
I find that the quantisation losses are higher for GPT-J than for LLaMA, whose losses stay fairly low.
```
2023-06-20 19:05:19 INFO [auto_gptq.modeling._base] Quantizing attn.q_proj in layer 2/28...
…
```
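For reference, a minimal AutoGPTQ run of the kind that produces the log above; the model id and the tiny calibration set are assumptions. The per-layer losses it prints are what the GPT-J vs LLaMA comparison is based on.
```
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = "EleutherAI/gpt-j-6b"  # assumed; swap in a LLaMA checkpoint to compare
tok = AutoTokenizer.from_pretrained(model_id)
# A real run should use a few hundred calibration samples, not one.
examples = [tok("The quick brown fox jumps over the lazy dog.",
                return_tensors="pt")]

cfg = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)
model = AutoGPTQForCausalLM.from_pretrained(model_id, cfg)
model.quantize(examples)  # logs per-layer quantisation progress and loss
model.save_quantized("gpt-j-6b-4bit")
```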
-
### Description
Hello,
I have noticed that the examples (e.g. 'detect_objects_file' and 'classify_images_file') do not quantize the input tensor, read from a .rgb file, before running inference. I…
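A hedged sketch of the missing step using the tflite_runtime API: quantize the raw .rgb input with the scale/zero-point the model expects before invoking inference. The file names, the [0, 1] normalization, and the uint8 input assumption are all illustrative.
```
import numpy as np
import tflite_runtime.interpreter as tflite

interpreter = tflite.Interpreter(model_path="model_edgetpu.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]

# Raw RGB bytes, assumed already resized to the model's input shape.
h, w = inp["shape"][1], inp["shape"][2]
rgb = np.fromfile("image.rgb", dtype=np.uint8).reshape(1, h, w, 3)

scale, zero_point = inp["quantization"]
if scale > 0:  # the model expects quantized input
    x = rgb.astype(np.float32) / 255.0   # assumes float model wanted [0, 1]
    q = np.round(x / scale + zero_point)
    rgb = np.clip(q, 0, 255).astype(inp["dtype"])  # assumes uint8 input

interpreter.set_tensor(inp["index"], rgb)
interpreter.invoke()
```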
-
Hi,
@byshiue
I want to quantize a LLaMA model with a long sequence (120K+), but an OOM error is raised, so I hope to solve the OOM problem with multiple GPUs when quantizing the model in convert_checkpoint.py.…
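Not a convert_checkpoint.py flag as far as I know, but one general technique for this OOM is sharding the model across all visible GPUs with `device_map="auto"` (requires the accelerate package) during the calibration forward passes; the model id here is an assumption.
```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # assumed
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # shard layers across all visible GPUs
)
# Calibration forward passes now run with layers split over several GPUs,
# so a single 120K-token sequence no longer has to fit on one card.
```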
-
```
Some parameters are on the meta device device because they were offloaded to the cpu.
Quantizing weights: 0%| | 0/1771 [00:00
```
-
Hi @majianjia.
Thank you for your quick responses every time.
I ran an accuracy test of my model using your framework.
It achieved 99.2% with the Caffe framework, but in NNoM it dropped to 95%.
Is…
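A generic, framework-agnostic sketch (not NNoM's own tooling) for locating where such a quantization accuracy drop originates: compare float and quantized activations layer by layer on the same inputs. The `float_acts` / `quant_acts` lists are assumed to be per-layer numpy arrays captured from each pipeline.
```
import numpy as np

def per_layer_error(float_acts, quant_acts):
    """Print where quantization error grows along the network."""
    for i, (f, q) in enumerate(zip(float_acts, quant_acts)):
        err = np.abs(f - q.astype(np.float32))
        print(f"layer {i:2d}: max_err={err.max():.4f} "
              f"mean_err={err.mean():.6f}")
```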