-
Quantization causes more quality loss for smaller models than for large ones. Could the repository try 6-bit quantization with a group size of 128 for models like LLaMa-7B? This could be most useful for some of the smaller lang…
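For reference, a minimal sketch of symmetric 6-bit, group-size-128 weight quantization (illustrative PyTorch only; the function, shapes, and rounding scheme are assumptions, not this repository's implementation):

```python
import torch

def quantize_groupwise(weight: torch.Tensor, bits: int = 6, group_size: int = 128):
    """Symmetric group-wise fake quantization of a 2-D weight matrix."""
    out_features, in_features = weight.shape
    assert in_features % group_size == 0
    w = weight.reshape(out_features, in_features // group_size, group_size)
    qmax = 2 ** (bits - 1) - 1                                   # 31 for 6-bit symmetric
    scale = w.abs().amax(dim=-1, keepdim=True).clamp_min(1e-8) / qmax  # one scale per group
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    w_hat = (q * scale).reshape(out_features, in_features)       # dequantized copy
    return q.to(torch.int8), scale, w_hat

w = torch.randn(4096, 4096)
q, scale, w_hat = quantize_groupwise(w)
print((w - w_hat).abs().mean())  # mean absolute reconstruction error
```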
-
I have set the input SparseTensor using the same coordinate_manager:
```python
sinput1 = ME.SparseTensor(features=input_dict['sinput_s_F'].to(self.device),
                          coordinates=input_…
```
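For reference, the usual MinkowskiEngine pattern for making two sparse tensors share one coordinate manager looks roughly like this (a sketch with placeholder coordinates and features, not the actual inputs above):

```python
import torch
import MinkowskiEngine as ME

# Placeholder coordinates (batch index + xyz) and features.
coords = torch.IntTensor([[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]])
feats1 = torch.rand(3, 16)
feats2 = torch.rand(3, 16)

sinput1 = ME.SparseTensor(features=feats1, coordinates=coords)
# Reuse the first tensor's coordinate manager so both tensors share
# the same coordinate-to-index mapping and can be combined directly.
sinput2 = ME.SparseTensor(features=feats2, coordinates=coords,
                          coordinate_manager=sinput1.coordinate_manager)
```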
-
- Paper:
[S. Wiedemann et al., "DeepCABAC: A Universal Compression Algorithm for Deep Neural Networks," IEEE Journal of Selected Topics in Signal Processing, May 2020.](https://arxiv.org/pdf/1905.08…
-
Why is the clip range (-6, 60) in MobileBERT quantization?
https://github.com/google-research/google-research/blob/ba74f16e2e193f62133faf73a06e7f0792d42681/mobilebert/modeling.py#L1135
The comme…
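For illustration, clipping activations to that range before 8-bit fake quantization can be written as below (a sketch using TensorFlow's stock fake-quant op, not the exact code at the linked line; why (-6, 60) specifically was chosen is the question here):

```python
import tensorflow as tf

x = tf.random.normal([2, 8])  # placeholder activations
# Fake-quantize to 8 bits over the asymmetric clip range [-6, 60]:
# values outside the range are clipped, values inside are rounded to
# one of 256 evenly spaced levels.
y = tf.quantization.fake_quant_with_min_max_args(x, min=-6.0, max=60.0, num_bits=8)
```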
-
jetson@ubuntu:~$ jetson-containers run $(autotag nano_llm) python3 -m nano_llm.chat --api=mlc --model Efficient-Large-Model/VILA1.5-3b --max-context-len 256 --max-new-tokens 32 --pro…
-
This quantization scheme can speed up neural network inference, but there are still few examples for CNNs, R-CNNs, or even RNNs.
Is it that these architectures are not easy to support in ggml, or are there other reasons?
-
### System Info
```Shell
- `Accelerate` version: 1.0.0
- Platform: Linux-6.8.0-47-generic-x86_64-with-glibc2.35
- `accelerate` bash location: /home/user/anaconda3/envs/accelerate_multi/bin/accelera…
-
## Quantization Method for conv, deconv and fc Layers
Here I want to implement quantization of the operations in conv, deconv, and fc layers. Many quantization methods are covered in this paper: Ristr…
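As a starting point, a minimal sketch of per-tensor uniform weight quantization applied identically to conv, deconv, and fc weights (illustrative PyTorch; the 8-bit setting and layer shapes are assumptions, not the paper's method):

```python
import torch
import torch.nn as nn

def fake_quantize(w: torch.Tensor, bits: int = 8) -> torch.Tensor:
    """Uniform symmetric per-tensor quantization, dequantized back to float."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp_min(1e-8) / qmax
    return torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale

layers = {
    "conv":   nn.Conv2d(3, 16, kernel_size=3),
    "deconv": nn.ConvTranspose2d(16, 3, kernel_size=3),
    "fc":     nn.Linear(128, 10),
}
with torch.no_grad():
    for name, layer in layers.items():
        layer.weight.copy_(fake_quantize(layer.weight))
```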
-
## Description
I am trying to figure out whether TensorRT and the `pytorch_quantization` module support post-training quantization for vision transformers.
The following piece of code follows the `pyt…
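For reference, the usual `pytorch_quantization` post-training calibration flow looks roughly like the sketch below (the ViT model, data loader, and calibration budget are placeholders, and this is not the code referenced above):

```python
import torch
import torchvision
from pytorch_quantization import quant_modules
from pytorch_quantization import nn as quant_nn

# Replace supported torch.nn layers with quantized counterparts
# before the model is built.
quant_modules.initialize()
model = torchvision.models.vit_b_16(weights="DEFAULT").cuda().eval()

def calibrate(model, loader, num_batches=16):
    # Switch quantizers to calibration mode and collect activation statistics.
    for m in model.modules():
        if isinstance(m, quant_nn.TensorQuantizer) and m._calibrator is not None:
            m.disable_quant()
            m.enable_calib()
    with torch.no_grad():
        for i, (images, _) in enumerate(loader):
            model(images.cuda())
            if i + 1 >= num_batches:
                break
    # Load the collected amax values and re-enable quantization.
    for m in model.modules():
        if isinstance(m, quant_nn.TensorQuantizer) and m._calibrator is not None:
            m.load_calib_amax()
            m.disable_calib()
            m.enable_quant()
```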
-
Many neural network optimization and quantization methods may be really good motivating examples for "Chexo", because we probably never want to reason about the soundness of their numerical stability,…