model-quantization Search Results

1000+ results
for model-quantization


descawed/galsdk #27

Model editing and importing

I've begun work on the ability to edit models and import new models. Here are the remaining features I'd like to complete: - [x] Write out model files - necessary for everything else - [ ] Model s…

descawed updated 5 days ago
1
vllm-project/vllm #9324

[Feature]: Quantization support for LLaVA OneVision

### 🚀 The feature, motivation and pitch I'm working on applications that must run locally on resource-limited hardware. Therefore, quantization becomes essential. Such applications need from multimodal vi…

salvaba94 updated 1 month ago
2
openvinotoolkit/nncf #2766

[TorchFX] Torch FX/PyTorch 2 Export Quantization

### 🚀 Feature request Quantization is a widely used technique to accelerate models, particularly when using the [torch.compile](https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.htm…

alexsu52 updated 3 days ago
4
NVIDIA/TensorRT #4212

stable diffusion quantization in inpainting task is poor

I have completed Stable Diffusion quantization for txt2img, as the demo shows, and the result is very good. When I try to apply SD quantization to the inpainting task, I run into the problem that the quantization r…

worhar updated 2 weeks ago
3
webmachinelearning/webnn #779

Support block-wise quantization

[Block-wise quantization](https://arxiv.org/abs/2110.02861) divides input tensors into smaller blocks that are independently quantized, resulting in faster optimization and high precision quantization…

huningxin updated 3 weeks ago
1
google-ai-edge/ai-edge-torch #391

error with concatenation when converting QAT model to tflite…

### 1. System information - Windows 11 - TensorFlow installation (pip package or built from source): pip - TensorFlow library : 2.13 I am attempting to convert a QAT model trained with int8 we…

gaikwadrahul8 updated 21 hours ago
1
unslothai/unsloth #1310

what was the quantisation algorithm used in unsloth/Llama-3.…

What was the quantisation algorithm used in the unsloth/Llama-3.2-1B-bnb-4bit model: https://huggingface.co/docs/transformers/main/en/quantization/overview. Is it int4_awq or int4_weightonly?

jayakommuru updated 1 week ago
1
NVIDIA/TensorRT-LLM #2392

Qwen2-72B w4a8 empty output

### System Info GPU: 4090 TensorRT: 10.3 tensorrt-llm: 0.13.0.dev2024081300 ### Who can help? @Tracin Could you please have a look? Thank you very much. ### Information - [ ] The official example sc…

lishicheng1996 updated 8 hours ago
5
pytorch/torchchat #1325

RFC: Quantization Evaluation

### 🚀 The feature, motivation and pitch With a single command, quantize the same model across every available quant scheme and configuration and output a table that compares the results. This will …

byjlw updated 4 weeks ago
1
THUDM/CogVideo #509

ImportError: cannot import name 'ActivationCasting' from 'to…

### System Info torch 2.5.1+cu121 diffusers 0.31.0 torchao 0.7.0+cpu Python 3.11.10 Windows 11 ### Information - [X] The official example scr…

nitinmukesh updated 1 week ago
1
