auto-quant Search Results

1000+ results for auto-quant

neuralmagic/AutoFP8 #32

[Feature] ADD Support for DeepSeek-V2-Chat

OOM occurs when quantizing the DeepSeek model on 8xA800. The code used comes from https://github.com/neuralmagic/AutoFP8/issues/29 ``` from datasets import load_dataset from transformers import Aut…

Xu-Chen updated 4 months ago
1
bitsandbytes-foundation/bitsandbytes #881

Some modules are dispatched on the CPU or the disk. Make sur…

ValueError: Some modules are dispatched on the CPU or the disk. Make sure you have enough GPU RAM to fit the quantized model. If you want to dispatc…

Excy-an updated 1 month ago
11
PaddlePaddle/Paddle #62618

New IR Python API adaptation upgrade (Phase 3)

### 1. Background 📚 For the task background, the required changes, and sample submissions, see the previously published task: https://github.com/PaddlePaddle/Paddle/issues/58067 ### 2. Task 📚 | No. | Python API | File …

YuanRisheng updated 3 months ago
7
vllm-project/vllm #10156

[Bug]: Unable to load Llama-3.1-70B-Instruct using either `v…

### Your current environment The output of `python collect_env.py` ```text PyTorch version: 2.4.0+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A…

SMAntony updated 6 days ago
18
vllm-project/vllm #9363

[Bug]: Qwen2-VL-72B Inference on Multiple-GPUs

### Your current environment The output of `python collect_env.py` ```text PyTorch version: 2.4.0+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A…

bhupendra1324 updated 3 weeks ago
6
PaddlePaddle/PaddleDetection #9202

RuntimeError: (NotFound) Cannot open file C:\Users\Abc/.cach…

### Issue confirmation: Search before asking - [X] I have searched the [issue history](https://github.com/PaddlePaddle/PaddleDetection/issues) and found no similar bug. I have searched the [issues](https://github.com/PaddlePaddle/PaddleDetection/issu…

DazzlingGalaxy updated 1 day ago
6
vllm-project/vllm #2714

Output Garbage Text in Mixtral 8x7b Post Upgrade to 0.3.0

I recently upgraded my deployment from version 0.2.7 to 0.3.0 for a mixtral-8x7b architecture model and have encountered a significant issue where the model outputs completely garbled data post-upgr…

44670 updated 2 weeks ago
9
Insoumis/analysons-lepen #9

Security

SOME PROPOSALS OF THE FRONT NATIONAL → [12] Restore security while ensuring the protection of individual liberties. → [13] Massively rearm law enforcement: in personnel (plan d…

Liamtaro updated 7 years ago
6
mlc-ai/mlc-llm #2549

Qwen2-72B-Instruct MultiGPU 8xP100

## 🐛 Bug The just released Qwen2 has the same architecture as the previous Qwen1.5, so theoretically it should be able to run directly. In fact, the model was quantized and compiled without errors.…

alphaarea updated 4 months ago
7
ultralytics/ultralytics #2756

Nothing works, can't be installed and can't be run

### Search before asking - [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report. ### YOLOv8 Component _No response_ …

easy-and-simple updated 4 weeks ago
35
