-
### System Info
accelerate 0.33.0
aiofiles 23.2.1
annotated-types 0.7.0
anyio …
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- [X] 3. Please note that if the bug-related issue y…
-
`CUDA_VISIBLE_DEVICES=0 swift sft --model_type glm4v-9b-chat --model_id_or_path /content/glm-4v-9b-4-bits --dataset /content/drive/MyDrive/glm/training_data.jsonl --output_dir /content/drive/MyDrive/g…
-
### Your current environment
```text
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC …
```
-
### System Info
```Shell
- `Accelerate` version: 0.30.0
- Platform: Linux-6.5.0-27-generic-x86_64-with-glibc2.35
- `accelerate` bash location: /data/envs/tt/bin/accelerate
- Python version: 3.10.1…
```
-
**Describe the bug**
Hello vLLM team, thank you for your outstanding work. I think llm-compressor is really filling a need: one simple, unified quantization framework for vLLM.
So the bug I am enc…
-
# RFC: Float8 Inference
- status: draft
## Objective
We want to provide an easy mechanism to utilize FP8 in inference, and see both decreased memory usage and performance gains on hardware that…
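The scaling step at the heart of most FP8 (e4m3) inference schemes can be sketched in plain Python. This is an illustrative sketch only: the helper names are hypothetical, and the actual rounding to e4m3 values (done by a dtype cast or hardware in real implementations) is deliberately omitted.

```python
# Hypothetical sketch of per-tensor FP8 (e4m3) scaling. Real FP8 inference
# paths cast the scaled values to a float8 dtype; here we only model the
# scale / clamp / dequantize arithmetic around that cast.

E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3fn

def compute_scale(values):
    """Per-tensor scale so the largest magnitude maps onto E4M3_MAX."""
    amax = max(abs(v) for v in values)
    return amax / E4M3_MAX if amax > 0 else 1.0

def quantize(values, scale):
    """Scale into the e4m3 range and clamp (e4m3 rounding omitted)."""
    return [max(-E4M3_MAX, min(E4M3_MAX, v / scale)) for v in values]

def dequantize(qvalues, scale):
    """Recover approximate original values from scaled representation."""
    return [q * scale for q in qvalues]

weights = [0.5, -1.25, 3.0, -4.48]
scale = compute_scale(weights)       # 4.48 / 448.0 == 0.01
q = quantize(weights, scale)         # largest magnitude lands on 448.0
recovered = dequantize(q, scale)
```

Storing `q` in one byte per element instead of two (FP16) or four (FP32) is where the memory savings come from; the performance gains then depend on hardware with native FP8 matmul support.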
-
**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
**Describe the solution you'd like**…
-
Hi!
I quantized DeepSeek-Coder-V2-Lite-Instruct to FP8 using AutoFP8, but when I try to run it with vLLM I get the following error:
**RuntimeError: "cat_cuda" not implemented for 'Float8_e4m3fn…
-
```text
Traceback (most recent call last):
  File "/home/admin/workspace/aop_lab/app_source/run_gptq.py", line 89, in
    model = AutoGPTQForCausalLM.from_pretrained(args.model_name_or_path, quantize_confi…
```